===================================================================
How To Add Your Build Configuration To LLVM Buildbot Infrastructure
===================================================================

This document describes how to add a build configuration, and the
buildbot-worker that builds it, to the LLVM Buildbot Infrastructure.

There are two buildmasters running.

* The main buildmaster at `<https://lab.llvm.org/buildbot>`_. All builders
  attached to this machine will notify commit authors every time they break
  the build.

* The staging buildmaster at `<https://lab.llvm.org/staging>`_. All builders
  attached to this machine will be completely silent by default when the build
  is broken. This buildmaster is reconfigured every two hours with any new
  commits from the llvm-zorg repository.

In order to remain connected to the main buildmaster (and thus notify
developers of failures), a buildbot must:

* Be building a supported configuration. Builders for experimental backends
  should generally be attached to the staging buildmaster.
* Be able to keep up with new commits to the main branch, or at a minimum
  recover to tip of tree within a couple of days of falling behind.

Additionally, we encourage all bot owners to point their bots towards the
staging master during maintenance windows, instability troubleshooting, and
similar situations.

Each buildbot has an owner who is the responsible party for addressing problems
which arise with said buildbot. We generally expect the bot owner to be
reasonably responsive.

For some bots, the ownership responsibility is split between a "resource owner"
who provides the underlying machine resource, and a "configuration owner" who
maintains the build configuration. Generally, operational responsibility lies
with the "config owner". We do expect "resource owners" - who are generally
the contact listed in a worker's attributes - to proxy requests to the relevant
"config owner" in a timely manner.

Most issues with a buildbot should be addressed directly with a bot owner
via email. Please CC `Galina Kistanova <mailto:gkistanova@gmail.com>`_.

Steps To Add Builder To LLVM Buildbot
=====================================

Volunteers can provide their build machines to work as build workers for the
LLVM Buildbot.

Here are the steps you can follow to do so:

#. Check the existing build configurations to make sure the one you are
   interested in is not covered yet or gets built on your computer much
   faster than on the existing one. We prefer faster builds so developers
   will get feedback sooner after changes get committed.

#. The computer you will be registering with the LLVM buildbot
   infrastructure should have all dependencies installed and be able to
   build your configuration successfully. Please check what degree
   of parallelism (``-j`` parameter) would give the fastest build. You can
   build multiple configurations on one computer.

#. Install buildbot-worker (currently we are using buildbot version 2.8.4).
   This specific version can be installed using ``pip``, with a command such
   as ``pip3 install buildbot-worker==2.8.4``.

#. Create a designated user account for your buildbot-worker to run under,
   and set appropriate permissions.

#. Choose the buildbot-worker root directory (all builds will be placed under
   it), and the buildbot-worker access name and password that the buildmaster
   will use to authenticate your buildbot-worker.

#. Create a buildbot-worker in the context of that buildbot-worker account.
   Point it to the **lab.llvm.org** port **9994** (see the
   `Buildbot documentation
   <http://docs.buildbot.net/current/tutorial/firstrun.html#creating-a-worker>`_
   for more details) by running the following command::

       $ buildbot-worker create-worker <buildbot-worker-root-directory> \
                    lab.llvm.org:9994 \
                    <buildbot-worker-access-name> \
                    <buildbot-worker-access-password>

   Only once a new worker is stable, and
   approval from Galina has been received (see last step) should it
   be pointed at the main buildmaster.

   Now start the worker::

       $ buildbot-worker start <buildbot-worker-root-directory>

   This will cause your new worker to connect to the staging buildmaster
   which is silent by default.

   Try this once, then check the log file
   ``<buildbot-worker-root-directory>/worker/twistd.log``. If your settings
   are correct you will see a refused connection. This is good and expected,
   as the credentials have not been established on both ends. Now stop the
   worker and proceed to the next steps.

#. Fill the buildbot-worker description and admin name/e-mail. Here is an
   example of the buildbot-worker description::

       Core i7 (2.66GHz), 16GB of RAM

       g++.exe (TDM-1 mingw32) 4.4.0
       Microsoft(R) 32-bit C/C++ Optimizing Compiler Version 16.00.40219.01 for 80x86

   See `here <http://docs.buildbot.net/current/manual/installation/worker.html>`_
   for which files to edit.

#. Send a patch which adds your build worker and your builder to
   `zorg <https://github.com/llvm/llvm-zorg>`_. Use the typical LLVM
   `workflow <https://llvm.org/docs/Contributing.html#how-to-submit-a-patch>`_.

   * workers are added to ``buildbot/osuosl/master/config/workers.py``
   * builders are added to ``buildbot/osuosl/master/config/builders.py``

   Please make sure your builder name and its builddir are unique.

   All new builders should default to using the ``'collapseRequests': False``
   configuration. This causes the builder to build each commit individually
   and not merge build requests. To maximize quality of feedback to developers,
   we *strongly prefer* builders to be configured not to collapse requests.
   This flag should be removed only after all reasonable efforts have been
   exhausted to improve build times such that the builder can keep up with
   commit flow.

   It is possible to allow email addresses to unconditionally receive
   notifications on build failure; for this you'll need to add an
   ``InformativeMailNotifier`` to ``buildbot/osuosl/master/config/status.py``.
   This is particularly useful for the staging buildmaster, which is silent
   by default.

#. Send the buildbot-worker access name and the access password directly to
   `Galina Kistanova <mailto:gkistanova@gmail.com>`_, and wait until she
   lets you know that your changes have been applied to the buildmaster.

#. Make sure you can start the buildbot-worker and successfully connect
   to the silent buildmaster. Then set up your buildbot-worker to start
   automatically at startup. See the buildbot documentation
   for help. You may want to restart your computer to see if it works.

#. Check the status of your buildbot-worker on the `Waterfall Display (Staging)
   <http://lab.llvm.org/staging/#/waterfall>`_ to make sure it is
   connected, and the `Workers Display (Staging)
   <http://lab.llvm.org/staging/#/workers>`_ to see if administrator
   contact and worker information are correct.

#. At this point, you have a working builder connected to the staging
   buildmaster. You can now make sure it is reliably green and keeps
   up with the build queue. No notifications will be sent, so you can
   keep an unstable builder connected to staging indefinitely.

#. (Optional) Once the builder is stable on the staging buildmaster with
   several days of green history, you can choose to move it to the production
   buildmaster to enable developer notifications. Please email `Galina
   Kistanova <mailto:gkistanova@gmail.com>`_ for review and approval.

   To move a worker to production (once approved), stop your worker, edit the
   ``buildbot.tac`` file to change the port number from 9994 to 9990, and
   start it again.
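
The two requirements placed on new builder entries in the patch step above (a unique name and builddir, and ``'collapseRequests': False``) can be checked mechanically before sending the patch. The following is only a minimal sketch: the entries are hypothetical, and the dictionary layout is an assumption for illustration, not a copy of the actual contents of ``builders.py``.

```python
from collections import Counter

# Hypothetical builder entries; the names, workers, and builddirs are made up.
builders = [
    {"name": "example-clang-x86_64", "workernames": ["example-worker-1"],
     "builddir": "example-clang-x86_64", "collapseRequests": False},
    {"name": "example-lld-aarch64", "workernames": ["example-worker-2"],
     "builddir": "example-lld-aarch64", "collapseRequests": False},
]


def check_builders(entries):
    """Return a list of human-readable problems found in the entries."""
    problems = []
    # Builder names and builddirs must each be unique.
    for key in ("name", "builddir"):
        counts = Counter(entry[key] for entry in entries)
        for value, count in sorted(counts.items()):
            if count > 1:
                problems.append(f"duplicate {key}: {value}")
    # A missing key is reported too, so that the setting is always explicit.
    for entry in entries:
        if entry.get("collapseRequests") is not False:
            problems.append(f"{entry['name']}: collapseRequests is not False")
    return problems


print(check_builders(builders))
```

An empty list means the hypothetical entries satisfy both requirements.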

Best Practices for Configuring a Fast Builder
=============================================

As mentioned above, we generally have a strong preference for
builders which can build every commit as it comes in. This section
includes best practices and some recommendations as to how to achieve
this.

In 2020, the monorepo had just under 35 thousand commits. This works
out to an average of 4 commits per hour. Already, we can see that a
builder must cycle in less than 15 minutes to have a hope of being
useful. However, those commits are not uniformly distributed. They
tend to cluster strongly during US working hours. Looking at a couple
of recent (Nov 2021) working days, we routinely see ~10 commits per
hour during peak times, with occasional spikes as high as ~15 commits
per hour. Thus, as a rule of thumb, we should plan for our builder to
complete ~10-15 builds an hour.
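
The arithmetic behind these figures can be reproduced with a few lines of Python. This is only a sketch using the numbers quoted above; the constant and function names are chosen here for illustration.

```python
# Figures quoted in the text above.
COMMITS_IN_2020 = 35_000        # "just under 35 thousand"
HOURS_IN_2020 = 366 * 24        # 2020 was a leap year


def minutes_per_build(commits_per_hour):
    """Longest cycle time that still builds every commit individually."""
    return 60 / commits_per_hour


average_rate = COMMITS_IN_2020 / HOURS_IN_2020
print(f"average: {average_rate:.1f} commits/hour")
for rate in (4, 10, 15):
    print(f"{rate} commits/hour -> one build every {minutes_per_build(rate):.0f} minutes")
```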

Resource Appropriately
  At 10-15 builds per hour, we need to complete a new build on average every
  4 to 6 minutes. For anything except the fastest of hardware/build configs,
  this is going to be well beyond the ability of a single machine. In buildbot
  terms, we are likely going to need multiple workers to build requests in
  parallel under a single builder configuration. For some rough back of the
  envelope numbers, if your build config takes e.g. 30 minutes, you will need
  something on the order of 5-8 workers. If your build config takes ~2 hours,
  you'll need something on the order of 20-30 workers. The rest of this
  section focuses on how to reduce cycle times.
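
The back of the envelope worker counts above follow from dividing the build time by the interval between commits and rounding up. The sketch below assumes a steady commit rate and ignores queueing effects and build-time variance, so treat the results as rough estimates only; ``workers_needed`` is a name introduced here for illustration.

```python
import math


def workers_needed(build_minutes, commits_per_hour):
    """Workers required so that one build can start per incoming commit."""
    return math.ceil(build_minutes * commits_per_hour / 60)


for build_minutes in (30, 120):
    low = workers_needed(build_minutes, 10)
    high = workers_needed(build_minutes, 15)
    print(f"{build_minutes}-minute build: {low} to {high} workers")
```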

Restrict what you build and test
  Think hard about why you're setting up a bot, and restrict your build
  configuration as much as you can. Basic functionality is probably
  already covered by other bots, and you don't need to duplicate that
  testing. You only need to be building and testing the *unique* parts
  of the configuration. (For example, for a multi-stage clang builder, you
  probably don't need to be enabling every target or building all the various
  utilities.)

  It can sometimes be worthwhile splitting a single builder into two or more,
  if you have multiple distinct purposes for the same builder. As an example,
  if you want to both a) confirm that all of LLVM builds with your host
  compiler, and b) do a multi-stage clang build on your target, you
  may be better off with two separate bots. Splitting increases resource
  consumption, but makes it easy for each bot to keep up with commit flow.
  Additionally, splitting bots may assist in triage by narrowing attention to
  relevant parts of the failing configuration.

  In general, we recommend Release build types with Assertions enabled. This
  generally provides a good balance between build times and bug detection for
  most buildbots. There may be room for including some debug info (e.g. with
  ``-gmlt``), but in general the balance between debug info quality and build
  times is a delicate one.

Ninja really does help build times over Make, particularly for highly
parallel builds. LLD helps to reduce both link times and memory usage
during linking significantly. With a build machine with sufficient
parallelism, link times tend to dominate the critical path of the build, and
are thus worth optimizing.

Use CCache and NOT incremental builds
  Using ccache materially improves average build times. Incremental builds
  can be slightly faster, but introduce the risk of build corruption due to
  e.g. state changes. At this point, the recommendation is not to
  use incremental builds and instead use ccache, as the latter captures the
  majority of the benefit with less risk of false positives.

  One of the non-obvious benefits of using ccache is that it makes the
  builder less sensitive to which projects are being monitored vs built.
  If a change triggers a build request, but doesn't change the build output
  (e.g. doc changes, python utility changes, etc.), the build will entirely
  hit in cache and the build request will complete in just the testing time.

  With multiple workers, it is tempting to try to configure a shared cache
  between the workers. Experience to date indicates this is difficult to
  do well, and that having local per-worker caches gets most of the benefit
  anyway. We don't currently recommend shared caches.

  CCache does depend on the builder hardware having sufficient IO to access
  the cache with reasonable access times - i.e. a fast disk, or enough memory
  for a RAM cache, etc. For builders without such hardware, incremental builds
  may be your best option, but they are likely to require higher ongoing
  involvement.
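
The build-type, Ninja, LLD, and ccache recommendations above correspond to LLVM CMake options. The sketch below assembles one such configure command line as a Python list, the form in which extra arguments are typically passed around in a Python-based build configuration. The source path and the restriction to a single target and project are assumptions made for illustration; pick the values that match the unique parts of your configuration.

```python
import shlex

# Assumed location of the LLVM sources relative to the build directory.
SOURCE_DIR = "../llvm-project/llvm"

cmake_args = [
    "cmake",
    "-G", "Ninja",                      # Ninja instead of Make
    "-DCMAKE_BUILD_TYPE=Release",       # Release build ...
    "-DLLVM_ENABLE_ASSERTIONS=ON",      # ... with assertions enabled
    "-DLLVM_USE_LINKER=lld",            # link with LLD
    "-DLLVM_CCACHE_BUILD=ON",           # compile through ccache
    "-DLLVM_TARGETS_TO_BUILD=X86",      # hypothetical: build only one target
    "-DLLVM_ENABLE_PROJECTS=clang",     # hypothetical: build only clang
    SOURCE_DIR,
]

# Print the command in a form that can be pasted into a shell.
print(shlex.join(cmake_args))
```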

As a last resort, you can configure your builder to batch build requests.
This makes the build failure notifications markedly less actionable, and
should only be done once all other reasonable measures have been taken.

Leave it on the staging buildmaster
  While most of this section has been biased towards builders intended for
  the main buildmaster, it is worth highlighting that builders can run
  indefinitely on the staging buildmaster. Such a builder may still be
  useful for the sponsoring organization, without concern of negatively
  impacting the broader community. The sponsoring organization simply
  has to take on the responsibility of all bisection and triage.