Tupfiles in m-c: A First Attempt

2012-12-18 by Mike Shal, tagged as mozilla, tup

Here is a first look at the beginnings of a Tupfile overlay for the mozilla-central source tree. So far it just handles the XPIDLSRCS variable, so it installs .idl files into dist/idl and generates .h files from them in dist/include. This should work on both Linux and OSX, though it won't run on Windows at the moment because some of the tup features it uses have not yet been implemented on that platform.

Goals

The ultimate goals of this effort are as follows:

  1. Make incremental builds as fast as possible, mostly by avoiding unnecessary work.
  2. Make incremental builds as reliable as possible, so that a 'clobber' is never needed.
  3. Avoid a Flag Day, so that people who are used to the current system can try to switch at their leisure.

The first two goals are design goals of the tup build system, which is what I'll be using here. In order to support the third goal, I'll be trying to make use of the existing infrastructure as much as possible, so that maintaining both tup and make in the same tree requires little overhead. In some cases this may cause a slight performance impact with regard to parsing Tupfiles vs. a pure tup-based solution; more on that below.

Background

Many Makefiles in the mozilla-central tree are data-driven - they just set some variables, possibly with if-conditionals to test configuration settings. The real logic is handled in separate utility makefiles, like config/rules.mk. Since tup can parse some Makefile syntax directly (mostly just variable definitions and if statements), my plan was to use very simple Tupfiles that would essentially do:

include_rules
include Makefile.in
include $(MOZ_ROOT)/config/build.tup

The first and third lines are the boiler-plate, similar to the 'DEPTH = @DEPTH@' and 'include $(topsrcdir)/config/rules.mk' lines in Makefile.ins. Tup would then be able to use the same logic directly from the Makefile.in, so updates to a make-based build would automatically carry over to a tup build. However, a few things make this difficult:

  1. Makefile.in's are going away, to be replaced by python snippets (tentatively called moz.build files).
  2. Some Makefile.in's are not completely data-driven, and include make rules, shell commands, and other fun things directly in the front-end files.
  3. Due to the way tup's logic works, for a process to write a file in a directory, there needs to be a Tupfile in that directory.

Part 1) would make any effort in this direction rather useless in a short time, since all the infrastructure that tup is relying on would be gone. Part 2) makes it difficult because tup would stumble on parsing the extra information, so either tup or the Makefile.in would have to change. Part 3) also makes it difficult in some circumstances to directly use the same variables, in particular for exporting things into the dist/ directory. To put files into dist/idl, for example, tup is expecting dist/idl/Tupfile rather than dom/alarm/Tupfile and dom/base/Tupfile (which both define some XPIDLSRCS). However, the data we need to use will stay in dom/alarm/Makefile.in and dom/base/Makefile.in (or dom/alarm/moz.build, etc).

Tup Building XPIDLSRCS

In an attempt to work around these issues, I tried to get the XPIDLSRCS compiling as they do with make by, perhaps oddly enough, using pymake. I have only two Tupfiles so far - one for dist/idl and one for dist/include. The dist/include/Tupfile is very simple, since it just needs to run header.py on all the .idl files after they are installed. The dist/idl/Tupfile is where the pymake part comes in. Essentially what I wanted to do in dist/idl is run a python script that does something like:

  1. Read each Makefile.in that defines XPIDLSRCS
  2. Generate tup rules to copy .idl files from the source directory to dist/idl.

Step 1) currently uses pymake to read the Makefile.in data. During the transition to moz.build, this can easily accommodate having the data split between Makefile.in and moz.build, and eventually having it all in moz.build. When all the logic is in moz.build, instead of doing tup -> python -> pymake -> Makefile.in, we'll do tup -> python -> moz.build. I think this allows progress to be made on a tup back-end in concert with the moz.build switchover.
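The two steps above can be sketched roughly as follows. This is a minimal illustration and not the actual tup_xpidl.py: it swaps pymake for a naive regex that only understands plain XPIDLSRCS assignments with backslash continuations (no conditionals, no variable expansion), and the function names read_xpidlsrcs and symlink_rules are made up for this example. The emitted lines assume tup's `: input |> command |> output` rule syntax, printed to stdout for tup to pick up through its run-script feature.

```python
import posixpath
import re

# Matches "XPIDLSRCS = a.idl" and "XPIDLSRCS += b.idl" once continuation
# lines have been joined. Conditionals are ignored (assumption of this sketch).
_ASSIGN = re.compile(r'^\s*XPIDLSRCS\s*(\+=|:=|=)\s*(.*)$')


def read_xpidlsrcs(makefile_text):
    """Return the .idl names assigned to XPIDLSRCS in Makefile.in text."""
    joined = re.sub(r'\\\n', ' ', makefile_text)  # join backslash continuations
    names = []
    for line in joined.splitlines():
        line = line.split('#', 1)[0]  # strip trailing comments
        match = _ASSIGN.match(line)
        if not match:
            continue
        op, value = match.groups()
        if op != '+=':
            names = []  # a plain assignment replaces earlier values
        # Keep only plain .idl names; unexpanded $(...) references are skipped.
        names.extend(word for word in value.split() if word.endswith('.idl'))
    return names


def symlink_rules(srcdir, names, to_root='../..'):
    """Build one tup rule per .idl file, as seen from dist/idl."""
    rules = []
    for name in names:
        src = posixpath.join(to_root, srcdir, name)
        rules.append(': {} |> ln -s %f %o |> {}'.format(src, name))
    return rules


if __name__ == '__main__':
    sample = 'XPIDLSRCS = \\\n  nsIAlarmHalService.idl \\\n  $(NULL)\n'
    for rule in symlink_rules('dom/alarm', read_xpidlsrcs(sample)):
        print(rule)
```

A real script would loop over every directory listed in dist/idl/Tupfile and read each Makefile.in from disk; the sample string here just stands in for dom/alarm/Makefile.in.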

This setup allows tup to install and compile the .idl files very similarly to a make-based build. There are a few minor differences, but I don't think any of these will have an impact in terms of getting a full build up and running:

  • Tup compiles the .idl files directly into dist/include, rather than compiling them in a directory like obj/dom/alarm/_xpidlgen/ and then copying the results to dist/include. This is easier given that we need to copy all .idl files to dist/idl before compiling them, and it saves a step.
  • The header comment includes the filename, which tup is passing in using a relative path. Diffing the generated headers shows:
    2c2
    <  * DO NOT EDIT.  THIS FILE IS GENERATED FROM ../idl/xpctest_params.idl
    ---
    >  * DO NOT EDIT.  THIS FILE IS GENERATED FROM /home/mshal/mozilla-central-git/js/xpconnect/tests/idl/xpctest_params.idl
    
  • Things are not stored in obj-*/ by default. Tup supports an obj directory with the variants feature (one of the things currently not available in Windows), but the Tupfiles are always written as if it were compiling in-tree.
  • Currently every Makefile.in that defines XPIDLSRCS works, except for hal/Makefile.in, since it uses VPATH to grab the .idl file from hal/gonk/. I think this can be made to work by adding support for VPATH in tup_xpidl.py, or maybe just changing the Makefile.in to not use VPATH (and add a hal/gonk/Makefile.in to define XPIDLSRCS=nsIRecoveryStatus.idl).
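For the VPATH case in the last bullet, the lookup itself is small. The following is only a sketch of what VPATH support could look like, under the assumption that the script already knows the source directory and the list of VPATH directories; find_idl is a hypothetical helper, not something that exists in tup_xpidl.py today.

```python
import os


def find_idl(srcdir, name, vpath=()):
    """Return the path to `name`, checking srcdir first, then each VPATH entry.

    Mirrors make's behaviour of consulting VPATH only when the file is not
    found in the current directory. Returns None if nothing matches.
    """
    for directory in (srcdir,) + tuple(vpath):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return candidate
    return None
```

With hal as the source directory and hal/gonk as the only VPATH entry, find_idl('hal', 'nsIRecoveryStatus.idl', ['hal/gonk']) would return the path under hal/gonk, which is what the symlink rule needs as its input.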

There are also a few downsides to this approach. For one, the dist/idl/Tupfile needs to specify all of the directories that define XPIDLSRCS. This is redundant information, and may be out-of-date when someone adds a new Makefile.in, or adds XPIDLSRCS to an already existing Makefile.in. Another downside is that parsing the dist/idl directory takes much longer than I would like. Part of this is just from using pymake. Just adding the 'import pymake.parser' line adds 18ms, and each directory parsed by pymake adds a few ms. Right now there are ~167 directories, so this adds up to a noticeable delay in parsing. It also generates a lot of rules, which tup must then put into its database. Right now the break-down looks like:

1) (pymake) Running tup_xpidl.py:  0.793s
2) (tup) Adding rules to database: 0.085s
3) (overhead): IPC? Maybe fixable: 0.122s
-----------------------------------------
total time to parse dist/idl:      1.000s
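
To put those numbers in perspective, here is the same breakdown as a few lines of arithmetic. The inputs are the measurements quoted above; the per-directory figure is only an estimate, since it attributes everything in the script other than the import to the 167 directories.

```python
# Measured times from the breakdown above, in seconds.
parse_breakdown = {
    'tup_xpidl.py (pymake)': 0.793,
    'adding rules to database': 0.085,
    'overhead': 0.122,
}

total = sum(parse_breakdown.values())

# Share of the total parse time spent in each stage, as a percentage.
shares = {stage: 100.0 * secs / total for stage, secs in parse_breakdown.items()}

# Rough per-directory cost inside the script, after subtracting the
# one-time 'import pymake.parser' cost.
import_cost = 0.018
directories = 167
per_directory_ms = (parse_breakdown['tup_xpidl.py (pymake)'] - import_cost) / directories * 1000.0

if __name__ == '__main__':
    for stage, pct in shares.items():
        print('%-28s %5.1f%%' % (stage, pct))
    print('per directory: %.1f ms' % per_directory_ms)
```

Roughly four fifths of the second is spent inside the script, at about 4.6 ms per directory, which matches the "few ms" per directory mentioned above.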

Fortunately, tup caches the result of parsing the Tupfiles, so this 1s hit only takes effect when tup_xpidl.py, the Tupfile, the relevant Makefile.ins, or any other file opened for reading during the parsing stage changes. Afterward, tup starts to create all of the .idl symlinks and then generate the headers.
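
The caching idea can be illustrated with a small conceptual sketch. This is not how tup implements it (tup tracks dependencies in its own database); the sketch simply uses a content hash over every file read during parsing as a stand-in, and both function names are invented for the example.

```python
import hashlib


def fingerprint(paths):
    """Hash the names and contents of every file read while parsing."""
    digest = hashlib.sha256()
    for path in sorted(paths):
        digest.update(path.encode('utf-8'))
        with open(path, 'rb') as handle:
            digest.update(hashlib.sha256(handle.read()).digest())
    return digest.hexdigest()


def needs_reparse(paths, cached_fingerprint):
    """True if any parse input differs from what the cache was built from."""
    return fingerprint(paths) != cached_fingerprint
```

The point is only that the set of inputs is "whatever was opened during parsing", so editing dom/alarm/Makefile.in invalidates the dist/idl parse even though no Tupfile changed.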

Trying It Out

The repo is currently up on github, and my branch is based off of the build-system branch. Note that no attempt has been made to integrate autoconf yet, so for now I just run configure using make and copy over the resulting autoconf.mk to get sensible defaults for my platform. In the future, the autoconf part may need to be handled by mach before tup runs. Since this uses tup, you'll need to install that first. This assumes you already have it in your path.

(Note: instead of cloning, maybe add this as a remote
to your existing repo and fetch the changesets)
$ git clone git://github.com/mshal/mozilla-central.git mozilla-central-tup
$ cd mozilla-central-tup
$ git checkout origin/tup-dist-idl
(This is the commit that adds tup_xpidl.py and such)
$ git show 6a9a5370fb1e2
$ make -f client.mk configure
$ cp obj-*/config/autoconf.mk .
$ tup init
$ tup upd

Now we can test some incremental builds. First, suppose we just change an .idl file. This should not parse anything, but should generate the new header. The relevant bits are the "No Tupfiles to parse" line and the single header.py command:

$ touch dom/alarm/nsIAlarmHalService.idl
$ time tup upd
[ tup ] [0.000s] No filesystem scan - monitor is running.
[ tup ] [0.000s] Reading in new environment variables...
[ tup ] [0.001s] No Tupfiles to parse.
[ tup ] [0.001s] No files to delete.
[ tup ] [0.001s] Executing Commands...
 1) [0.082s] dist/include: python header.py [../idl/nsIAlarmHalService.idl -> nsIAlarmHalService.h]
 [ ] 100%
[ tup ] [0.109s] Updated.

real	0m0.120s
user	0m0.068s
sys	0m0.016s

Suppose instead we wanted to remove nsIAlarmHalService.idl from the Makefile.in (we think it's no longer needed, for example). Edit the dom/alarm/Makefile.in and remove it from XPIDLSRCS:

$ $EDITOR dom/alarm/Makefile.in
$ time tup upd
[ tup ] [0.000s] No filesystem scan - monitor is running.
[ tup ] [0.000s] Reading in new environment variables...
[ tup ] [0.001s] Parsing Tupfiles...
 1) [0.974s] dist/idl
 2) [0.621s] dist/include
 [  ] 100%
[ tup ] [1.600s] Deleting files...
[ tup ] [1.600s] Deleting 2 commands...
 1) rm: dist/idl/nsIAlarmHalService.idl
 2) rm: dist/include/nsIAlarmHalService.h
[ tup ] [1.658s] No commands to execute.
[ tup ] [1.674s] Updated.

real	0m1.686s
user	0m1.544s
sys	0m0.072s

Here we pay the penalty of the long parsing times, but then the two files that should no longer be generated are removed from the build. This is one area where tup avoids having to clobber to get rid of stale files.
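
Conceptually, the cleanup is a set difference between the outputs the old rules produced and the outputs the new rules produce. A toy sketch of that idea follows; stale_outputs is a made-up name, and tup does this bookkeeping in its database rather than with in-memory sets.

```python
def stale_outputs(previous_outputs, current_outputs):
    """Return outputs that were generated before but are no longer produced."""
    return sorted(set(previous_outputs) - set(current_outputs))


if __name__ == '__main__':
    before = [
        'dist/idl/nsIAlarmHalService.idl',
        'dist/include/nsIAlarmHalService.h',
        'dist/idl/nsISupports.idl',
    ]
    after = ['dist/idl/nsISupports.idl']
    for path in stale_outputs(before, after):
        print('rm: ' + path)
```

A make-based build has no record of the "before" set, which is why stale files tend to linger there until a clobber.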

Of course, we can always quickly add a new .idl file (here, just restore the Makefile.in to include nsIAlarmHalService.idl again).

$ time tup upd
[ tup ] [0.000s] No filesystem scan - monitor is running.
[ tup ] [0.000s] Reading in new environment variables...
[ tup ] [0.001s] Parsing Tupfiles...
 1) [0.991s] dist/idl
 2) [0.623s] dist/include
 [  ] 100%
[ tup ] [1.620s] No files to delete.
[ tup ] [1.651s] Executing Commands...
 1) [0.003s] dist/idl: LN nsIAlarmHalService.idl -> ../../dom/alarm/nsIAlarmHalService.idl
 2) [0.087s] dist/include: python header.py [../idl/nsIAlarmHalService.idl -> nsIAlarmHalService.h]
 [  ] 100%
[ tup ] [1.780s] Updated.

real	0m1.796s
user	0m1.636s
sys	0m0.084s

Now we are right back where we started. Once we start actually using these headers during the compiling stage, we won't be able to do a complete incremental build in under 2s. Even so, it is much easier to experiment with changes not only to .idl files but also to the Makefile.in's, and to see how they affect the program, without waiting for a full clobber build.

Questions You May Have

Why doesn't it work on Windows?

The features that allow tup to run external scripts (like tup_xpidl.py) and variants (not covered here) aren't implemented on Windows yet. No technical reason, just time.

Why doesn't it work on OSX 10.8?

Still trying to work this one out. For some reason 10.8 is delaying write() calls on files: https://groups.google.com/forum/#!topicsearchin/osxfuse-group/10.8/osxfuse-group/b0U7bAsDU3c

Why not generate Tupfiles during configure, like Makefiles?

The short answer is that tup is not make, so things designed for make do not necessarily map easily to tup.

The longer answer is two-fold: first, tup doesn't support generating its build configuration from a sub-process within itself and then re-executing like make does. (i.e., creating a rule to generate Makefile will cause make to run the rule, then reload the Makefile and start over. There is no analogue for this in tup.) This doesn't mean we couldn't use something else to generate Tupfiles before running tup, but then we'd have to duplicate some functionality in order to handle deleted files, watching sub-processes for file accesses, etc. The second reason is that we don't want to just convert a dom/alarm/Makefile.in straight into a dom/alarm/Tupfile, due to the fact that we really want dist/idl/Tupfile to create the symlinks to dom/alarm/*.idl. This doesn't make it impossible either, but it is a bit more difficult since the generator needs to be smart enough to disassemble the Makefile.in and put the necessary parts into separate Tupfiles. Coupled with the first issue, it makes it difficult to get this right under all scenarios (consider: a Makefile.in goes away, or includes another file, or removes/adds XPIDLSRCS, etc).

The solution proposed here does necessitate Tupfiles being created in the source tree. However, since they can mostly re-use the existing configuration data, maintenance should be fairly low. It also allows tup to do what it is designed to do, by performing correct updates in minimal time.

Post-tup Conversion Recommendations

(Note: This section is not needed for the current conversion, but may be useful for cleanups & speedups in the future).

A one-second parsing time may not sound too bad, but it is a lot longer than I would expect given other Tupfiles that I use. (E.g., in tup's own source tree, most files are <10ms, with the longest at 19ms). A more "tup-like" solution would be to ditch dist/idl/Tupfile entirely, and just use dom/alarm/Tupfile to convert dom/alarm/nsIAlarmHalService.idl into dom/alarm/nsIAlarmHalService.h, and then install the header directly into dist/include. This would require changing the #include lines in the .idl files like so:

< #include "nsISupports.idl"
---
> #include "xpcom/base/nsISupports.idl"

Then we could just pass in -I(root dir) and compile all the idl files in their own subdirectories rather than in a dist directory. Now if we were to add a new .idl file in dom/alarm, tup would just re-parse dom/alarm/Tupfile very quickly, rather than re-parse dist/idl/Tupfile (and by consequence, all 167 Makefile.in's that define XPIDLSRCS) a bit slower.
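
The include change could be scripted rather than done by hand. Here is a minimal sketch, assuming we already have a map from each .idl basename to its path relative to the root of the tree; rewrite_includes and the locations argument are hypothetical names, and building that map (for example by walking the tree) is left out.

```python
import re

# Matches lines such as: #include "nsISupports.idl"
# Includes that already contain a '/' are left alone.
_INCLUDE = re.compile(r'^(\s*#include\s+")([^"/]+\.idl)(")', re.MULTILINE)


def rewrite_includes(idl_text, locations):
    """Replace bare .idl includes with root-relative paths from `locations`."""
    def replace(match):
        path = locations.get(match.group(2))
        if path is None:
            return match.group(0)  # unknown file: keep the line unchanged
        return match.group(1) + path + match.group(3)
    return _INCLUDE.sub(replace, idl_text)


if __name__ == '__main__':
    locations = {'nsISupports.idl': 'xpcom/base/nsISupports.idl'}
    print(rewrite_includes('#include "nsISupports.idl"\n', locations), end='')
```

Leaving unknown or already-qualified includes untouched makes the rewrite safe to run more than once.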

Of course, if this can be done with .idl files, why not .h files as well? Is there a need for dist/include at all?
