Dependencies

Tup handles dependencies a little differently from other build systems. In this example, we'll show how the dependencies that you specify in a Tupfile work together with the dependencies determined automatically during the program's execution.

First, set up a test directory with a Tupfile and a shell script:

$ tup init tup_test2
$ cd tup_test2
test.sh
#! /bin/sh
echo "Output from test.sh"
Tupfile
: |> ./test.sh > %o |> output.txt
$ chmod +x test.sh
$ tup
[ tup ] Scanning filesystem...0.006s
[ tup ] No tup.config changes.
[ tup ] Parsing Tupfiles...
[    1/1    ] .
[ tup ] No files to delete.
[ tup ] Executing Commands...
[    1/1    ] ./test.sh > output.txt
[ tup ] Updated.
$ cat output.txt
Output from test.sh

So our simple script ran and created the output.txt file. When will tup decide to run the script again to create a new output? The answer is simple: whenever the test.sh script changes! Watch as tup does not re-run the script until the script is touched:

$ tup
[ tup ] Scanning filesystem...0.000s
[ tup ] No tup.config changes.
[ tup ] No Tupfiles to parse.
[ tup ] No files to delete.
[ tup ] No commands to execute.
[ tup ] Updated.
$ touch test.sh
$ tup
[ tup ] Scanning filesystem...0.005s
[ tup ] No tup.config changes.
[ tup ] No Tupfiles to parse.
[ tup ] No files to delete.
[ tup ] Executing Commands...
[    1/1    ] ./test.sh > output.txt
[ tup ] Updated.
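
The transcript above shows tup re-running the command after a plain touch, which leaves the file's content alone and only updates its timestamp. The sketch below does not invoke tup at all; it only checks that property of touch directly. The temporary directory and the snapshot.sh name are made up for this illustration, and the -nt test is a widely supported shell extension rather than strict POSIX.

```shell
#!/bin/sh
# Illustrative only: shows what `touch` changes. It does not invoke tup.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '#! /bin/sh\necho "Output from test.sh"\n' > test.sh

# Give the file an old, fixed timestamp, then keep a copy with the same mtime
# (snapshot.sh is a hypothetical name used only for the comparison).
touch -t 202001010000 test.sh
cp -p test.sh snapshot.sh

sum_before=$(cksum < test.sh)
touch test.sh                  # the step taken in the walkthrough before re-running tup
sum_after=$(cksum < test.sh)

# Compare modification times and content checksums.
if [ test.sh -nt snapshot.sh ]; then mtime_changed=yes; else mtime_changed=no; fi
if [ "$sum_before" = "$sum_after" ]; then content_changed=no; else content_changed=yes; fi

echo "mtime changed:   $mtime_changed"
echo "content changed: $content_changed"
```

Since only the timestamp differs after the touch, the re-run in the transcript suggests that a newer modification time is enough for tup to treat the script as changed.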

We can change the script to read from other files, as well. Suppose we create a new text file, called header.txt, that gets cat'd at the top of the script:

header.txt
This is the file header
test.sh
#! /bin/sh
cat header.txt
echo "Output from test.sh"
$ tup
[ tup ] Scanning filesystem...0.007s
[ tup ] No tup.config changes.
[ tup ] Parsing Tupfiles...
[    1/1    ] .
[ tup ] No files to delete.
[ tup ] Executing Commands...
[    1/1    ] ./test.sh > output.txt
[ tup ] Updated.
$ cat output.txt
This is the file header
Output from test.sh

Note that we don't need to specify the header.txt dependency in the Tupfile. Tup will still re-run the script whenever header.txt changes:

header.txt
This is the *new* file header
$ tup
[ tup ] Scanning filesystem...0.007s
[ tup ] No tup.config changes.
[ tup ] No Tupfiles to parse.
[ tup ] No files to delete.
[ tup ] Executing Commands...
[    1/1    ] ./test.sh > output.txt
[ tup ] Updated.
$ cat output.txt
This is the *new* file header
Output from test.sh

Similar to the first Tupfile example, tup has executed the script in such a way that it can track file accesses. Tup sees that the script read from header.txt and automatically adds the dependency. We can see this from the dependency graph:

$ tup graph --stickies . | dot -Tpng > ~/ex_deps_1.png

Both the header.txt and test.sh files point to the command that executes the test script. If either of those files changes, then tup will re-execute the script. Otherwise there's no need to run it again!

Dependencies on Generated Files

At this point, the header.txt and test.sh files are written by you, the author. Tup refers to these as "normal" files (in the tup source code, this is the TUP_NODE_FILE enum). The output.txt file is generated by tup, so it is called a "generated" file (TUP_NODE_GENERATED in the source). Generated files are treated a little differently in tup. Let's see what happens if we create a new generated file and try to read from it in the test script:

Tupfile
: |> echo "generated text" > %o |> generated.txt
: |> ./test.sh > %o |> output.txt
test.sh
#! /bin/sh
cat header.txt
cat generated.txt
echo "Output from test.sh"
$ tup
[ tup ] Scanning filesystem...0.007s
[ tup ] No tup.config changes.
[ tup ] Parsing Tupfiles...
[    1/1    ] .
[ tup ] No files to delete.
[ tup ] Executing Commands...
[    1/2    ] echo "generated text" > generated.txt
[    2/2    ] ./test.sh > output.txt
 *** tup errors ***
tup error: Missing input dependency - a file was read from, and was not
specified as an input link for the command. This is an issue because the file
was created from another command, and without the input link the commands may
execute out of order. You should add this file as an input, since it is
possible this could randomly break in the future.
 - [8] generated.txt
 *** Command ID=6 ran successfully, but tup failed to save the dependencies.

Oops, it seems tup didn't like that. The issue here is that tup has no way of knowing that the command to create generated.txt must run before the test.sh script. Therefore, it is possible that tup will schedule them in the wrong order (so that generated.txt isn't created by the time the script runs), or it may even schedule them in parallel.
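
To see concretely why the order matters, the sketch below runs the two commands by hand, without tup, first in the wrong order and then in the right one. The temporary directory and the output_wrong.txt, output_right.txt, and errors.txt names are made up for this illustration.

```shell
#!/bin/sh
# Illustrative only: runs the two Tupfile commands manually in both orders.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "This is the file header" > header.txt
cat > test.sh <<'EOF'
#! /bin/sh
cat header.txt
cat generated.txt
echo "Output from test.sh"
EOF
chmod +x test.sh

# Wrong order: the script runs before generated.txt exists, so `cat` fails
# and the generated line is missing from the output.
./test.sh > output_wrong.txt 2> errors.txt

# Right order: create generated.txt first, then run the script.
echo "generated text" > generated.txt
./test.sh > output_right.txt

echo "wrong order: $(wc -l < output_wrong.txt) lines"
echo "right order: $(wc -l < output_right.txt) lines"
```

In the wrong-order run the script still exits successfully, because its last command (echo) succeeds. The only trace of the problem is the message in errors.txt, which is why this kind of mistake can go unnoticed without a check like the one tup performs.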

To give tup the information that generated.txt must be created first, we simply list it as an input to the test script:

Tupfile
: |> echo "generated text" > %o |> generated.txt
: generated.txt |> ./test.sh > %o |> output.txt
$ tup
[ tup ] Scanning filesystem...0.007s
[ tup ] No tup.config changes.
[ tup ] Parsing Tupfiles...
[    1/1    ] .
[ tup ] No files to delete.
[ tup ] Executing Commands...
[    1/1    ] ./test.sh > output.txt
[ tup ] Updated.
$ cat output.txt
This is the *new* file header
generated text
Output from test.sh

Tupfile Dependencies are for Ordering

Now we will see what happens if you add an input to the Tupfile, but the input goes unused. Let's add another generated file and list it as an input, while leaving the shell script unchanged so that it never actually reads from the new file:

Tupfile
: |> echo "generated text" > %o |> generated.txt
: |> echo "unused text" > %o |> unused.txt
: generated.txt unused.txt |> ./test.sh > %o |> output.txt
$ tup
[ tup ] Scanning filesystem...0.007s
[ tup ] No tup.config changes.
[ tup ] Parsing Tupfiles...
[    1/1    ] .
[ tup ] No files to delete.
[ tup ] Executing Commands...
[    1/1    ] echo "unused text" > unused.txt
[ tup ] Updated.

Curiously, the test.sh script did not execute even though unused.txt is listed as an input. How then were we able to bypass the earlier error message by adding an input in the Tupfile, if those inputs don't actually cause the commands to execute? Let's take a look at our new graph:

$ tup graph . | dot -Tpng > ~/ex_deps_2.png

Here we can see that not all of the input dependency arrows are the same. There are solid lines and dotted lines, as well as filled-in arrows and empty arrows. The actual details of how tup handles these are beyond the scope of this example. The important thing to see here is that tup still keeps track of the fact that the shell script has a dependency on the unused.txt file, but because the script never actually read from the file, it cannot possibly have an effect on the output file (this fact is represented by the dotted line). Therefore, tup knows that if unused.txt changes, the script does not need to run again.

However, the presence of the dotted-line dependency means that if unused.txt is changed and at the same time test.sh is changed to read from that file, tup is guaranteed to execute them in the correct order. In this sense, inputs in Tupfiles are only for ordering, and dependencies determined automatically during program execution are used for re-executing commands. For more insight into why this is useful, see the generated header example.
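
The claim that an input the script never reads cannot change its output can be checked by hand. The sketch below does not use tup; it runs test.sh before and after modifying unused.txt and compares the two results. The temporary directory and the output_before.txt and output_after.txt names are hypothetical.

```shell
#!/bin/sh
# Illustrative only: checks that changing a file the script never reads
# leaves the script's output byte-for-byte identical.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

echo "This is the *new* file header" > header.txt
echo "generated text" > generated.txt
echo "unused text" > unused.txt
cat > test.sh <<'EOF'
#! /bin/sh
cat header.txt
cat generated.txt
echo "Output from test.sh"
EOF
chmod +x test.sh

./test.sh > output_before.txt
echo "changed unused text" > unused.txt   # modify the unused input
./test.sh > output_after.txt

if cmp -s output_before.txt output_after.txt; then
    same_output=yes
else
    same_output=no
fi
echo "output identical after changing unused.txt: $same_output"
```

Because the output is identical either way, skipping the command when only unused.txt changes is safe. The input listed in the Tupfile then matters only if the script later starts reading the file, at which point the ordering is already in place.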