Bash Bonanza Part 5: External Commands


Welcome back, fellow bashstronauts!

Having previously discussed how arrays work in Bash, I'd now like to discuss some subtleties of running external commands in Bash - that is, commands that are not Bash builtins.

This is a particularly important topic since, a lot of the time, Bash is used as a sort of glue between processes - connecting or redirecting their outputs and inputs, running them in the background, etc.

A fair warning before we start - Bash does not come with particularly rich process management functionality. If you need to do anything complicated, consider another programming language.

With that out of the way, let's begin!

Running External Commands

When running external commands in a Bash script, you have two choices - running them in the foreground, or in the background.

Running a command in the foreground involves simply writing it out. Bash runs it to completion, and only then continues with the next command.

Foreground commands can be chained together with semicolons, in which case they run in sequence.

#!/usr/bin/env bash
/bin/echo "Order"
/bin/echo "Is"
/bin/echo "Very"; echo "Important"
> Order
> Is
> Very
> Important
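
The /bin/echo in the script above is there because a bare echo is a Bash builtin, not an external command. As a minimal sketch (the variable names are my own), the type builtin can tell you which kind you are dealing with:

```shell
#!/usr/bin/env bash
# type -t reports how Bash would interpret a name:
# alias, keyword, function, builtin or file.
ECHO_KIND="$(type -t echo)"
echo "echo is a $ECHO_KIND"

# type -P searches PATH only, so it finds the external program
# even when a builtin of the same name exists.
ECHO_PATH="$(type -P echo)"
echo "the external echo lives at ${ECHO_PATH:-<not found on PATH>}"
```

The first line should read "echo is a builtin"; the path on the second line depends on your system.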

Running a command in the background involves writing it out and appending an ampersand to the end. Bash then starts it in the background and continues to the next command without waiting for it to finish.

Background commands can also be chained on one line with ampersands, in which case each one is started in the background in turn and they all run concurrently.

#!/usr/bin/env bash
/bin/echo "Order" &
/bin/echo "Is" &
/bin/echo "Very" & echo "Important" &
> Important
> Order
> Is
> Very

Imposing Order

The order of our previous output is not guaranteed, since the processes run concurrently. To avoid sounding like Yoda, we may want to enforce some order on our echoes with the wait builtin.

When an external command is run in Bash, it runs as a separate process - more on that later. Each process has a process ID (PID) associated with it, which, for all intents and purposes, you should treat as random. There is a limited number of them, so once a process completes, its PID is made available to other processes. This is commonly known as PID recycling.

The PID of the last command run in the background is held in the $! variable. The wait builtin can then take the PID of a process as an argument, in which case it will wait until that process runs to completion before continuing. In this case, it will also return the exit code of that process.

It can also be run without arguments, in which case it will wait for all background processes to complete.

Since Bash 4.3, you also have wait -n available, which will wait for any single job to complete before continuing - so if you have five processes running in the background, it will only wait for one of them.

#!/usr/bin/env bash
/bin/echo "Hello!" & HELLO_PID="$!"
wait "$HELLO_PID"
HELLO_EXIT_CODE="$?"
echo "Hello exit code: $HELLO_EXIT_CODE"

/bin/echo "Sometimes..." &
/bin/echo "I..." &
wait

/bin/echo "eat pies"

> Hello!
> Hello exit code: 0
> I...
> Sometimes...
> eat pies
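
The script above covers wait with a PID and wait without arguments. For wait -n, a minimal sketch could look like this, assuming Bash 4.3 or newer (the sleep durations and the variable name are only for illustration):

```shell
#!/usr/bin/env bash
# Requires Bash 4.3+; older versions reject the -n option.
sleep 2 &
sleep 1 &

# Returns as soon as any ONE background job finishes
# (here, most likely the shorter sleep).
wait -n
FIRST_EXIT_CODE="$?"
echo "One job finished with exit code $FIRST_EXIT_CODE"

# A plain wait then blocks until the remaining job is done as well.
wait
echo "All jobs finished"
```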

Signalling Processes

Given a PID, you can send a signal to the process with the kill builtin. There are various signals (see info kill or run kill -l for a list), and they are usually used to terminate the process - by default, the TERM signal is sent. However, a program can override what happens when it receives most signals, so the actual behaviour is implementation dependent. For example, long-running daemons usually accept a signal that makes them reload their configuration or increase the log level.

A notable exception is the KILL signal, which cannot be caught by the application, and is what the OS uses to kill the process immediately. This means that the signalled application will NOT have the opportunity to clean up and exit gracefully. Absolutely do not use the KILL signal if you can avoid it.
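
One common way to follow that advice is to ask politely with TERM first and only fall back to KILL after a grace period. The following is a rough sketch rather than a definitive recipe: the stubborn child, the two-second grace period and the variable names are all made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stubborn child: it ignores TERM, and the sleep it
# execs inherits that ignored signal.
( trap "" TERM; exec sleep 30 ) &
CHILD_PID="$!"
sleep 1   # give the child a moment to install its trap

# Polite request first.
kill -TERM "$CHILD_PID"

# Grace period: kill -0 sends no signal, it only checks that
# the process can still be signalled.
for ATTEMPT in 1 2; do
  kill -0 "$CHILD_PID" 2>/dev/null || break
  sleep 1
done

# Last resort, only if the child is still around.
if kill -0 "$CHILD_PID" 2>/dev/null; then
  echo "Child ignored TERM, sending KILL"
  kill -KILL "$CHILD_PID"
fi

wait "$CHILD_PID"
CHILD_EXIT_CODE="$?"
echo "Child exit code: $CHILD_EXIT_CODE"
```

Because this child ignores TERM on purpose, the script ends up sending KILL and reports exit code 137. A well-behaved child would exit during the grace period and never see the KILL.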

#!/usr/bin/env bash
sleep 10000 & SLEEP_ID="$!"
kill "$SLEEP_ID"
wait "$SLEEP_ID"
SLEEP_EXIT_CODE="$?"
echo "/bin/sleep was rudely interrupted and exits the stage with $SLEEP_EXIT_CODE"

> /bin/sleep was rudely interrupted and exits the stage with 143
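
The 143 is not arbitrary: when a process is terminated by a signal, Bash reports its exit code as 128 plus the signal number, and TERM is signal 15. As a small sketch (variable names are my own), the code can be decoded like this. Keep in mind that a program may also choose to exit with a code above 128 on its own, so treat the result as a hint.

```shell
#!/usr/bin/env bash
sleep 1000 & SLEEP_PID="$!"
kill "$SLEEP_PID"
wait "$SLEEP_PID"
EXIT_CODE="$?"

if [ "$EXIT_CODE" -gt 128 ]; then
  # Subtract 128 to recover the signal number,
  # then let kill -l translate it into a name.
  SIGNAL_NUMBER=$((EXIT_CODE - 128))
  SIGNAL_NAME="$(kill -l "$SIGNAL_NUMBER")"
  echo "Killed by signal $SIGNAL_NUMBER ($SIGNAL_NAME)"
else
  echo "Exited on its own with code $EXIT_CODE"
fi
```

> Killed by signal 15 (TERM)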

Important Considerations

When you first start using the previously discussed tools, you will encounter some possibly counter-intuitive behaviour.

For example, try running the following script:

#!/usr/bin/env bash
echo "bash script is running with PID $$ - kill me!"
/bin/sleep 123

If you run kill on that PID and then check ps aux | grep '[s]leep', you may be surprised to see that sleep is still running!

To understand this behaviour, we need to understand how Bash runs external commands. It uses the fork system call to create an almost exact copy of itself (called a child process). The child then runs the exec system call, which replaces the running program with the one specified (for example /bin/sleep). Meanwhile, the parent (the process we called the external command from) runs the wait system call in order to wait for the child process to complete, and to get its exit code.
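
The same three steps can be spelled out by hand in Bash itself. This is only an illustration of the idea, not what Bash literally executes internally: the parentheses with & stand in for fork, the exec builtin for the exec system call, and the wait builtin for the wait system call. The variable names are made up.

```shell
#!/usr/bin/env bash
# "fork": the backgrounded parentheses make Bash create a copy of itself.
(
  echo "Child is still Bash, PID $BASHPID"
  # "exec": replace the copy with the external sleep program.
  # Nothing after this line would run in the child.
  exec sleep 1
) &
CHILD_PID="$!"
echo "Parent has PID $$ and started child PID $CHILD_PID"

# "wait": the parent blocks until the child is done
# and collects its exit code.
wait "$CHILD_PID"
CHILD_EXIT_CODE="$?"
echo "Child exited with $CHILD_EXIT_CODE"
```

The first two lines of output may appear in either order, since parent and child run concurrently at that point.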

Note that when running a command in the background, it will simply not run the wait system call in the parent - meaning that we have to do so manually.

This means that the script and /bin/sleep are now two almost autonomous processes. If you send a signal to the script, bash will NOT propagate the signal to the child by default. If you want that to happen, you will have to do that yourself!
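
One possible way to do that yourself is sketched below, under a few assumptions. The child is started in the background so that the script sits in the wait builtin, because Bash only runs trap handlers once a foreground command has finished. The handler name forward_term is made up, and the subshell that signals the script after one second merely simulates someone running kill on it.

```shell
#!/usr/bin/env bash
sleep 1000 &
CHILD_PID="$!"

# Hypothetical handler: pass the TERM we received on to the child.
forward_term() {
  kill -TERM "$CHILD_PID" 2>/dev/null
}
trap forward_term TERM

# Simulate an outside "kill <script PID>" one second from now.
# Inside the subshell, $$ still refers to the main script.
( sleep 1; kill -TERM "$$" ) &

# The first wait is interrupted as soon as the trapped signal arrives...
wait "$CHILD_PID"
# ...so wait again to actually collect the child's exit code.
wait "$CHILD_PID"
CHILD_EXIT_CODE="$?"
echo "Child exited with $CHILD_EXIT_CODE"
```

With the default TERM behaviour of sleep, this should report 143, and ps should no longer show the sleep afterwards.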

One important thing that keeps the parent and child from being completely autonomous is that the parent knows the child's PID, and UNIX guarantees that this PID will not be recycled until the parent has called the wait system call for it. This makes it safe to send signals from the parent to the child - you can be sure that it is not another process receiving them. It also makes it extremely dangerous to send signals if you are not the parent - there is no guarantee that the PID was not recycled between the moment you obtained it and the moment you send the signal. You should avoid doing that at all costs!

Closing Thoughts

Here we looked at how Bash runs external commands in a fair amount of detail, and covered some common gotchas. As usual, I hope this will help you avoid the common pitfalls and long debugging sessions!