I’m a huge fan of the unofficial Bash strict mode. I use it for all my personal scripts as well as for other projects when I can. One part of Bash strict mode is the errexit option, which is turned on by doing set -e or set -o errexit. With errexit turned on, any expression that exits with a non-zero exit code terminates execution of the script, and the exit code of the expression becomes the exit code of the script. errexit is a big improvement, since any unexpected error ends execution of a script rather than being ignored and allowing execution of the script to continue. If you have a command that you expect to fail, you can do one of two things:
# Use the OR operator
maybe_fails || true
# Use the command as an if condition
if maybe_fails; then
    # success (the no-op ":" keeps the otherwise empty branch valid)
    :
fi
This ensures every command must succeed; the only commands allowed to fail are those that are part of expressions designed to handle non-zero exit codes, like if statements. If you have a command that you expect to fail and you don’t care about the failure, you are best off using the OR operator with true as the second command. If you have a command that may fail and you want to execute code conditionally based on its exit code, an if statement is the best option. But as I recently learned, there is one pitfall with the way errexit works with functions invoked from if conditions.
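As a minimal sketch of these patterns running under errexit, the script below uses a hypothetical maybe_fails function that always returns 3. It also shows a variation of the OR form that keeps the exit code instead of discarding it (rc and child_status are illustrative names, not part of any convention), and it runs an unhandled failure in a child shell to show the failing command's exit code becoming the exit code of the script.

```shell
#!/usr/bin/env bash
set -e

# maybe_fails is a hypothetical stand-in for any command that can fail.
maybe_fails() {
    return 3
}

# Failure is tolerated and the exit code is discarded.
maybe_fails || true

# Failure is tolerated and the exit code is kept for later.
rc=0
maybe_fails || rc=$?
echo "maybe_fails exited with $rc"

# Failure selects a branch.
if maybe_fails; then
    branch="success"
else
    branch="failure"
fi
echo "took the $branch branch"

# An unhandled failure ends the child script at that command, and the
# child exits with the failing command's exit code.
child_status=0
bash -c 'set -e; (exit 7); echo "not reached"' || child_status=$?
echo "child exited with $child_status"
```

The parent script runs to completion because every failure in it sits in one of the handled positions.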
The Problem
errexit isn’t used when executing functions inside an if condition. A function executed normally with errexit would stop at the first expression that returned a non-zero exit code and return that exit code. A function executed inside an if condition with errexit would ignore non-zero exit codes from commands invoked by the function and would continue to execute until a return or exit command is encountered or the end of the function is reached. Here is an example script:
#!/usr/bin/env bash
set -e
f() {
    false
    echo "after false"
}
if f; then
    echo "f was successful"
fi
f
In this code we invoke the function f twice: once inside the if statement and once by itself as a single expression. You might think this script will not output anything, since the first expression in the function f is false, which is a command that always returns an exit code of 1. However, the output of this script is actually:
after false
f was successful
And the exit code of this script is 1, indicating a failure. When f is executed inside an if condition it is considered successful, and both lines of the function are executed. When it is executed outside of an if statement, as we would expect, only the first line of the function is evaluated and the function returns the exit code of the false command, which is always 1.
To sum up, when using errexit, any expression that returns a non-zero exit code will halt execution of the current function and return the non-zero exit code, except for expressions in the following places:
- The command list following until or while
- Part of the test following if or elif
- Preceding && or ||
- Any expression in a pipeline except the last, unless you are using set -o pipefail
In these locations non-zero exit codes are ignored.
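The exceptions listed above can be checked with a small script. This is only a sketch: the results array and the pipefail_status variable are names chosen for illustration. Each failing command sits in one of the exempt positions, so the script keeps going, and the pipefail case runs in a child shell so that its exit does not end the demonstration.

```shell
#!/usr/bin/env bash
set -e

# results collects a label for every exempt context the script survives.
results=()

# The command list following while (the same applies to until).
while false; do :; done
results+=("while")

# The test following if.
if false; then :; fi
results+=("if")

# A command preceding ||.
false || true
results+=("or")

# A command preceding &&. The list as a whole fails, but errexit ignores it.
false && true
results+=("and")

# A failing command that is not the last one in a pipeline.
false | true
results+=("pipeline")

echo "survived: ${results[*]}"

# With pipefail the same pipeline counts as failed, so the child shell exits.
pipefail_status=0
bash -c 'set -e -o pipefail; false | true; echo "not reached"' || pipefail_status=$?
echo "pipefail child exited with $pipefail_status"
```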
The Solution
Note: I originally had a solution here that was flat out wrong. I originally stated:
Here is the same if statement above modified to use this approach:
This code does not work because I was using |
The real solution here is not quite as elegant as I had hoped. The solution is to define a function that disables errexit, runs a subshell, enables errexit inside the subshell, executes the provided function, captures the exit code of the subshell, and then turns errexit back on before returning. The code looks like this:
get_exit_code() {
    # We first disable errexit in the current shell
    set +e
    (
        # Then we set it again inside a subshell
        set -e
        # ...and run the function
        "$@"
    )
    exit_code=$?
    # And finally turn errexit back on in the current shell
    set -e
}
exit_code=0
get_exit_code f
if [ "$exit_code" = 0 ]; then
    echo "f was successful"
else
    echo "f failed"
fi
While this is not very elegant, it is easy to use in practice. Since the function is executed outside of the if statement, it will always be executed with errexit set as expected. Output from the original function can still be captured and used if desired.
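One way to capture that output, sketched here under a few assumptions: the function f below is a hypothetical one that prints a line before failing, and a temporary file created with mktemp holds its output. A plain command substitution such as output=$(get_exit_code f) would run get_exit_code in a subshell of its own, so the exit_code variable it sets would be lost; redirecting to a file keeps the call in the current shell.

```shell
#!/usr/bin/env bash
set -e

get_exit_code() {
    set +e
    (
        set -e
        "$@"
    )
    exit_code=$?
    set -e
}

# f is a hypothetical function that prints a line and then fails.
f() {
    echo "partial output"
    false
    echo "never printed"
}

# Send the output of f to a temporary file so get_exit_code still runs
# in the current shell and can set exit_code.
tmp=$(mktemp)
exit_code=0
get_exit_code f >"$tmp"
output=$(<"$tmp")
rm -f "$tmp"

echo "exit code: $exit_code"
echo "output: $output"
```

Only the line printed before the failure ends up in output, since errexit stops f at false inside the subshell.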
Update 8/6/2022
Olivier wrote in and shared his own solution to this problem:
#!/usr/bin/env bash
set -e
g() {
    return 42
}
f() {
    g || return
    echo "after false"
}
if f; then
    echo "f was successful"
fi
This works about the same as my solution above with two significant differences:
- It doesn’t run code in a subshell so a separate process is not spawned. It may therefore be slightly faster.
- It doesn’t capture the exit code to a variable. The if statement will only be able to indicate success or failure unless you capture and compare the exit code yourself inside the if condition with something like if (f; [ $? -eq 42 ]); then.
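If the exit code itself is needed with this approach, one possible variation (a sketch, not something from Olivier's message) is to read $? as the first command of the else branch, where it still holds the exit code of the if condition. The rc and result variables are illustrative names.

```shell
#!/usr/bin/env bash
set -e

g() {
    return 42
}

f() {
    # A bare return passes along the exit code of g.
    g || return
    echo "after g"
}

if f; then
    result="f was successful"
else
    # $? here is the exit code of the if condition, which is f's.
    rc=$?
    result="f failed with exit code $rc"
fi
echo "$result"
```

This relies on every fallible command inside f being followed by || return, since errexit is still ignored while f runs as the if condition.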