Shell Programming Secrets Nobody Talks About


Most tutorials on shell programming are part of larger guides on Linux. They gloss over the numerous ways in which your code might appear to work but still fail under certain circumstances. Given that shell scripts are used to manage billions of dollars of assets, it is important to learn how to write clean and safe code with them.

Last year, I wrote a book on Linux command-line tips and tricks, and made several updates to it. Annoyingly, I continue to discover something new and important about the Bash shell program almost every week. I did not want this happening after I had ordered my author copy. The discoveries made me wonder what I have been doing all these years without knowing these bash secrets.

sh and bash are not the same

The Bourne shell (sh) program began its life in the 70s with the UNIX operating system. Ubuntu Linux continues to ship an sh command (these days provided by dash, a minimal shell in the Bourne tradition) alongside its newer avatar, the Bourne-Again shell (bash). Try bash --version on the command line and it will display its version number. Try sh --version and you get an error. The two are different. While sh remains an ancient relic, bash continues to be developed and has a lot more features.

It was my practice (in the late 90s) to run my shell scripts in SCO UNIX with the sh command. I continued this in Ubuntu and found that a lot of online script examples did not work with it. (As a security measure, I never give the extension .sh or the +x permission to my scripts. My scripts remain anonymous with an innocuous .txt extension.)

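Aware of this problem, a lot of script authors place the comment #!/bin/bash on the first line. This comment ensures that the script will be run with bash when it is executed directly as a program, for example as ./script. If you explicitly invoke it as sh script, the line is treated as an ordinary comment and sh still does the interpreting, so invoke it as bash script instead.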

Some overzealous fanatics use the comment #!/usr/bin/env bash instead as a more failsafe measure. They say that bash may not always be at /bin, so it is better to let env find it. By this, they assume that env will always be found at /usr/bin. Seems like overkill to me. If you are on Ubuntu, as most people are, then #!/bin/bash should do fine.
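A quick way to see which shell is really interpreting a script is to make the script report it. The following is only a sketch under my own assumptions: the scratch file created by mktemp and the variable names are made up for illustration, while BASH_VERSION is a variable that bash sets for itself. What the sh line prints depends on which shell your system provides as sh.

```shell
#!/bin/bash
# Write a small script to a scratch file (hypothetical, created by mktemp).
script=$(mktemp)
cat > "$script" <<'EOF'
#!/bin/bash
if [ -n "${BASH_VERSION:-}" ]; then
  echo "interpreted by bash"
else
  echo "interpreted by another shell"
fi
EOF

# Naming the interpreter on the command line decides who runs the script;
# the first line of the file is then just a comment.
via_bash=$(bash "$script")
via_sh=$(sh "$script")

echo "bash script: $via_bash"
echo "sh script:   $via_sh"

rm -f "$script"
```

The first line of the scratch script only matters when the file has the +x permission and is run directly by its path.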

if statements are not what they seem to be

The shell’s if statement is very unusual.

if test-expression; then
  commands
else
  commands
fi

The test-expression needs to return 0 (zero) to be true and any non-zero value to be false. In most languages, 1 (one) evaluates as true and 0 (zero) evaluates as false. Why does bash behave differently?

This is because shell scripts often need to determine how other programs have performed. They do this by reading the exit value of those programs. By convention, when a program exits without an error, it returns control to the invoking program with an exit code of 0 (zero). If it needs to exit after encountering an error, it returns with a non-zero exit code. To help in troubleshooting, program authors document a specific meaning for each non-zero exit code.

Thus, in the if statement, the test-expression could be a program. If the program executed successfully and returned 0 to the shell, then the if statement behaves as if it evaluated to true. If the program exited with a non-zero value, then the if statement behaves as if it evaluated to false.
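Here is a minimal sketch of that behaviour. The helper name report is my own invention, not a standard command; it runs whatever command it is given and shows which branch the if statement takes.

```shell
# Hypothetical helper: runs any command and reports how 'if' judged it.
report() {
  if "$@" >/dev/null 2>&1; then
    echo "true branch (exit code 0)"
  else
    # In the else branch, $? still holds the exit code of the tested command.
    echo "false branch (exit code $?)"
  fi
}

report true               # a command that always succeeds
report false              # a command that always fails
report sh -c 'exit 3'     # a command that fails with a specific code
```

The if statement never sees a boolean. It only sees whether the command handed to it exited with 0.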

In Figure 1, the if statement evaluates commands and checks their exit values. It does not evaluate expressions as true or false.

Figure 1: Exit code for if condition

What you need to remember is that the if statement is not looking for the boolean values — true or false.

Does this mean that if true; then will evaluate to false because true is not 0 (zero)? No!

That brings us to another strange feature of the shell. true is actually a command! It is not a boolean value built into the shell language. In Ubuntu, a standalone program resides at /usr/bin/true and exits with a return code of 0 (bash also carries its own built-in copy that behaves the same way). There is also a false program residing at /usr/bin/false, which exits with a return code of 1.
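You can check this yourself. The sketch below simply runs both commands and records their exit codes; the variable names are mine. The last line asks the shell how it resolves each name, and its exact wording varies from shell to shell.

```shell
true
true_status=$?

# '|| ...' records the failure without aborting scripts that use 'set -e'.
false || false_status=$?

echo "true exited with $true_status"
echo "false exited with $false_status"

# 'type' reports whether a name is a builtin, a keyword or a file on disk.
type true false
```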

[ is a program

To test whether a file exists, you can use if [ -f the-file.ext ]; then. Here, the single bracket [ is not part of the language. It is a program at /usr/bin/[ and its arguments are: -f, the-file.ext and ].

To ensure that the [ commands are executed properly, there has to be a space after the opening bracket and before the closing bracket. If you omit the first, you are not invoking the correct program. If you omit the latter, you have failed to terminate the command with the correct closing argument.
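The sketch below illustrates this, assuming a scratch file from mktemp and variable names of my own choosing. It runs the same file test with [ and with the equivalent test command, and then deliberately leaves out the closing bracket to show that [ treats this as an error of its own.

```shell
scratch=$(mktemp)   # hypothetical scratch file, just something that exists
with_bracket="missing"
with_test="missing"

# Correct: '[' receives three arguments: -f, the file name and ']'.
if [ -f "$scratch" ]; then
  with_bracket="exists"
fi

# 'test' is the same command, minus the closing ']' argument.
if test -f "$scratch"; then
  with_test="exists"
fi

# Closing ']' left out on purpose: '[' complains and exits with an error code.
missing_status=0
[ -f "$scratch" 2>/dev/null || missing_status=$?

echo "bracket: $with_bracket, test: $with_test, without ]: exit code $missing_status"
rm -f "$scratch"
```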

Beware of space in string comparisons

When you assign a value to a string variable, DO NOT leave any space before or after the = sign. If you do, it seems to the shell that the variable is a command and the = and the attempted value for the variable are its arguments.

When you test whether two strings are equal, DO leave a space before and after the = sign. If you do not, the shell glues the variable's value, the = sign and the other string into a single word, so the [ program receives just one argument. With only one argument, [ merely checks whether that string is non-empty, and it always is because it contains at least the = sign. The command therefore exits with 0 (zero). This means the if statement will always be forced to evaluate to true!

# Causes an error because 'sTest' looks like a command
# and '=' and 'hello' become its arguments
sTest = "hello"

# Assigns string variable correctly
sTest="hello"

# A single non-empty argument evaluates to true whatever the value
if [ "$sTest"="hellooooooooo" ]; then
  echo "Yep"
else
  echo "Nope"
fi

# String comparison evaluates to true
if [ "$sTest" = "hello" ]; then
  echo "Yep"
else
  echo "Nope"
fi
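To see why the spaces matter so much, it helps to count the arguments that [ actually receives. In this sketch, count_args is a hypothetical helper of mine that prints how many arguments it was given; the rest repeats the two comparisons with a value that should not match.

```shell
sTest="hello"

# Hypothetical helper: prints the number of arguments it receives.
count_args() { echo "$#"; }

joined_count=$(count_args "$sTest"="bye")    # one word: hello=bye
spaced_count=$(count_args "$sTest" = "bye")  # three words: hello, =, bye

# One non-empty argument: '[' only checks that it is not empty.
if [ "$sTest"="bye" ]; then joined="true"; else joined="false"; fi

# Three arguments: '[' performs a real string comparison.
if [ "$sTest" = "bye" ]; then spaced="true"; else spaced="false"; fi

echo "without spaces: $joined_count argument(s), result $joined"
echo "with spaces:    $spaced_count argument(s), result $spaced"
```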

[[ is not the fail-safe version of [

Unlike [, which is a program, the [[ construct is a part of the shell language. Some misguided fellows on the internet recommend that you replace all your [ evaluations with [[ ones. Do not follow this advice.

[[ is used for a more literal evaluation of text strings. You do not have to quote everything.

  • Word splitting and file name expansion are not performed. However, other forms of expansion, such as parameter expansion and command substitution, are performed.
  • The = operator is the same as the == operator.
  • The != and == operators compare the text expression on the left with a pattern on the right. If you quote the right-hand side, it is compared as a literal string instead.
  • A pattern is a text string containing at least one wildcard character (* or ?) or a bracket expression [..]. A bracket expression encloses a set of characters or a range of characters (separated by a hyphen (-)) between the square brackets ([ and ]).
  • A new =~ operator is available. (It cannot be used with [.) It compares the text expression on the left with a regular expression on the right. (It will exit with a return value of 2 if the regular expression is invalid.) The =~ operator is great for matching substrings.
# Matches substring ell
$ if [[ "Hello?" =~ ell ]]; then echo "Yes"; else echo "No"; fi

# Matches substring Hell at beginning
$ if [[ "Hello?" =~ ^Hell ]]; then echo "Yes"; else echo "No"; fi

# Does not match substring ? (a regex special character) at the end
$ if [[ "Hello?" =~ ?$ ]]; then echo "Yes"; else echo "No"; fi

# Matches substring ? at the end when quoted
$ if [[ "Hello?" =~ "?"$ ]]; then echo "Yes"; else echo "No"; fi
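The pattern operators deserve a small example of their own. This is a sketch that needs bash, and the file name report.txt and the variable names are assumptions for illustration. It compares the same text against an unquoted pattern and a quoted one, and then uses =~ with a capture group, which bash stores in its BASH_REMATCH array.

```shell
#!/bin/bash
name="report.txt"   # hypothetical file name

# Unquoted right-hand side: *.txt is treated as a pattern.
if [[ $name == *.txt ]]; then pattern_result="match"; else pattern_result="no match"; fi

# Quoted right-hand side: "*.txt" is compared as literal text.
if [[ $name == "*.txt" ]]; then literal_result="match"; else literal_result="no match"; fi

# =~ matches a regular expression; the text captured by the first
# pair of parentheses lands in BASH_REMATCH[1].
base=""
if [[ $name =~ ^(.*)\.txt$ ]]; then
  base=${BASH_REMATCH[1]}
fi

echo "pattern: $pattern_result, literal: $literal_result, base name: $base"
```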

Evaluations with [ and [[ have their legitimate use cases. Do not use one for the other (see the table below).

Operator use                  Result
[ -f file ]                   Does it exist as a file?
[ -d file ]                   Does it exist as a directory?
[ -h file ]                   Does it exist as a soft link?
[ -r file ]                   Is the file readable?
[ -w file ]                   Is the file writeable?
[ -z string ]                 Is the string empty?
[ -n string ]                 Is the string not empty?
[ string1 = string2 ]         Are the strings same? (= is same as ==)
[ string1 != string2 ]        Are the strings different?
[ n1 -eq n2 ]                 Are the numbers same?
[ n1 -ne n2 ]                 Are the numbers different?
[ n1 -le n2 ]                 Is n1 less than or equal to n2?
[ n1 -ge n2 ]                 Is n1 greater than or equal to n2?
[ n1 -lt n2 ]                 Is n1 less than n2?
[ n1 -gt n2 ]                 Is n1 greater than n2?
[ ! e ]                       Is the expression false?
[ e1 ] && [ e2 ]              Are both expressions true?
[ e1 ] || [ e2 ]              Is one of the expressions true?
[[ string1 = string2 ]]       Does string1 match the pattern string2? (= is same as ==)
[[ string1 == string2 ]]      Does string1 match the pattern string2?
[[ string1 != string2 ]]      Does string1 not match the pattern string2?
[[ string1 =~ string2 ]]      Does string1 match the regular expression string2?

Do not use -a and -o logical operators. You will make mistakes reading and writing them. They are the sh way of doing things. Square brackets and operators && and || are so bash.
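As a rough sketch of the recommended style, here are two small helper functions (the names in_range and is_yes are mine, purely for illustration) that combine separate [ commands with && and ||.

```shell
# Hypothetical helper: succeeds when $1 lies between 1 and 10 inclusive.
in_range() {
  [ "$1" -ge 1 ] && [ "$1" -le 10 ]
}

# Hypothetical helper: succeeds when $1 is either y or yes.
is_yes() {
  [ "$1" = "y" ] || [ "$1" = "yes" ]
}

for n in 0 7 11; do
  if in_range "$n"; then
    echo "$n is in range"
  else
    echo "$n is out of range"
  fi
done

if is_yes "yes"; then
  echo "confirmed"
fi
```

Each [ command is complete on its own, so it is easy to see where one test ends and the next begins.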

Arithmetic operations are not straightforward

If you set a=1 and then try a=a+1, does $a echo as 2 or 11? The answer is a+1. Until a few years back, I did not know how to perform arithmetic operations in bash. I never had to, so I never learned it. I just assumed that it must be the same as in other languages, but it was not to be. To add one to a, you can use:

let a=a+1

# or

a=$(( a+1 ))
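A short sketch of the difference follows, with variable names chosen only for illustration. It contrasts the plain assignment with arithmetic expansion, and adds integer division and remainder, which also work inside the double parentheses.

```shell
a=1
a=a+1
as_text=$a                # a plain assignment stores the text: a+1

a=1
a=$(( a + 1 ))
as_number=$a              # arithmetic expansion does the sum: 2

quotient=$(( 7 / 2 ))     # integer division drops the fraction: 3
remainder=$(( 7 % 2 ))    # remainder of the division: 1

echo "text: $as_text, number: $as_number"
echo "7 / 2 = $quotient, 7 % 2 = $remainder"
```

Shell arithmetic works on whole numbers only, so 7 / 2 does not give 3.5.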

Array operations can be cryptic

Does every language out there need to have a totally different method to create and use arrays? Who is so evil? Why?

# Creates an array
var=(hello world how are you)

# Displays hello
echo $var

# Displays world
echo ${var[1]}

# Changes hello to howdy
var[0]=howdy

# Displays howdy
echo ${var[0]}

# Displays values: howdy world how are you
echo ${var[@]}

# Displays values: howdy world how are you
echo ${var[*]}

# Displays indexes or keys: 0 1 2 3 4
echo ${!var[@]}

# Displays indexes or keys: 0 1 2 3 4
echo ${!var[*]}

# Displays w, the character at offset 2 of the first element (howdy)
echo ${var:2:1}

# Displays 5, the number of elements in the array
echo ${#var[@]}
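One more array habit is worth a sketch: quoting the expansion. The example below needs bash, and the array name words and the helper count_args are my own, for illustration only. It stores an element that contains spaces, appends another element, and shows how the quoted and unquoted forms differ.

```shell
#!/bin/bash
# The third element contains spaces, so it must be quoted when created.
words=(hello world "how are you")

# Appends one more element to the end of the array.
words+=(today)

# Hypothetical helper: prints the number of arguments it receives.
count_args() { echo "$#"; }

quoted=$(count_args "${words[@]}")    # each element stays one word
unquoted=$(count_args ${words[@]})    # elements are split on spaces

# Quoted expansion is the safe way to loop over the elements.
for w in "${words[@]}"; do
  echo "[$w]"
done

echo "elements: ${#words[@]}, quoted words: $quoted, unquoted words: $unquoted"
```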

Bash is a minefield of careless errors

A shell script will execute like there is no tomorrow, irrespective of any errors it encounters. If a statement encounters an error and exits with a non-zero exit code, bash is happy to display any error message it wants but will nonchalantly continue to execute the subsequent statements.

Figure 2: What seem like key words or programming constructs are actually programs

If you try to use an undefined variable, bash will not treat it as an error. Bash will substitute an empty string and proceed. If you try sudo rm -rf $non_existent_variable/, the command evaluates to sudo rm -rf /. I have not tried it yet so I cannot tell what protections Linux has.

These shell behaviours are extremely dangerous. To fail early, place the following statement at the top of your scripts:

set -eu

That is, after the #!/bin/bash comment.
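To get a feel for what the two options do, here is a harmless sketch. It only echoes the dangerous command rather than running it, and it assumes that bash is on your PATH and that no variable named undefined_dir exists in your environment (both the name and the variable names below are made up). The same two statements are run in a fresh bash three times: as is, with -e, and with -u.

```shell
# Two statements: a command that fails, then an echo that uses an
# undefined variable. Single quotes keep them unexpanded until bash runs them.
body='false; echo "rm -rf ${undefined_dir}/"'

# No options: the failure is ignored and the variable expands to nothing.
lenient=$(bash -c "$body" 2>&1) || true

# With -e: bash stops at the failed command, so the echo never runs.
strict_e=$(bash -c "set -e; $body" 2>&1) || true

# With -u: bash stops with an error at the undefined variable.
strict_u=$(bash -c "set -u; $body" 2>&1) || true

echo "no options: $lenient"
echo "set -e:     $strict_e"
echo "set -u:     $strict_u"
```

Keep in mind that bash ignores -e for commands tested by if, && or ||, so it is a safety net and not a replacement for checking exit codes.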

It is not possible to cover all Bash secrets in one article. In the next article, I will cover how Bash performs text expansions, substitutions and removals.


