Linux and Mac system administrators are typically familiar with scripting via the terminal, but even Windows users can get in on the action with the Windows Subsystem for Linux.

How Bash Scripts Work

A bash script is simply a plain text file containing a series of commands that the bash shell can read and execute. Bash is the default shell on most Linux distributions and on macOS versions before Catalina.

If you’ve never worked with a shell script before, you should begin with the absolute simplest case. This will allow you to practice key concepts including the creation of the script and its execution.

First, create the following file in a convenient location (ideally, open a terminal and navigate to the desired directory first):
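A minimal sketch of such a file needs only two lines. The /bin/bash path in the first line is an assumption about where bash lives on your system:

```shell
#!/bin/bash
echo "Hello, World"
```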

The first line tells whatever runs this program how to run it (i.e. using the bash interpreter). The second is just a command like any other you might enter on the command line. Save that file as hello_world.sh, then:
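The two commands in question are most likely chmod followed by a direct invocation of the script, as sketched here. The opening guard is an addition for this sketch only: it recreates hello_world.sh if the file is missing, so the snippet can be pasted into any directory.

```shell
# Guard (an addition for this sketch): recreate the script if it is missing.
[ -f hello_world.sh ] || printf '%s\n' '#!/bin/bash' 'echo "Hello, World"' > hello_world.sh

chmod +x hello_world.sh   # mark the file as executable
./hello_world.sh          # run it by giving its path
```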

The chmod command makes the file executable, meaning that it can be run by typing its path, as the command after it does.

If you see the words “Hello, World” printed on a line in your terminal, everything is working as required.

How For Loops Work

In general programming, there are two main types of for loop: numeric and foreach. The numeric type is traditionally the most common, but in bash usage, it’s usually the other way round.

Numeric for loops typically focus on a single integer which determines how many iterations will be carried out, for example:
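To keep every example in one language, here is a sketch of such a loop written in bash's own C-style syntax (covered in more detail later in this article). The count variable is only there to make the number of iterations visible:

```shell
count=0
for (( i = 0; i < 100; i++ ))
do
    count=$((count + 1))
done
echo "Iterations: $count"   # Iterations: 100
```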

This is a familiar-looking for loop that will iterate exactly 100 times, unless i is altered within the loop, or another statement causes execution of the for loop to halt.

Foreach loops, in contrast, tend to operate on structures such as lists or arrays, and iterate for every item within that collection:
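As a rough illustration in bash, the loop below visits every element of an array. The fruits array and its contents are made up for this sketch:

```shell
fruits=("apple" "banana" "cherry pie")
for fruit in "${fruits[@]}"
do
    echo "$fruit"
done
# apple
# banana
# cherry pie
```

Quoting "${fruits[@]}" keeps an element that contains a space, such as "cherry pie", as a single item.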

Some languages use a slightly different syntax which swaps the order of collection and item: Python writes for item in collection, whereas PHP writes foreach ($collection as $item).

For in Loops

In bash, the foreach (or for in) loop is more common. The basic syntax is simply:
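The general shape is a variable name, the keyword in, a list of words, and a body between do and done. The words alpha, beta and gamma below are placeholders chosen for this sketch:

```shell
# for NAME in WORD ...; do COMMANDS; done
for word in alpha beta gamma
do
    echo "$word"
done
```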

For example, to iterate through three explicitly-named files:
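A sketch of that loop, assuming three files named one.c, two.c and three.c (hypothetical names). The first two lines are scaffolding that creates empty copies in a scratch directory, so the example can be run anywhere:

```shell
# Scaffolding: a scratch directory containing three empty files.
cd "$(mktemp -d)" || exit 1
touch one.c two.c three.c

for file in one.c two.c three.c
do
    ls "$file"
done
```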

If such files exist in the current directory, the script prints each of the three filenames on its own line.

Instead of a fixed set of files, the list can be obtained via a glob pattern (one including wildcards – special characters that represent other characters). In the following example, the for loop iterates across all files (in the current directory) whose names end in “.xml”:
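One way to write such a loop is sketched below. The filenames feed.xml, sitemap.xml and notes.txt are invented for the sketch and created in a scratch directory; only the two .xml files match the pattern:

```shell
# Scaffolding: a scratch directory with two XML files and one other file.
cd "$(mktemp -d)" || exit 1
touch feed.xml sitemap.xml notes.txt

for file in *.xml
do
    ls -l "$file"
done
```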

The output contains one ls line for each matching file.

This may look very much like a long-winded way of doing:
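The single-command version passes the glob pattern straight to ls. As before, the scratch directory and the hypothetical filenames are scaffolding for this sketch:

```shell
# Scaffolding: a scratch directory with two XML files.
cd "$(mktemp -d)" || exit 1
touch feed.xml sitemap.xml

ls -l *.xml
```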

But there’s a significant difference: the for loop executes the ls program once for every matching file (two separate times if two files match), with a single filename passed to it each time. In the separate ls example, the glob pattern (*.xml) matches filenames first and then sends all of them, as individual command-line parameters, to one instance of ls.

Here’s an example that uses the wc (word count) program to make the difference more obvious:
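A sketch of the single-command case, using wc -l to count lines. The two files and their contents are invented so that the numbers are predictable:

```shell
# Scaffolding: feed.xml gets 2 lines, sitemap.xml gets 3 lines.
cd "$(mktemp -d)" || exit 1
printf 'a\nb\n' > feed.xml
printf 'a\nb\nc\n' > sitemap.xml

# One run of wc receives both filenames, so it can add a "total" line (5 here).
wc -l *.xml
```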

The wc program counts the number of lines in each file separately, then prints a total count across all of them. In contrast, if wc operates within a for loop:
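The looped version might look like the following sketch, again with two invented files of 2 and 3 lines:

```shell
# Scaffolding: feed.xml gets 2 lines, sitemap.xml gets 3 lines.
cd "$(mktemp -d)" || exit 1
printf 'a\nb\n' > feed.xml
printf 'a\nb\nc\n' > sitemap.xml

# Each iteration starts a fresh wc that only ever sees one file.
for file in *.xml
do
    wc -l "$file"
done
```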

You’ll still see the count for each file, but there is no overall summary total because wc is run in isolation each time the loop iterates.

When a List is Not a List

There’s an easy and common mistake to make when dealing with for loops, due to the way bash handles quoted arguments/strings. Looping through a list of files should be done like this:
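A sketch of the correct form, with each filename written as a separate, unquoted word. The names are hypothetical, and echo stands in for whatever command you would run on each file:

```shell
for file in one.c two.c
do
    echo "file: $file"
done
# file: one.c
# file: two.c
```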

Not like this:
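The mistaken form, sketched with the same hypothetical names, wraps the whole list in one pair of double-quotes:

```shell
for file in "one.c two.c"
do
    echo "file: $file"
done
# file: one.c two.c
```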

The second example encloses the filenames in double-quotes, which results in a list with just a single parameter; the for loop will only execute one time. This problem can be avoided by using a variable in such cases:
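A sketch of the variable-based approach follows. The variable name FILES is an arbitrary choice for this example:

```shell
FILES="one.c two.c"
for file in $FILES
do
    echo "file: $file"
done
# file: one.c
# file: two.c
```

Because $FILES is left unquoted in the for line, bash splits its value on whitespace and the loop runs once per filename. This also means the technique does not suit filenames that contain spaces.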

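Note that the variable declaration itself does need to enclose its value in double-quotes, while the reference to the variable in the for loop must be left unquoted so that bash splits it into separate words.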

For Without a List

With nothing to iterate through, a for loop operates on whatever command-line arguments were provided to the script when invoked. For example, if you have a script named args.sh containing the following:
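A minimal sketch of args.sh, where the loop names a variable (a) but gives no list:

```shell
#!/bin/bash
# No "in ..." part: the loop walks over the script's command-line arguments.
for a
do
    echo "$a"
done
```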

Then executing args.sh will give you the following:
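For instance, with three made-up arguments, one of which contains a space. The guard on the first line is an addition for this sketch: it recreates args.sh if the file is missing, so the snippet runs on its own.

```shell
# Guard (an addition for this sketch): recreate args.sh if it is missing.
[ -f args.sh ] || printf '%s\n' '#!/bin/bash' 'for a' 'do' '    echo "$a"' 'done' > args.sh

bash args.sh one "two words" three
# one
# two words
# three
```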

Bash recognizes this case and treats for a; do as the equivalent of for a in "$@"; do, where "$@" is a special parameter that expands to the command-line arguments, each kept as a separate word.

Emulating a Traditional Numeric For Loop

Bash scripts often deal with lists of files or lines of output from other commands, so the for in type of loop is common. However, the traditional C-style operation is still supported:
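A sketch of the C-style form, using the same values (i=1, i<=5, i++) that the description in this section refers to:

```shell
for (( i=1; i<=5; i++ ))
do
    echo "$i"
done
# prints the numbers 1 to 5, one per line
```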

This is the classic form with three parts in which:

a variable is initialised (i=1) when the loop is first encountered; the loop continues so long as the condition (i<=5) is true; and each time around the loop, the variable is incremented (i++).

Iterating between two values is a common enough requirement that there’s a shorter, slightly less confusing alternative:
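The shorter form relies on brace expansion. A sketch for the range 1 to 5:

```shell
for i in {1..5}
do
    echo "$i"
done
# prints the numbers 1 to 5, one per line
```

Brace expansion happens before variables are expanded, so the two ends of the range have to be literal numbers rather than variables.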

The brace expansion that takes place effectively translates the above for loop into:
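Written out by hand, the loop for the range 1 to 5 would read as follows:

```shell
for i in 1 2 3 4 5
do
    echo "$i"
done
```

Running echo {1..5} on its own is a quick way to see what a brace expansion produces.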

Finer Loop Control With Break and Continue

More complex for loops often need a way of exiting early or immediately restarting the main loop with the next value in turn. To do so, bash borrows the break and continue statements that are common in other programming languages. Here’s an example that uses both to find the first file that’s more than 100 characters long:
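One possible version of such a script is sketched below. The scratch directory, its contents and the found variable are additions for this sketch. It also assumes that wc -c is an acceptable measure of length: wc -c counts bytes, which matches the character count for plain ASCII text.

```shell
#!/bin/bash
# Scaffolding: a scratch directory holding one subdirectory, one short file
# and one file that is longer than 100 bytes.
cd "$(mktemp -d)" || exit 1
mkdir a_dir
printf 'short\n' > b_small.txt
printf '%150s\n' '' > c_big.txt

found=""
for file in *
do
    # Skip anything that is not a regular file (a directory, for instance).
    if [ ! -f "$file" ]
    then
        echo "$file is not a regular file"
        continue
    fi

    num_chars=$(wc -c < "$file")
    echo "$file is $(( num_chars )) characters long"

    # Stop at the first file over the limit.
    if (( num_chars > 100 ))
    then
        found=$file
        echo "Found $file"
        break
    fi
done
```

With these three entries, the loop skips a_dir, reports b_small.txt as too short, and stops at c_big.txt.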

The for loop here operates on all files in the current directory. If the file is not a regular file (e.g. if it’s a directory), the continue statement is used to restart the loop with the next file in turn. If it’s a regular file, the second conditional block will determine if it contains more than 100 characters. If so, the break statement is used to immediately leave the for loop (and reach the end of the script).

Conclusion

A bash script is a file containing a set of instructions that can be executed. A for loop allows part of a script to be repeated many times. With the use of variables, external commands, and the break and continue statements, bash scripts can apply more complex logic and carry out a wide range of tasks.