Parsing Command Line Options in Shell Scripts

In programs written in C, command line argument parsing is traditionally done with the getopt(3) library function. This function set the conventions Linux/Unix users have come to expect from command line interfaces. Fortunately, there’s a getopt(3) equivalent for almost every programming language, and the shell is no exception.

The getopts command, available in all POSIX-compliant Bourne shell derivatives (like bash, dash, or ksh), provides convenient command line parsing capabilities. It is a builtin that accepts POSIX-style argument lists (as opposed to GNU-style, which is a bit more fancy) and should not be confused with the external getopt utility.

For my shell scripts, I use the following template to implement command line parsing:

#! /bin/sh

USAGE="Usage: `basename $0` [-hv] [-o arg] args"

# Parse command line options.
while getopts hvo: OPT; do
    case "$OPT" in
        h)
            # Quoted so that "[-hv]" is not expanded as a glob pattern.
            echo "$USAGE"
            exit 0
            ;;
        v)
            echo "`basename $0` version 0.1"
            exit 0
            ;;
        o)
            OUTPUT_FILE=$OPTARG
            ;;
        \?)
            # getopts issues an error message
            echo "$USAGE" >&2
            exit 1
            ;;
    esac
done

# Remove the switches we parsed above.
shift `expr $OPTIND - 1`

# We want at least one non-option argument. 
# Remove this block if you don't need it.
if [ $# -eq 0 ]; then
    echo "$USAGE" >&2
    exit 1
fi

# Access additional arguments as usual through 
# variables $@, $*, $1, $2, etc. or using this loop:
for PARAM; do
    echo "$PARAM"
done

# EOF

It is easy to add more command line switches. If you want to add an -x switch requiring an argument (-x arg) and a -y flag without an argument, change the getopts call to getopts hvo:x:y OPT and add two case labels to the loop. Note that a colon after a letter indicates that the option requires an argument.
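
As a rough sketch of that extension, the loop could look like the following. The function name parse_opts and the variables X_VALUE and Y_FLAG are hypothetical names chosen for illustration; wrapping the loop in a function is only done here so the example can be called several times.

```shell
#!/bin/sh
# Sketch: the template's getopts loop extended with -x arg and -y.
# parse_opts, X_VALUE and Y_FLAG are hypothetical names.

parse_opts() {
    OUTPUT_FILE=""
    X_VALUE=""
    Y_FLAG=0
    OPTIND=1    # reset so the function can be called more than once
    while getopts hvo:x:y OPT; do
        case "$OPT" in
            h|v) : ;;                   # handle as in the template
            o)   OUTPUT_FILE=$OPTARG ;;
            x)   X_VALUE=$OPTARG ;;     # -x requires an argument
            y)   Y_FLAG=1 ;;            # -y is a plain flag
            \?)  return 1 ;;            # getopts already printed an error
        esac
    done
    shift $(($OPTIND - 1))
    echo "o=$OUTPUT_FILE x=$X_VALUE y=$Y_FLAG rest=$*"
}

parse_opts -y -x foo -o out.txt a b
```

The last line prints the parsed values followed by the remaining non-option arguments.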


14 Responses to Parsing Command Line Options in Shell Scripts

  1. Pingback: A srcipt for running processes in parallel in Bash « Pebbles in the Sand

  2. meowsqueak says:

    Useful, thanks.

  3. mark foltz says:

    thanks, this was a great help. beats digging through the man pages.

  4. Pingback: So Much for 2010 « Unmaintainable

  5. Le Hong says:

    This is very helpful.
    Could you explain what these 3 lines do?
    Thanks

    for PARAM; do
    echo $PARAM
    done

    • mafr says:

      That’s a shorthand form of the for loop. Expanded it looks like this:

      for PARAM in "$@"; do
      echo $PARAM
      done

      It iterates over the arguments given to the shell script, or at least over those that haven’t already been removed by the “shift” call above. If you call the script as “myscript -o out -v foo bar”, then it will print “foo” and “bar”.

  6. Thomson says:

    Such a command line parser works correctly as long as users pass all optional arguments
    before mandatory arguments, i.e. “script -f -g -o foo bar”. Unfortunately, some users tend
    to write, for instance, “script -f bar -g -o foo”. In this case, a command line parser as shown
    above fails. It can easily be demonstrated by changing
    o)
    OUTPUT_FILE=$OPTARG
    ;;
    to
    o)
    OUTPUT_FILE=$OPTARG
    echo "output file: ${OUTPUT_FILE}"
    ;;
    in the above code example. If called correctly, you will get:
    $> ./script -o my_output foo bar
    output file: my_output
    foo
    bar

    If called by a lazy user, you will get:
    $> ./script foo -o my_output bar
    foo
    -o
    my_output
    bar

    and the internal variable isn’t set correctly because the parser while loop exited prematurely.
    From experience I wouldn’t recommend such a command line parser implemented in the way
    shown here (although it is a sort of standard example shown on the internet).

    • mafr says:

      Well, the behavior you describe shouldn’t be surprising; it has been the standard way of parsing command lines on UNIX for more than 30 years. From the Single Unix Specification: “Any of the following identifies the end of options: the special option --, finding an argument that does not begin with a “-”, or encountering an error.”

      In your example, “foo” clearly marks the “end of options” according to the spec. You’ll find lots of tools that behave this way, even though on Linux people got used to GNU-style parsing which works differently (like I wrote, GNU-style getopt is a bit more fancy, but it’s not compatible). Inexperienced users may be surprised at first, but if you don’t use a tool correctly, you shouldn’t blame the tool.

      Do you have a better suggestion other than writing non-portable scripts or giving up on shell scripts altogether? :)

      • Thomson says:

        Sure, it may be “the standard”, but as I mentioned in my posting, from experience (and I have quite a bit of that) I would recommend a more robust solution. There are far too many lazy / ignorant users out there. :-) Have a look, for instance, at http://code.google.com/p/shflags/, a portable shell command line parser implementation.
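
The end-of-options behavior discussed in this thread can be reproduced with a small sketch. It assumes the -f, -g and -o switches from the example command lines above; count_opts and SEEN are hypothetical names, not part of the template.

```shell
#!/bin/sh
# Sketch: getopts stops at the first non-option argument or at "--".
# count_opts and SEEN are hypothetical names.

count_opts() {
    OPTIND=1    # reset so the function can be called more than once
    SEEN=""
    while getopts fgo: OPT; do
        case "$OPT" in
            \?) return 1 ;;
            *)  SEEN="$SEEN$OPT" ;;   # record each option letter parsed
        esac
    done
    shift $(($OPTIND - 1))
    echo "options=$SEEN remaining=$*"
}

count_opts -f -g -o foo bar    # all three options are parsed
count_opts -f bar -g -o foo    # parsing stops at "bar"
count_opts -f -- -g bar        # parsing stops at "--"
```

In the second call everything from “bar” onward is left in the positional parameters, which is why -g and -o are never seen by the loop.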

  7. evilalmus says:

    Thank you for your help, this has really helped with the scripts I am working on right now.
    I could still use a little help if you have time.
    The section:
    for PARAM; do
    echo $PARAM
    done
    is giving me a little trouble as well. I have had no problem modifying the rest of this to fit my needs; however, I would like to limit users to exactly one non-option argument and then store that argument in a variable.
    Separately (in a second script), I would like to allow more arguments. However, when I assign PARAM to a variable to use outside the loop, I only keep the value from the last iteration.

    • mafr says:

      For your first script, change the if statement, throw away the for loop and assign the first positional parameter to a variable.

      if [ $# != 1 ]; then
      echo "$USAGE" >&2
      exit 1
      fi
      VALUE=$1

      For your second script: If you don’t want to loop over parameters, just access them via $1, $2, $3 etc.:

      FIRST=$1
      SECOND=$2

      • evilalmus says:

        @mafr Thank you, that makes total sense. I was overlooking the easy answers while trying to use the script above as it is. I will work on this today, but it shouldn’t be a problem now :)

  8. I found this useful whilst fixing some of my scripts to remove bashisms. I have a couple of suggestions; I’m fairly sure these are POSIX-compliant:
    1) Prefer $(basename $0) over `basename $0` (it can be nested, and it’s easier on the eyes)
    2) Prefer $(($OPTIND - 1)) over `expr $OPTIND - 1` (no need to fork a new process)

    • Matthias says:

      Yup, both changes are POSIX-compliant, so you can use them safely in portable shell scripts. My syntax actually dates back to Unix Version 7 from 1979 :)
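
The two substitutions from this last suggestion can be sketched as follows. The helper names usage_line, parent_dir_name and shift_count are hypothetical and exist only to make the forms easy to try out; in a real script, usage_line would be called with "$0".

```shell
#!/bin/sh
# Sketch of $(...) and $((...)); all helper names are hypothetical.

# $(...) in place of backticks when building the usage string.
usage_line() {
    echo "Usage: $(basename "$1") [-hv] [-o arg] args"
}

# $(...) nests without extra escaping: print the name of the
# directory that contains the given path.
parent_dir_name() {
    basename "$(dirname "$1")"
}

# $((...)) in place of expr: number of arguments to shift for a
# given OPTIND value, computed by the shell itself.
shift_count() {
    echo $(($1 - 1))
}

usage_line /opt/bin/myscript
parent_dir_name /usr/local/bin/tool
shift_count 4
```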
