Sign In/My Account | View Cart  
advertisement


Listen Print Discuss

Option and Configuration Processing Made Easy

by Jon Allen
July 12, 2007

When you first fire up your editor and start writing a program, it's tempting to hardcode any settings or configuration so you can focus on the real task of getting the thing working. But as soon as you have users, even if the user is only yourself, you can bet there will be things they want to choose for themselves.

A search on CPAN reveals almost 200 different modules dedicated to option processing and handling configuration files. By anyone's standards that's quite a lot, certainly too many to evaluate each one.

Luckily, you already have a great module right in front of you for handling options given on the command line: Getopt::Long, which is a core module included as standard with Perl. This lets you use the standard double-dash style of option names:

myscript --source-directory "/var/log/httpd" --verbose \ --username=JJ

Using Getopt::Long

When your program runs, any command-line arguments will be in the @ARGV array. Getopt::Long exports a function, GetOptions(), which processes @ARGV to do something useful with these arguments, such as set variables or run blocks of code. To allow specific option names, pass a list of option specifiers in the call to GetOptions() together with references to the variables in which you want the option values to be stored.

As an example, the following code defines two options, --run and --verbose. The call to GetOptions() will then assign the value 1 to the variables $run and $verbose respectively if the relevant option is present on the command line.

use Getopt::Long;
my ($run,$verbose);
GetOptions( 'run'     => \$run,
             'verbose' => \$verbose );

When Getopt::Long has finished processing options, any remaining arguments will remain in @ARGV for your script to handle (for example, specified filenames). If you use this example code and call your script as:

myscript --run --verbose file1 file2 file3

then after GetOptions() has been called the @ARGV array will contain the values file1, file2, and file3.

Types of Command-Line Options

The option specifier provided to GetOptions() controls not only the option name, but also the option type. Getopt::Long gives a lot of flexibility in the types of option you can use. It supports Boolean switches, incremental switches, options with single values, options with multiple values, and even options with hash values.

Some of the most common specifiers are:

name     # Presence of the option will set $name to 1
name!    # Allows negation, e.g. --name will set $name to 1,
         #    --noname will set $name to 0
name+    # Increments the variable each time the option is found, e.g.
         # if $name = 0 then --name --name --name will set $name to 3
name=s   # String value required
         #    --name JJ or --name=JJ will set $name to JJ
         # Spaces need to be quoted
         #    --name="Jon Allen" or --name "Jon Allen"

So, to create an option that requires a string value, format the call to GetOptions() like this:

my $name;
GetOptions( 'name=s' => \$name );

The value is required. If the user omits it, as in:

myscript --name

then the call to GetOptions() will die() with an appropriate error message.

Options with Multiple Values

The option specifier consists of four components: the option name; data type (Boolean, string, integer, etc.); whether to expect a single value, a list, or a hash; and the minimum and maximum number of values to accept. To require a list of string values, build up the option specifier:

Option name:   name
Option value:  =s    (string)
Option type:   @     (array)
Value counter: {1,}  (at least 1 value required, no upper limit)

Putting these all together gives:

my $name;
GetOptions('name=s@{1,}' => \$name);

Now invoking the script as:

myscript --name Barbie Brian Steve

will set $name to the array reference ['Barbie','Brian','Steve'].

Giving a hash value to an option is very similar. Replace @ with % and on the command line give arguments as key=value pairs:

my $name;
GetOptions('name=s%{1,}',\$name);

Running the script as:

myscript --name Barbie=Director JJ=Member

will store the hash reference { Barbie => 'Director', JJ => 'Member' } in $name.

Storing Options in a Hash

By passing a hash reference as the first argument to GetOptions, you can store the complete set of option values in a hash instead of defining a separate variable for each one.

my %options;
GetOptions( \%options, 'name=s', 'verbose' );

Option names will be hash keys, so you can refer to the name value as $options{name}. If an option is not present on the command line, then the corresponding hash key will not be present.

Options that Invoke Subroutines

A nice feature of Getopt::Long is that, as an alternative to simply setting a variable when an option is found, you can tell the module to run any code of your choosing. Instead of giving GetOptions() a variable reference to store the option value, pass either a subroutine reference or an anonymous code reference. This will then be executed if the relevant option is found.

GetOptions( version => sub{ print "This is myscript, version 0.01\n"; exit; }
            help    => \&display_help );

When used in this way, Getopt::Long also passes the option name and value as arguments to the subroutine:

GetOptions( name => sub{ my ($opt,$value) = @_; print "Hello, $value\n"; } );

You can still include code references in the call to GetOptions() even if you use a hash to store the option values:

my %options;
GetOptions( \%options, 'name=s', 'verbose', 'dest=s',
            'version' => sub{ print "This is myscript, version 0.01\n"; exit; } );

Dashes or Underscores?

If you need to have option names that contain multiple words, such as a setting for "Source directory," you have a few different ways to write them:

--source-directory
--source_directory
--sourcedirectory

To give a better user experience, Getopt::Long allows option aliases to allow either format. Define an alias by using the pipe character (|) in the option specifier:

my %options;
GetOptions( \%options, 'source_directory|source-directory|sourcedirectory=s' );

Note that if you're storing the option values in a hash, the first option name (in this case, source_directory) will be the hash key, even if your user gave an alias on the command line.

Pages: 1, 2, 3

Next Pagearrow