When you write code, whether in Python or some other language, you probably expect that you or someone else will one day run it. A very common, useful, and powerful way to run your code is to call it from the command line. Writing code so that it runs from the command line is easy; in fact, you don’t have to do anything special to make it work there.
If you’ve spent even a trivial amount of time on the command line, particularly a UNIX shell, you may have noticed that when you run a utility, you are often able to specify various runtime parameters that change the utility’s runtime behavior. For example, if you’ve ever used the UNIX ls utility, you probably know that you can modify the way it outputs the files and directories that it matches by giving it a parameter, such as -l. These parameters are also called “options”.
Here is a plain ls command:
$ ls
ZConfig-2.6.0-py2.5.egg                     zdaemon-2.0.2-py2.5.egg
ZODB3-3.8.1b7-py2.5-macosx-10.5-i386.egg    zope.interface-3.4.1-py2.5-macosx-10.5-i386.egg
easy-install.pth                            zope.proxy-3.4.2-py2.5-macosx-10.5-i386.egg
setuptools-0.6c8-py2.5.egg                  zope.testing-3.6.0-py2.5.egg
setuptools.pth
And here is ls -l (or long listing):
$ ls -l
total 656
drwxr-xr-x   4 jmjones  staff     136 Sep 16 08:25 ZConfig-2.6.0-py2.5.egg
drwxr-xr-x  10 jmjones  staff     340 Sep 16 08:25 ZODB3-3.8.1b7-py2.5-macosx-10.5-i386.egg
-rw-r--r--   1 jmjones  staff     494 Sep 16 08:26 easy-install.pth
-rwxr-xr-x   1 jmjones  staff  324858 Jun 19 19:05 setuptools-0.6c8-py2.5.egg
-rw-r--r--   1 jmjones  staff      29 Sep 16 08:23 setuptools.pth
drwxr-xr-x   4 jmjones  staff     136 Sep 16 08:25 zdaemon-2.0.2-py2.5.egg
drwxr-xr-x   4 jmjones  staff     136 Sep 16 08:25 zope.interface-3.4.1-py2.5-macosx-10.5-i386.egg
drwxr-xr-x   4 jmjones  staff     136 Sep 16 08:25 zope.proxy-3.4.2-py2.5-macosx-10.5-i386.egg
drwxr-xr-x   4 jmjones  staff     136 Sep 16 08:25 zope.testing-3.6.0-py2.5.egg
Both runs of the ls utility listed the same files. However, ls -l listed more information than ls.
Have you ever wondered how you could create utilities with a rich command line interface of their own? This article will show you how to use Python to do just that.
First, you’ll need to understand how your scripts handle the command line arguments from a user. If I run the following script with some random command line arguments, it will show how Python treats those arguments. Here is the script.
#!/usr/bin/env python
import sys

# Show the raw argument list exactly as Python received it.
print("sys.argv", sys.argv)
And here is an example of running the script.
$ ./argv.py foo bar bam
sys.argv ['./argv.py', 'foo', 'bar', 'bam']
The command line arguments are treated as a list, which is accessible by referencing sys.argv. The first element of the list is the name of the script and the remaining elements are the arguments passed in after the script name. If I wanted to, I could figure out what options I wanted to take, iterate through sys.argv, and map what the user gave me to the options I wanted to handle. I could do that, but it would be wasted time.
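To make that cost concrete, here is a rough sketch of what the hand-rolled approach might look like for just two options. The function name parse_by_hand and the option names (-t/--true and -v/--value) are my own choices for illustration, and the sketch deliberately ignores forms such as --value=foo that a real utility would need to accept.

```python
def parse_by_hand(argv):
    """Hand-rolled parsing of a flag and a value option (illustration only)."""
    options = {'true': False, 'value': None}
    args = []
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg in ('-t', '--true'):
            # A flag: its presence alone switches the setting on.
            options['true'] = True
        elif arg in ('-v', '--value'):
            # An option that consumes the next element as its value.
            if i + 1 >= len(argv):
                raise ValueError('%s requires an argument' % arg)
            i += 1
            options['value'] = argv[i]
        else:
            # Anything unrecognized is treated as a positional argument.
            args.append(arg)
        i += 1
    return options, args


# In a real script you would pass sys.argv[1:] to skip the script name.
print(parse_by_hand(['-t', '-v', 'foo', 'bar']))
```

Even this small version needs its own error handling and has no help message, and every new option adds another branch.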
There is a better way. A much better way. It’s called optparse.
With optparse, you declaratively build up an option parser by adding the behavior you want it to exhibit when it sees various options. Here’s a simple example of a script that uses optparse.
#!/usr/bin/env python
from optparse import OptionParser

usage = "usage: %prog [options]"
parser = OptionParser(prog='testprog', usage=usage)

# A flag that takes no argument and stores True when present.
parser.add_option('-t', '--true', dest='true', action='store_true',
                  help='example of using store_true, default %default')
# An option that expects one argument and stores it.
parser.add_option('-v', '--value', dest='value', action='store',
                  help='example of using multiple arguments')
parser.set_defaults(true=False)

# With no arguments, parse_args() reads from sys.argv[1:].
options, args = parser.parse_args()
print('OPTIONS::', options)
print('ARGS::', args)
The basics of optparse are as follows:
- Create an instance of the OptionParser
- Add options to it that will handle various command line arguments passed in from the user
- Set any default values for any of the options
- Tell the parser to parse the arguments from sys.argv
In the previous script, I created an instance of the OptionParser class and named it parser. Then I added an option so that if the script is called with a -t or --true option, it will store a True value for that particular option attribute. I also added an option so that users can pass in either a -v or --value followed by an argument and have that argument stored for later use.
The action keyword in the add_option() method specifies how the value will be handled. store_true expects no arguments and will just set the specified dest attribute to True. store (which is the default if no action is specified) expects a value and will store that value in the dest attribute. Other actions include store_false, count, store_const, and append.
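As a small sketch of those other actions, the following parser uses one option for each of them. The program name and the option names (--quiet, --debug, --fast, --include) are hypothetical and chosen only for this illustration; the argument list is passed in directly so the result does not depend on how the script is invoked.

```python
from optparse import OptionParser

parser = OptionParser(prog='actions_demo')  # hypothetical program name

# store_false: the flag's presence sets the attribute to False.
parser.add_option('-q', '--quiet', dest='chatty', action='store_false',
                  default=True)
# count: each occurrence of the flag adds one to the attribute.
parser.add_option('-d', '--debug', dest='debug', action='count', default=0)
# store_const: the flag's presence stores a fixed constant.
parser.add_option('--fast', dest='mode', action='store_const',
                  const='fast', default='normal')
# append: each occurrence adds its argument to a list.
parser.add_option('-i', '--include', dest='includes', action='append')

options, args = parser.parse_args(
    ['-q', '-dd', '--fast', '-i', 'a', '-i', 'b', 'rest'])

print(options.chatty)    # False
print(options.debug)     # 2
print(options.mode)      # fast
print(options.includes)  # ['a', 'b']
print(args)              # ['rest']
```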
I process the results from the command line by calling parse_args() on the parser object. This method returns a tuple of two objects: an object that contains the options, and another that contains the arguments. The options are the result of how the script handles the -t, --true, -v, and --value that users may pass in. The options object is a pretty simple object. It stores the values from parsing the command line arguments based on the rules given to the option parser. You access the values by using standard dotted notation (options.value). The arguments object is simply a list of all the arguments not handled by one of the options.
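A quick way to see both halves of that tuple is to hand parse_args() an explicit list, which it then uses in place of sys.argv[1:]. This is only a sketch that mirrors the two options from the script above; the sample arguments are made up.

```python
from optparse import OptionParser

parser = OptionParser(prog='testprog')
parser.add_option('-t', '--true', dest='true', action='store_true',
                  default=False)
parser.add_option('-v', '--value', dest='value', action='store')

# An explicit list replaces sys.argv[1:], which is handy for experimenting.
options, args = parser.parse_args(['-t', '--value', 'foo', 'left', 'over'])

print(options.true)   # True
print(options.value)  # foo
print(args)           # ['left', 'over']
```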
Here is a series of calls to the script with the output of the options.
$ ./simple_sample.py
OPTIONS:: {'true': False, 'value': None}
ARGS:: []
$ ./simple_sample.py --true
OPTIONS:: {'true': True, 'value': None}
ARGS:: []
$ ./simple_sample.py -t
OPTIONS:: {'true': True, 'value': None}
ARGS:: []
$ ./simple_sample.py --value=foo
OPTIONS:: {'true': False, 'value': 'foo'}
ARGS:: []
$ ./simple_sample.py --value foo
OPTIONS:: {'true': False, 'value': 'foo'}
ARGS:: []
$ ./simple_sample.py -v foo
OPTIONS:: {'true': False, 'value': 'foo'}
ARGS:: []
Calling the script with no arguments resulted in the true attribute being set to False (since that was the default) and the value attribute being set to None (since there was no default for it). Passing in either --true or -t sets the true attribute to True. Passing in either --value=foo, --value foo, or -v foo sets the value attribute to foo.
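None of these runs show what the help strings are for. optparse uses them to build the message printed for -h or --help, an option it adds to the parser on its own. The sketch below rebuilds the same two options and asks the parser for that text with format_help() instead of exiting; I have left the output out because the exact wrapping depends on your terminal width.

```python
from optparse import OptionParser

parser = OptionParser(prog='testprog', usage='usage: %prog [options]')
parser.add_option('-t', '--true', dest='true', action='store_true',
                  help='example of using store_true, default %default')
parser.add_option('-v', '--value', dest='value', action='store',
                  help='example of using multiple arguments')
parser.set_defaults(true=False)

# format_help() returns the text that -h/--help would print.
# %prog is replaced by the program name and %default by the option's default.
help_text = parser.format_help()
print(help_text)
```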
Here is a more real-world example. This is a very basic find utility. It walks a directory tree looking for things that you specify on the command line.
#!/usr/bin/env python
from optparse import OptionParser
import os
import subprocess

file_types = ['f', 'd']


def match_files(file_list, dirpath, pattern, ls, file_type, show_type):
    """Print (or long-list) every name in file_list that contains pattern."""
    for f in file_list:
        if pattern in f:
            file_path = os.path.join(dirpath, f)
            if ls:
                # Pass the path as a list element so quotes or spaces
                # in a file name cannot break the command.
                subprocess.call(['ls', '-ld', file_path])
            else:
                if show_type:
                    print(file_type, end=' ')
                print(file_path)


usage = "usage: %prog [options]"
parser = OptionParser(prog='pyfind', usage=usage)
parser.add_option('-t', '--type', dest='file_type', action='append',
                  help='the type of file to search for, limited to %s'
                       % ', '.join(file_types),
                  choices=file_types)
parser.add_option('-n', '--name', dest='pattern', action='store',
                  help='match the file by name using the specified pattern')
parser.add_option('-l', '--ls', dest='list', action='store_true',
                  help='do a long file listing on each matching file')
parser.add_option('-s', '--show_type', dest='show_type', action='store_true',
                  help='show file type in output')
parser.add_option('-v', '--verbose', dest='verbose', action='store_true',
                  help='verbose (for debugging)')
parser.set_defaults(file_type=[], pattern='', list=False, show_type=False)

options, args = parser.parse_args()

# Every leftover argument is treated as a directory to walk.
for d in args:
    if options.verbose:
        print('Walking', d)
    for dirpath, dirs, files in os.walk(d):
        if options.verbose:
            print(dirpath, dirs, files)
        pattern = options.pattern
        ls = options.list
        file_type = options.file_type
        show_type = options.show_type
        if 'f' in options.file_type:
            match_files(files, dirpath, pattern, ls, 'f', show_type)
        if 'd' in options.file_type:
            match_files(dirs, dirpath, pattern, ls, 'd', show_type)
    if options.verbose:
        print('Done walking', d)
Here is an example of running this script in the same directory that all of the scripts in this article are contained in.
$ ./find.py --name py --type f .
./.argv.py.swp
./.find.py.swp
./.sample.py.swp
./.simple_sample.py.swp
./argv.py
./find.py
./sample.py
./simple_sample.py
This run looked for files that contain “py” in the name. One of the useful options on the UNIX find utility is -ls, which does an ls -l on each file that matches its find criteria. I built a similar functionality into this script:
$ ./find.py --name py --type f --ls .
-rw-------  1 jmjones  staff  12288 Oct 31 05:40 ./.argv.py.swp
-rw-r--r--  1 jmjones  staff  12288 Oct 31 06:00 ./.find.py.swp
-rw-r--r--  1 jmjones  staff  12288 Oct 31 05:58 ./.sample.py.swp
-rw-r--r--  1 jmjones  staff  12288 Oct 31 05:07 ./.simple_sample.py.swp
-rwxr-xr-x  1 jmjones  staff     60 Oct 30 22:32 ./argv.py
-rwxr-xr-x  1 jmjones  staff   1896 Oct 31 05:59 ./find.py
-rwxr-xr-x  1 jmjones  staff   1084 Oct 31 04:59 ./sample.py
-rwxr-xr-x  1 jmjones  staff    505 Oct 31 05:05 ./simple_sample.py
I’ll leave it as an exercise to the interested reader to see how I sent the options from the command line to the code that walks the directory tree. It’s really not that complicated.
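As a nudge for that exercise, here is the --type option in isolation. It combines the append action with a choices list, so the option can be repeated and each value is checked against the allowed set. This is a cut-down sketch rather than the full script: the sample arguments are made up, and I have left out set_defaults() so the parser can safely be reused for the second call.

```python
from optparse import OptionParser

file_types = ['f', 'd']

parser = OptionParser(prog='pyfind')
parser.add_option('-t', '--type', dest='file_type', action='append',
                  choices=file_types)

# Each occurrence of the option appends one validated value.
options, args = parser.parse_args(['--type', 'f', '-t', 'd', '.'])
print(options.file_type)  # ['f', 'd']
print(args)               # ['.']

# A value outside choices makes the parser print an error to stderr
# and raise SystemExit, which would normally end the script.
try:
    parser.parse_args(['--type', 'x'])
except SystemExit as exc:
    print('rejected with exit status', exc.code)
```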
Building usable, useful command line utilities is a skill worth honing. But if building tools in Python is what you’re interested in, don’t waste your time trying to come up with a new method for parsing command line arguments. I think I’ve shown that optparse in the Python standard library is more than sufficient for most command line utility tasks you’ll find yourself working on.
It will save you a headache and a bunch of redundant code.
This article was first published on EnterpriseITPlanet.com.