Understanding Argparse – argument_parser.py
Argparse is a module in the standard library and will be used throughout this book to obtain user input. It helps in developing more complicated command-line interfaces. By default, argparse creates a -h (or --help) switch that displays help and usage information for the script. In this section, we will build a sample argparse implementation that has required, optional, and default arguments.
After our usual script description and print_function import, we import the argparse module. We then specify our usual script header details as __authors__, __date__, and __description__, as we will be using all three in our argparse implementation. On line 38, we define an overly simple main() function to print the parsed argument information, as we don't have any plans for this script other than to show off some neat user argument handling. To accomplish that goal, we first need to create our ArgumentParser class instance, as shown on lines 43 through 48. Notice that, because of the conditional on line 42, we only do this when the script is called from the command line.
On line 43, we initialize ArgumentParser with three optional arguments. The first is the description of the script, which we read in from the __description__ variable that was previously set. The second argument is the epilog, or details provided at the end of the help section. This can be any arbitrary text, as can the description field, though we chose to use it to provide authorship and version information. For getting started, using date values as a version number is helpful for user reference and prevents complications with numbering schemes. The last optional argument is a formatter specification, instructing our argument parser to display any default values set by the script so that the user can know whether options will be set if they do not modify them through an argument. It is highly recommended to include this as a matter of habit:
001 """Sample argparse example."""
002 from __future__ import print_function
003 import argparse
...
033 __authors__ = ["Chapin Bryce", "Preston Miller"]
034 __date__ = 20181027
035 __description__ = "Argparse command-line parser sample"
036
037
038 def main(args):
039     print(args)
040
041
042 if __name__ == '__main__':
043     parser = argparse.ArgumentParser(
044         description=__description__,
045         epilog='Built by {}. Version {}'.format(
046             ", ".join(__authors__), __date__),
047         formatter_class=argparse.ArgumentDefaultsHelpFormatter
048     )
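To see what the description, epilog, and formatter contribute, the following minimal sketch builds the same parser and renders its help text as a string with format_help() instead of printing it through -h. The prog value and the --length argument with its help text are assumptions added here for illustration; they are not part of the listing above.

```python
import argparse

__authors__ = ["Chapin Bryce", "Preston Miller"]
__date__ = 20181027
__description__ = "Argparse command-line parser sample"

parser = argparse.ArgumentParser(
    prog='argument_parser.py',  # assumed here so the usage line is stable
    description=__description__,
    epilog='Built by {}. Version {}'.format(
        ", ".join(__authors__), __date__),
    formatter_class=argparse.ArgumentDefaultsHelpFormatter
)
# Illustrative argument: the formatter appends the default to its help text.
parser.add_argument('--length', default=55, type=int, help='length to use')

# format_help() returns the same text that -h would print.
help_text = parser.format_help()
print(help_text)
```

Note that ArgumentDefaultsHelpFormatter only appends a default to arguments that define a help string, which is one more reason to document every argument.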
We can now leverage our newly instantiated parser object to add an argument specification. To start, let's discuss some healthy practices for required and optional arguments. Argparse, by default, uses the presence of one or two dashes prior to an argument name to note whether the argument should be considered optional or not. If the argument specification has a leading dash, it will be considered both optional and non-positional; the inverse, a lack of a leading dash, will instruct argparse to interpret an argument as required and positional.
Use the following as an example; in this script, the timezone and input_file arguments are required and must be provided in that order. Additionally, these two items do not require an argument specifier; instead, argparse will look for an unpaired value to assign to the timezone argument and then look for a second unpaired value to assign to the input_file argument. Conversely, the --source, --file-type, -h (or --help), and -l (or --log) arguments are non-positional and can be provided in any order, as long as the appropriate value immediately follows, that is, is paired with, the argument specifier.
To make things a little more complex, but more customizable, we can require non-positional arguments. This has an advantage, as we can now allow the user to enter the arguments in an arbitrary order, though as a disadvantage it requires additional typing for fields that are required for the script to operate. You'll notice in the following output that the --source argument on the usage line (the second line) is not surrounded by square brackets. This is argparse's (subtle) way of indicating that this is a required non-positional argument. It can be tricky for a user to understand this at first glance, though argparse will halt the execution of the script and alert the user if the argument is missing from the provided arguments. You may want to use non-positional required arguments in your scripts or avoid them altogether; it is up to you as the developer to find the most comfortable and fitting interface for your users:
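The ordering rules above can be tried without a terminal by handing parse_args() an explicit list in place of the real command line. This is a minimal sketch, assuming a reduced parser with only three of the arguments; the file names and values are made up for illustration.

```python
import argparse

# Reduced, hypothetical parser: two positional arguments, one optional.
parser = argparse.ArgumentParser(description='Positional vs. optional sketch')
parser.add_argument('timezone', help='timezone to apply')
parser.add_argument('input_file', help='path to the input file')
parser.add_argument('-l', '--log', help='Path to log file')

# The list stands in for sys.argv[1:]. The optional argument comes first here.
args = parser.parse_args(['--log', 'run.log', 'UTC', 'evidence.E01'])
print(args.timezone, args.input_file, args.log)

# Moving the optional argument to the end gives the same result, because
# only the relative order of the positional values matters.
same = parser.parse_args(['UTC', 'evidence.E01', '-l', 'run.log'])
print(same == args)
```

Swapping the two unpaired values, however, would silently assign the file name to timezone, which is the main risk of positional arguments.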
$ python argument_parser.py --help
usage: argument_parser.py [-h] --source SOURCE [-l LOG]
                          [--file-type {E01,RAW,Ex01}]
                          timezone input_file

Argparse command-line parser sample

positional arguments:
  timezone              timezone to apply
  input_file

optional arguments:
  -h, --help            show this help message and exit
  --source SOURCE       source information (default: None)
  -l LOG, --log LOG     Path to log file (default: None)
  --file-type {E01,RAW,Ex01}

Built by Chapin Bryce, Preston Miller. Version 20181027
Mini-tangent aside, let's start adding arguments to the parser object we created. We will start with one of the positional arguments we previously discussed. The timezone argument is defined using the add_argument() method, which takes a string representing the argument name and optional parameters for additional detail. On line 51, we simply offer helpful information to provide context for how this argument should be used:
050     # Add positional required arguments
051     parser.add_argument('timezone', help='timezone to apply')
The next argument we add, on line 54, is the non-positional required argument previously discussed. Notice how we use the required=True keyword argument to indicate that, regardless of the leading dashes, this argument is required for execution:
053     # Add non-positional required argument
054     parser.add_argument('--source',
055         help='source information', required=True)
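The effect of required=True can be observed by parsing an argument list that omits --source. In the minimal sketch below, the prog value and the sample source value are assumptions for illustration. Argparse reports the problem on standard error and raises SystemExit, which we catch here only so that the exit status can be inspected.

```python
import argparse

parser = argparse.ArgumentParser(prog='argument_parser.py')
parser.add_argument('--source', help='source information', required=True)

# Supplying the argument works as usual.
args = parser.parse_args(['--source', 'disk_image'])
print(args.source)

# Omitting it makes argparse print a usage error and exit.
try:
    parser.parse_args([])
except SystemExit as exc:
    exit_code = exc.code
    print('argparse exited with status', exit_code)
```

In a real script there is normally no need to catch SystemExit; letting argparse stop the script is the intended behavior.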
We now add our first non-positional and optional argument for the log file. Here, we are providing two options for how the user can specify the argument, -l or --log. This is recommended for common arguments, as it provides the frequent user shorthand and the novice user context for argument use:
057     # Add optional arguments, allowing shorthand argument
058     parser.add_argument('-l', '--log', help='Path to log file')
Not all arguments need to accept a value; in some instances, we just need a Boolean answer from the argument. Additionally, we may want to allow the argument to be specified multiple times or have custom functionality when called. To support this, the argparse library allows for the use of actions. The actions we will commonly use in this book are demonstrated as follows.
The first handy action is store_true, along with its opposite, store_false. These are useful for enabling or disabling functionality in your script. As shown in the following code block on lines 61 through 64, the action parameter specifies whether True or False should be stored when the argument is supplied; when the argument is omitted, the opposite value is stored. In this case, the two arguments are duplicative, and either one could be used on its own to determine whether the email in this example should be sent. Additional actions are available, such as append, as shown on lines 66 and 67, where each instance of an email address, in this example, will be added to a list that we can iterate through and use.
The last action example in the following code is used to count the number of times an argument is called. We see this implementation primarily for increasing verbosity or debugging messages, though it can be used elsewhere in the same fashion:
060     # Using actions
061     parser.add_argument('--no-email',
062         help='disable emails', action="store_false")
063     parser.add_argument('--send-email',
064         help='enable emails', action="store_true")
065     # Append values for each argument instance.
066     parser.add_argument('--emails',
067         help='email addresses to notify', action="append")
068     # Count the number of instances. i.e. -vvv
069     parser.add_argument('-v', help='add verbosity', action='count')
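To make the stored values concrete, here is a minimal sketch that parses two hand-written argument lists against the same four actions. The email addresses are placeholders. Note that dashes inside an argument name become underscores in the attribute name, so --no-email is read back as no_email.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--no-email', help='disable emails', action='store_false')
parser.add_argument('--send-email', help='enable emails', action='store_true')
parser.add_argument('--emails', help='email addresses to notify',
                    action='append')
parser.add_argument('-v', help='add verbosity', action='count')

# With nothing supplied, store_false holds True and store_true holds False;
# append and count hold None unless a default is given.
defaults = parser.parse_args([])
print(defaults)

# Each --emails instance is appended, and each v is counted.
args = parser.parse_args(['--send-email',
                          '--emails', 'a@example.com',
                          '--emails', 'b@example.com',
                          '-vvv'])
print(args.send_email, args.emails, args.v)
```

Because count starts at None rather than 0, passing default=0 to that argument can save a None check when comparing verbosity levels.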
The default keyword dictates the default value of an argument. We can also use the type keyword to store our argument as a certain object. Instead of being stuck with strings as our only input, we can now store the input directly as the desired object, such as an integer, and remove the need for user input conversions from our scripts:
071     # Defaults
072     parser.add_argument('--length', default=55, type=int)
073     parser.add_argument('--name', default='Alfred', type=str)
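The following minimal sketch shows both keywords at work: the default is used when the argument is absent, and type converts the raw string when it is present. The sample values are arbitrary. A value that cannot be converted is rejected by argparse before our own code ever sees it.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--length', default=55, type=int)
parser.add_argument('--name', default='Alfred', type=str)

# No arguments supplied: the defaults are stored.
defaults = parser.parse_args([])
print(defaults.length, defaults.name)

# The string '128' from the command line is converted to an integer.
args = parser.parse_args(['--length', '128'])
print(args.length + 1)

# A value that int() cannot convert triggers a usage error and an exit.
try:
    parser.parse_args(['--length', 'long'])
except SystemExit as exc:
    bad_length_status = exc.code
```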
Argparse can be used to directly open a file for reading or writing. On line 76, we open the required argument, input_file, in reading mode. By passing this file object into our main script, we can immediately begin to process our data of interest. This is repeated on the next line to handle opening a file for writing:
075     # Handling Files
076     parser.add_argument('input_file', type=argparse.FileType('r'))
077     parser.add_argument('output_file', type=argparse.FileType('w'))
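As a rough sketch of how these file objects behave, the example below creates a throwaway input file in a temporary directory, lets argparse open both files, and copies the content across in uppercase. The directory, file names, and sample text are assumptions made purely for this demonstration. The parsed attributes are already-open file objects, so closing them is our responsibility.

```python
import argparse
import os
import tempfile

parser = argparse.ArgumentParser()
parser.add_argument('input_file', type=argparse.FileType('r'))
parser.add_argument('output_file', type=argparse.FileType('w'))

# Prepare a temporary input file so the example is self-contained.
workdir = tempfile.mkdtemp()
in_path = os.path.join(workdir, 'input.txt')
out_path = os.path.join(workdir, 'output.txt')
with open(in_path, 'w') as handle:
    handle.write('sample evidence\n')

# argparse opens both paths and hands back file objects.
args = parser.parse_args([in_path, out_path])
contents = args.input_file.read()
args.output_file.write(contents.upper())

# The files were opened by argparse but must be closed by us.
args.input_file.close()
args.output_file.close()

with open(out_path) as handle:
    result = handle.read()
print(result)
```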
The last keyword we will discuss is choices, which takes a list of case-sensitive options that the user can select from. When the user calls this argument, they must then provide one of the valid options. For example, --file-type RAW would set the file-type argument to the RAW choice, as follows:
079     # Allow only specified choices
080     parser.add_argument('--file-type',
081         choices=['E01', 'RAW', 'Ex01'])
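The case sensitivity mentioned above is easy to confirm with a minimal sketch: RAW is accepted, while raw is rejected with a usage error. As before, SystemExit is caught only so that the exit status can be examined.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--file-type', choices=['E01', 'RAW', 'Ex01'])

# A listed choice is stored under the attribute name file_type.
args = parser.parse_args(['--file-type', 'RAW'])
print(args.file_type)

# The lowercase spelling is not in the list, so argparse exits.
try:
    parser.parse_args(['--file-type', 'raw'])
except SystemExit as exc:
    status = exc.code
    print('rejected with status', status)
```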
Finally, once we have added all of our desired arguments to our parser, we can parse the arguments. On line 84, we call the parse_args() method, which creates a Namespace object. To access, for example, the length argument that we created on line 72, we read the corresponding attribute of the Namespace object, such as arguments.length. On line 85, we pass our arguments into our main() function, which prints out all of the arguments in the Namespace object. We have the following code:
083     # Parsing arguments into objects
084     arguments = parser.parse_args()
085     main(arguments)
The attributes of this Namespace object may be assigned to variables for easier recall.
With the basics of the argparse module behind us, we can now build both simple and more advanced command-line interfaces for our scripts. This module is used extensively to provide command-line arguments for most of the code we will build. When running the following code with the --help switch, we should see our series of required and optional arguments for the script:
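A minimal sketch of that idea follows, assuming a parser with just the two default-valued arguments and an arbitrary sample name. Individual attributes can be copied into ordinary variables, and the built-in vars() function converts the whole Namespace into a dictionary when that is more convenient.

```python
import argparse

parser = argparse.ArgumentParser()
parser.add_argument('--length', default=55, type=int)
parser.add_argument('--name', default='Alfred', type=str)

arguments = parser.parse_args(['--name', 'Bruce'])

# Copy attributes into local variables for easier recall.
length = arguments.length
name = arguments.name
print(length, name)

# vars() exposes the Namespace as a plain dictionary.
as_dict = vars(arguments)
print(as_dict)
```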