This article shows you how to build polished Python command-line interfaces that improve your team’s productivity and comfort.
As Python developers, we use and write command-line interfaces all the time. On my Data Science projects, for example, I run several scripts from the command line to train my models and to compute the accuracy of my algorithms.
This is why a good way to improve your productivity is to make your scripts as handy and straightforward as possible, especially when several developers work on the same project.
To achieve that, I advise you to follow 4 guidelines:
- You should provide default argument values when possible
- All error cases should be handled (e.g. a missing argument, a wrong type, a file not found)
- All arguments and options have to be documented
- A progress bar should be printed for tasks that are not instantaneous
Let’s try to apply these rules to a simple, concrete example: a script to encrypt and decrypt messages using the Caesar cipher.
Imagine that you already have an encrypt function (implemented as below), and you want to create a simple script that lets you encrypt and decrypt messages. We want to let the user choose the mode, encryption (the default) or decryption, and the key (1 by default) with command-line arguments.
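A minimal sketch of such an encrypt function could look like this. It assumes that only ASCII letters are shifted (with wrap-around and case preserved) and that every other character is left untouched; the signature `encrypt(text, key)` is an assumption too, not necessarily the one used in the original script.

```python
def encrypt(text, key):
    """Caesar-shift ASCII letters by `key`; other characters are unchanged."""
    shifted = []
    for char in text:
        if char.isascii() and char.isalpha():
            # Shift within the alphabet of the same case, wrapping around.
            base = ord("a") if char.islower() else ord("A")
            char = chr((ord(char) - base + key) % 26 + base)
        shifted.append(char)
    return "".join(shifted)
```

With this sketch, decrypting is simply encrypting with the opposite key: `encrypt(encrypt("Hello", 5), -5)` gives back `"Hello"`.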
The first thing our script needs to do is to get the values of command line arguments. And when I google “python command line arguments”, the first result I get is about sys.argv. So let’s try to use this method…
The “beginners” method
sys.argv is a list containing all the arguments typed by the user when running your script (including the script name itself).
For example, if I type:
> python caesar_script.py --key 23 --decrypt my secret message
pb vhfuhw phvvdjh
the list contains:
['caesar_script.py', '--key', '23', '--decrypt', 'my', 'secret', 'message']
So we would loop over this argument list, looking for '--key' (or '-k') to get the key value, and for '--decrypt' to switch to decryption mode (which simply means using the opposite of the key as the key).
Our script would end up looking like this:
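As a rough illustration, here is a condensed sketch of hand-rolled parsing over sys.argv. It is not the original 39-line script: the `main(argv)` function, the returned exit codes and the `encrypt` helper are assumptions made for this example.

```python
import sys

USAGE = "Usage: python caesar.py [ --key <key> ] [ --encrypt|decrypt ] <text>"


def encrypt(text, key):
    """Caesar-shift ASCII letters by `key` (assumed helper for this sketch)."""
    shifted = []
    for char in text:
        if char.isascii() and char.isalpha():
            base = ord("a") if char.islower() else ord("A")
            char = chr((ord(char) - base + key) % 26 + base)
        shifted.append(char)
    return "".join(shifted)


def main(argv):
    """Parse `argv` (sys.argv without the script name); return an exit code."""
    key = 1          # default key
    decrypt = False  # default mode is encryption
    words = []
    args = iter(argv)
    for arg in args:
        if arg in ("--key", "-k"):
            try:
                key = int(next(args))  # the key is the next argument
            except (StopIteration, ValueError):
                print(USAGE)  # missing or non-integer key
                return 1
        elif arg in ("--decrypt", "-d"):
            decrypt = True
        elif arg in ("--encrypt", "-e"):
            decrypt = False
        elif arg.startswith("-"):
            print(USAGE)  # unknown option
            return 1
        else:
            words.append(arg)
    if not words:
        print(USAGE)  # no input text provided
        return 1
    print(encrypt(" ".join(words), -key if decrypt else key))
    return 0
```

To use it as a script, you would call `sys.exit(main(sys.argv[1:]))` under an `if __name__ == "__main__":` guard.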
This script more or less respects the recommendations stated above:
- There is a default key value and a default mode
- Basic error cases are handled (no input text provided or unknown arguments)
- Succinct documentation is printed in these error cases, and when the script is called with no arguments:
Usage: python caesar.py [ --key <key> ] [ --encrypt|decrypt ] <text>
However, this version of the Caesar script is quite long (39 lines, which doesn’t even include the logic of the encryption itself) and ugly.
There has to be a better way to parse command line arguments…
What about argparse?
argparse is the Python standard library module for parsing command-line arguments.
Let us see what our Caesar script would look like using argparse:
This respects our guidelines, and provides more precise documentation and more interactive error handling than the handmade script above:
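Here is a sketch of what such a script might look like. The option names follow the usage message printed by the script; the `build_parser` and `main` function names and the `encrypt` helper are assumptions made for this example.

```python
import argparse


def encrypt(text, key):
    """Caesar-shift ASCII letters by `key` (assumed helper for this sketch)."""
    shifted = []
    for char in text:
        if char.isascii() and char.isalpha():
            base = ord("a") if char.islower() else ord("A")
            char = chr((ord(char) - base + key) % 26 + base)
        shifted.append(char)
    return "".join(shifted)


def build_parser():
    """Declare the command-line arguments and options."""
    parser = argparse.ArgumentParser()
    # --encrypt and --decrypt cannot be used together.
    group = parser.add_mutually_exclusive_group()
    group.add_argument("-e", "--encrypt", action="store_true")
    group.add_argument("-d", "--decrypt", action="store_true")
    parser.add_argument("text", nargs="*")
    parser.add_argument("-k", "--key", type=int, default=1)
    return parser


def main(argv=None):
    """Parse `argv` (defaults to sys.argv[1:]) and print the result."""
    args = build_parser().parse_args(argv)
    key = -args.key if args.decrypt else args.key
    print(encrypt(" ".join(args.text), key))
```

Calling `main()` under an `if __name__ == "__main__":` guard turns this into a script. On an unknown option such as `--encode`, argparse prints the usage message and exits with status 2 on its own.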
> python caesar_script_using_argparse.py --encode My message
usage: caesar_script_using_argparse.py [-h] [-e | -d] [-k KEY] [text [text ...]]
caesar_script_using_argparse.py: error: unrecognized arguments: --encode
> python caesar_script_using_argparse.py --help
usage: caesar_script_using_argparse.py [-h] [-e | -d] [-k KEY] [text [text ...]]
positional arguments:
text
optional arguments:
-h, --help show this help message and exit
-e, --encrypt
-d, --decrypt
-k KEY, --key KEY
However, regarding the code, I find (and this is a bit subjective) that the first lines of my function, where the arguments are defined, are not very elegant: they are too verbose and programmatic, whereas the same thing could be expressed in a more compact and declarative way.
Do better with click!
We’re in luck: there is a Python library which offers the same features as argparse (and more), with a prettier code style: its name is click.
Here is the third version of our Caesar script, using click:
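A sketch of what this click version might look like follows. The option declarations mirror the behavior described in this article; the `caesar` function name, the explicit `default=False` on the flag and the `encrypt` helper are assumptions made for this example.

```python
import click


def encrypt(text, key):
    """Caesar-shift ASCII letters by `key` (assumed helper for this sketch)."""
    shifted = []
    for char in text:
        if char.isascii() and char.isalpha():
            base = ord("a") if char.islower() else ord("A")
            char = chr((ord(char) - base + key) % 26 + base)
        shifted.append(char)
    return "".join(shifted)


@click.command()
@click.argument("text", nargs=-1)  # any number of words, received as a tuple
@click.option("--decrypt/--encrypt", "-d/-e", default=False)
@click.option("--key", "-k", default=1)  # the int default makes click parse an integer
def caesar(text, decrypt, key):
    text_string = " ".join(text)
    if decrypt:
        key = -key  # decrypting is encrypting with the opposite key
    click.echo(encrypt(text_string, key))
```

To run it as a script, call `caesar()` under an `if __name__ == "__main__":` guard.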
Notice that the arguments and options are now declared in decorators, which makes them directly accessible as parameters of my function.
Let me clarify some subtleties in the above code:
- The nargs parameter of a script argument specifies the number of successive words expected for this argument (with “a quoted string like this” counting as 1 word). The default value is 1. Here, nargs=-1 allows providing any number of words.
- The notation --encrypt/--decrypt defines mutually exclusive options (as with the add_mutually_exclusive_group function from argparse), which result in a single boolean parameter.
- click.echo is a small utility provided by the library, which does the same as print but is compatible with both Python 2 and 3, and has some extra features (color handling, etc.).
Our script arguments are supposed to be top secret messages that will be encrypted. Isn’t it ironic to ask users to type their plaintext directly in the terminal, leaving it in their command history?
A more secure solution is to use a hidden prompt. Or we could read the text from an input file, which would be much more practical for long texts. Or why not leave the choice to the user?
And let’s do the same for the output: the user could either save it to a file or print it in the terminal. This leads us to this last improved version of our Caesar script:
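A sketch of this improved version might look as follows. The option names and help texts are taken from the documentation printed by the script; the function body and the `encrypt` helper are assumptions made for this example.

```python
import click


def encrypt(text, key):
    """Caesar-shift ASCII letters by `key` (assumed helper for this sketch)."""
    shifted = []
    for char in text:
        if char.isascii() and char.isalpha():
            base = ord("a") if char.islower() else ord("A")
            char = chr((ord(char) - base + key) % 26 + base)
        shifted.append(char)
    return "".join(shifted)


@click.command()
@click.option(
    "--input_file",
    type=click.File("r"),
    help="File in which there is the text you want to encrypt/decrypt. "
    "If not provided, a prompt will allow you to type the input text.",
)
@click.option(
    "--output_file",
    type=click.File("w"),
    help="File in which the encrypted/decrypted text will be written. "
    "If not provided, the output text will just be printed.",
)
@click.option(
    "--decrypt/--encrypt",
    "-d/-e",
    default=False,
    help="Whether you want to encrypt the input text or decrypt it.",
)
@click.option(
    "--key",
    "-k",
    default=1,
    help="The numeric key to use for the caesar encryption / decryption.",
)
def caesar(input_file, output_file, decrypt, key):
    if input_file:
        text = input_file.read()
    else:
        # Hide what is typed when encrypting, since it is the secret plaintext.
        text = click.prompt("Enter a text", hide_input=not decrypt)
    if decrypt:
        key = -key
    result = encrypt(text, key)
    if output_file:
        output_file.write(result)
    else:
        click.echo(result)
```

As before, calling `caesar()` under an `if __name__ == "__main__":` guard makes this runnable as a script.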
What is new in this version?
- First, notice that I added a help parameter to each argument or option. Since the script is becoming a little more complex, this lets us add some details about its behavior to the documentation, which now looks like this:
Usage: caesar_script_v2.py [OPTIONS]
Options:
--input_file FILENAME File in which there is the text you want to encrypt/decrypt. If not provided, a prompt will allow you to type the input text.
--output_file FILENAME File in which the encrypted/decrypted text will be written. If not provided, the output text will just be printed.
-d, --decrypt / -e, --encrypt Whether you want to encrypt the input text or decrypt it.
-k, --key INTEGER The numeric key to use for the caesar encryption / decryption.
--help Show this message and exit.
- We have two new parameters, input_file and output_file, of type click.File. The library takes care of opening the files in the correct mode before entering the function, and handles the errors that can occur. For instance:
Usage: caesar_script_v2.py [OPTIONS]
Error: Invalid value for "--input_file": Could not open file: wrong_file.txt: No such file or directory
- As explained in the help text, if input_file is not provided, we use click.prompt to let the user type the text directly in a prompt, which is hidden in encryption mode. It looks like this:
Enter a text: **************
yyy.ukectc.eqo
Let’s break the cipher!
You are now a hacker: you want to decrypt a secret text encrypted with the Caesar cipher, but you don’t know the key.
A simple strategy could be to call our decryption function 25 times, with all the possible keys, and to read all the resulting texts, looking for one which makes sense.
But since you’re smart and lazy, you would prefer to automate the process. One way to select the most likely original text among these 25 candidates is to count the number of real English words in each of them. Let’s do this using the PyEnchant module:
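The word-counting strategy can be sketched as follows. To keep the example runnable without PyEnchant, the dictionary lookup is passed in as an `is_word` function and a tiny stand-in vocabulary is used; with PyEnchant installed, you would pass `enchant.Dict("en_US").check` instead. The names `count_known_words`, `break_cipher` and `is_sample_word`, as well as the `encrypt` helper, are assumptions made for this example.

```python
def encrypt(text, key):
    """Caesar-shift ASCII letters by `key` (assumed helper for this sketch)."""
    shifted = []
    for char in text:
        if char.isascii() and char.isalpha():
            base = ord("a") if char.islower() else ord("A")
            char = chr((ord(char) - base + key) % 26 + base)
        shifted.append(char)
    return "".join(shifted)


# Stand-in vocabulary so the sketch runs without PyEnchant (an assumption).
SAMPLE_WORDS = frozenset({"my", "secret", "message", "the", "is", "a"})


def is_sample_word(word):
    """Stand-in for enchant.Dict("en_US").check."""
    return word.lower() in SAMPLE_WORDS


def count_known_words(text, is_word):
    """Count the whitespace-separated words accepted by `is_word`."""
    return sum(1 for word in text.split() if is_word(word))


def break_cipher(ciphertext, is_word):
    """Try keys 1..25; return the (key, plaintext) with the most known words."""
    best_key, best_count = 1, -1
    for key in range(1, 26):
        candidate = encrypt(ciphertext, -key)
        count = count_known_words(candidate, is_word)
        if count > best_count:
            best_key, best_count = key, count
    return best_key, encrypt(ciphertext, -best_key)
```

Passing the lookup as a parameter is only a convenience for this sketch; a real script could just as well create the PyEnchant dictionary once at the top and call its `check` method directly.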
This works like a charm, but if you remember well, there is one rule of a good command line interface which is not respected:
4. A progress bar should be printed for tasks that are not instantaneous
With the example text of 10⁴ words that I used, the script takes about 5 seconds to output the decrypted text. This is to be expected, considering that for each of the 25 key values it has to check whether each of the 10⁴ words belongs to the English dictionary.
If you wanted to decrypt a text containing 10⁵ words, it would take about 50 seconds before printing any result, which could be very frustrating for a user.
That is why I recommend printing progress bars for this kind of task, especially since it is so easy to implement.
Here is the same script printing a progress bar:
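As a sketch, the key search could be wrapped in a progress bar like this. The dictionary lookup is again passed in as an `is_word` function (with PyEnchant, `enchant.Dict("en_US").check`), and the function names, the `desc` label and the import fallback are assumptions made for this example.

```python
try:
    from tqdm import tqdm
except ImportError:
    # Assumption for this sketch only: without tqdm, use a no-op wrapper.
    def tqdm(iterable, **kwargs):
        return iterable


def encrypt(text, key):
    """Caesar-shift ASCII letters by `key` (assumed helper for this sketch)."""
    shifted = []
    for char in text:
        if char.isascii() and char.isalpha():
            base = ord("a") if char.islower() else ord("A")
            char = chr((ord(char) - base + key) % 26 + base)
        shifted.append(char)
    return "".join(shifted)


def count_known_words(text, is_word):
    """Count the whitespace-separated words accepted by `is_word`."""
    return sum(1 for word in text.split() if is_word(word))


def break_cipher(ciphertext, is_word):
    """Same search as before, with the loop over keys wrapped in tqdm."""
    best_key, best_count = 1, -1
    for key in tqdm(range(1, 26), desc="Trying keys"):
        candidate = encrypt(ciphertext, -key)
        count = count_known_words(candidate, is_word)
        if count > best_count:
            best_key, best_count = key, count
    return best_key, encrypt(ciphertext, -best_key)
```

The only functional change is the `tqdm(...)` call around the iterable; tqdm writes the bar to standard error by default, so the decrypted text on standard output stays clean.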
Do you see any difference? It is not so easy to spot, because the difference consists of 4 letters: TQDM.
This is both the name of a Python library and the name of its main class, with which you wrap any iterable to print its progress.
for key in tqdm(range(26)):
And this results in a beautiful progress bar. Personally, I still find it too good to be true.
By the way, click also provides a similar utility to print progress bars (click.progressbar), but I find its appearance a bit less readable, and the code to write is less minimalist.
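For comparison, here is a small sketch of the same kind of loop using click’s progress bar, which is used as a context manager. The `try_all_keys` function, the label and the `encrypt` helper are assumptions made for this example.

```python
import click


def encrypt(text, key):
    """Caesar-shift ASCII letters by `key` (assumed helper for this sketch)."""
    shifted = []
    for char in text:
        if char.isascii() and char.isalpha():
            base = ord("a") if char.islower() else ord("A")
            char = chr((ord(char) - base + key) % 26 + base)
        shifted.append(char)
    return "".join(shifted)


def try_all_keys(ciphertext):
    """Return {key: candidate plaintext} for keys 1..25, with a progress bar."""
    candidates = {}
    with click.progressbar(range(1, 26), label="Trying keys") as keys:
        for key in keys:
            candidates[key] = encrypt(ciphertext, -key)
    return candidates
```

The extra `with` block is what makes this version slightly more verbose than wrapping the iterable in `tqdm(...)`.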
To conclude, I hope that I have convinced you to put more effort into improving the developer experience of your scripts.
If some of you have other recommendations or tips that you use in your own scripts, do not hesitate to share them in the comments!