docopt: A brief introduction

The biggest advancement in argument parsing in the last 40 years
2015-01-19

There are two ways to start developing a new command-line application: by looking up the documentation for an argument parser or by copy-pasting one you made before. The parsing libraries are fairly clever these days. After all, terminal apps have been around for over 40 years. The code may not always be elegant, but it does the job well. It goes without saying that these libraries also generate a useful help message. But hang on, if the pretty help message can be generated from the not-so-lovely parser code, couldn’t it be done the other way around? Well, YES, it could!

docopt is a simple declarative language for describing command-line interfaces. It’s based on the notation that is common in usage messages to explain the subcommands, options and arguments of programs. Most importantly though, docopt is a brilliant idea. It cuts down the boilerplate to the very minimum, so you can move on to the reason why you were making the app in the first place.

Pioneered by Vladimir Keleshev in 2012 for Python, the idea caught on rather quickly and contributors started making docopt available for other programming languages as well. There are now implementations in many languages, including C, Go, Java, Ruby, CoffeeScript, Rust, and PHP. See the complete list of all the officially supported ports on GitHub.

An implementation of the docopt language will read your spec and generate an argument parser based on it. The parser will then process the command line and return a dictionary with the results. Scroll down a bit for the examples.

The language

The language itself is a formalised version of the common usage messages as you may know them from existing applications. Despite its simplicity, the syntax is very powerful, with all the features you might expect from a modern argument parser. It supports subcommands, options and positional arguments. Any of these can be made either optional or mandatory, and also mutually exclusive with other options or arguments. Option arguments can have default values assigned. It is also possible to have a variable number of positional arguments, e.g., zero or more of them. Here’s an example of a program description using docopt:

Build and release software packages.

Usage:
  pkg add <repository> [--branch=<b>]
  pkg build <name>... [--force] [--env=<env>]
  pkg push <name> [<suite>] [--force]
  pkg clean (builds|cache)
  pkg -h | --help
  pkg --version

Options:
  -h, --help        Print this.
  -b, --branch=<b>  Git branch name [default: master].
  -f, --force       Perform the operation despite warnings.
  -e, --env=<e>     Select build environment [default: wheezy].

The syntax is almost identical to the help descriptions generated by Ruby’s OptionParser or Python’s ArgumentParser.

Parsing the command line

Now that the description is ready, how do we get from that to actually processing the arguments of our program? You’ll be pleased to hear that this step is even simpler. The API consists of a single function that takes the description, reads the command-line arguments, and returns a dictionary with all the options and arguments. It may differ slightly between implementations though.
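
To give a feel for what working with that dictionary looks like, here is a minimal Python sketch based on the pkg description above. It does not import docopt: the parsed dictionary is a hand-written stand-in for what the parser would be expected to return for `pkg build foo bar --force`, and describe() is a hypothetical helper invented for this illustration. The key naming follows the docopt convention: bare words for commands, angle brackets for positional arguments, and dashes for options.

```python
# Hand-written stand-in (an assumption, not real parser output) for the
# dictionary docopt would return for: pkg build foo bar --force
parsed = {
    "add": False, "build": True, "push": False, "clean": False,
    "builds": False, "cache": False,
    "<repository>": None, "<name>": ["foo", "bar"], "<suite>": None,
    "--branch": "master", "--env": "wheezy", "--force": True,
    "--help": False, "--version": False,
}


def describe(args):
    """Summarise the selected subcommand (hypothetical helper)."""
    if args["add"]:
        return "add {} (branch {})".format(args["<repository>"], args["--branch"])
    if args["build"]:
        # <name>... repeats in the usage text, so its value is a list.
        names = ", ".join(args["<name>"])
        forced = " (forced)" if args["--force"] else ""
        return "build {} for {}{}".format(names, args["--env"], forced)
    if args["push"]:
        return "push {}".format(", ".join(args["<name>"]))
    if args["clean"]:
        # (builds|cache) is mutually exclusive, so exactly one of them is True.
        return "clean " + ("builds" if args["builds"] else "cache")
    raise ValueError("no subcommand selected")


print(describe(parsed))
```

Every subcommand, argument and option from the description is present as a key, so dispatching is a matter of plain dictionary lookups.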

I’ve tried the Python, Ruby and CoffeeScript/JS versions and they worked really well. Using the same options description in all of them wasn’t a problem at all. A few pointers and an example of how it works in each of the languages follow.

Python

docopt is available as a package in the Python Package Index, so you can install it using the following command:

sudo pip install docopt

PEP 257 recommends that the docstring of a script be usable as its ‘usage’ message. We’ll put the docopt spec there, right at the beginning of the script, and refer to it through the special variable __doc__. The docopt function will read sys.argv and parse it. If we pass the version of our program to the function as well, it will handle the --version option automatically.

#!/usr/bin/env python

"""
Example program.

Usage:
  example command [<cmd_arg>]...
  example [-br] -p=<opt_arg> <argument>
  example -h | --help
  example --version

 Options:
   -h, --help       Show this message.
   -b, --beer       Drink beer.
   -r, --rock       Play AC/DC.
   -p, --pub=<p>    Which pub.
   --version        Print the version.
"""

from docopt import docopt
from pprint import pprint

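if __name__ == '__main__':
    # Parse sys.argv against the usage text in the module docstring.
    # 'FIXME' is a placeholder; pass the real version string of your program.
    arguments = docopt(__doc__, version='FIXME')
    pprint(arguments)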
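
For comparison, here is a rough sketch of what the second usage pattern, `example [-br] -p=<opt_arg> <argument>`, might look like when built by hand with argparse from the standard library. This is an approximation for illustration only: it leaves out the `command` subcommand, and the version string and sample arguments are made up.

```python
import argparse


def build_parser():
    """Hand-built approximation of the second usage pattern (sketch only)."""
    parser = argparse.ArgumentParser(prog="example", description="Example program.")
    # -h/--help is added by argparse automatically.
    parser.add_argument("--version", action="version", version="%(prog)s 1.2.3")
    parser.add_argument("-b", "--beer", action="store_true", help="Drink beer.")
    parser.add_argument("-r", "--rock", action="store_true", help="Play AC/DC.")
    parser.add_argument("-p", "--pub", required=True, metavar="<p>", help="Which pub.")
    parser.add_argument("argument", metavar="<argument>")
    return parser


# Parse an explicit list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["-b", "-p", "Fox", "tonight"])
print(vars(args))
```

Each line of the Options section becomes one add_argument() call, and the usage text is then regenerated from those calls. With docopt the usage text is the single source of truth.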

Ruby

Likewise for Ruby, there is a docopt gem. Use the following command to install it:

sudo gem install docopt

The interface in Ruby is much the same as in Python.

#!/usr/bin/env ruby

require 'docopt'
require 'pp'

doc = <<DOCOPT
Example program.

Usage:
  example command [<cmd_arg>]...
  example [-br] -p=<opt_arg> <argument>
  example -h | --help
  example --version

 Options:
   -h, --help       Show this message.
   -b, --beer       Drink beer.
   -r, --rock       Play AC/DC.
   -p, --pub=<p>    Which pub.
   --version        Print the version.
DOCOPT

begin
  pp Docopt::docopt(doc, version: '1.2.3')
rescue Docopt::Exit => e
  puts e.message
end

CoffeeScript

You can use npm to get docopt for Coffee/JS. The following command will install it globally on your system:

sudo npm install -g docopt

And again, the interface is almost identical to the one in Python and Ruby.

doc = """
Example program.

Usage:
  example command [<cmd_arg>]...
  example [-br] -p=<opt_arg> <argument>
  example -h | --help
  example --version

 Options:
   -h, --help       Show this message.
   -b, --beer       Drink beer.
   -r, --rock       Play AC/DC.
   -p, --pub=<p>    Which pub.
   --version        Print the version.
"""
{docopt} = require 'docopt'

console.log docopt(doc, version: '1.2.3')

Summary

docopt is a delightful way of describing command-line interfaces. It means that you can skip the ritual copy-pasting and documentation searching needed to put together the option parser before you get to the interesting bit. Was that add_argument() or addArgument()? Who remembers these things?

docopt is free, open-source, and available for many languages. Check it out on GitHub.