Plotting with Matplotlib

For a long time, Gnuplot has been my tool of choice for plotting diagrams. The diagrams it produces out of the box look very scientific, but it takes a lot of tweaking to produce something that's visually pleasing. I got used to Gnuplot's weird ways but I was never entirely happy with it. For example, I found it quite annoying that there's no easy way to plot a simple histogram. The only way to do this is by beating a bar chart into submission (and counting the buckets yourself in a script). Not entirely my idea of fun.

With help from a colleague I managed to solve the histogram use case with R, but I'm not really keen on learning yet another language just for a few diagrams. Fortunately, I recently discovered matplotlib, a flexible, high-quality plotting package written in Python. It has an interactive mode that you can use from the interactive Python interpreter and works nicely with NumPy. NumPy is a powerful scientific computing package that warrants a closer look, too.

Matplotlib is packaged for Ubuntu (and probably other Debian-based systems, too). All you have to do is install the python-matplotlib package (named python3-matplotlib on current releases).

With matplotlib, plotting a histogram is as simple as this:

import numpy
import matplotlib.pyplot as plot

# 10000 samples drawn from the standard normal distribution
a = numpy.random.randn(10000)
plot.hist(a, bins=20)
plot.show()

We draw 10000 normally distributed random numbers and sort them into 20 bins. The show() function opens a new window that displays the histogram. The example uses a NumPy array (a Python list would have worked, too). NumPy arrays work much like standard Python lists, but they also support typical matrix operations such as transposition and scalar multiplication. You can use numpy.loadtxt() to load a NumPy array from a file. The loadtxt() function is quite powerful: it supports comments, configurable column delimiters, and gzip/bzip2 decompression.
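To make the loadtxt() description a bit more concrete, here is a minimal sketch. The sample data is made up and is read from an in-memory io.StringIO buffer rather than a real file; the '#' comment marker and ',' delimiter are just example choices.

```python
import io
import numpy

# Made-up sample data; a file name (including .gz or .bz2) could be passed instead.
text = io.StringIO(
    "# time,value\n"
    "0.0,1.5\n"
    "1.0,2.5\n"
    "2.0,4.0\n"
)

# Lines starting with '#' are skipped; columns are split on ','.
data = numpy.loadtxt(text, comments="#", delimiter=",")

print(data.shape)      # three rows, two columns
print(data.T.shape)    # transposition swaps rows and columns
print(data[:, 1] * 2)  # scalar multiplication applies to every element
```

A plain Python list would need a loop or a list comprehension for the last two operations.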

The following example customizes the bins (from -4 to 4 with steps of 0.5) and labels. It also saves the diagram in an SVG and a PNG file:

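import numpy
import matplotlib.pyplot as plot

a = numpy.random.randn(10000)
# arange() excludes its stop value, so use 4.5 to get a last bin edge at 4
plot.hist(a, bins=numpy.arange(-4, 4.5, 0.5))

plot.xlabel('X Axis')
plot.ylabel('Y Axis')
plot.title('Diagram Title')

# The output format follows the file extension (file names are placeholders)
plot.savefig('histogram.svg')
plot.savefig('histogram.png')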
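One detail worth checking when you pass explicit bin edges: numpy.arange() excludes its stop value, so a range written as arange(-4, 4, 0.5) ends at 3.5. The following sketch compares two ranges and uses numpy.histogram() to count the samples per bin without drawing anything. The variable names are only for illustration.

```python
import numpy

edges_open = numpy.arange(-4, 4, 0.5)      # last edge is 3.5
edges_closed = numpy.arange(-4, 4.5, 0.5)  # last edge is 4.0

print(edges_open[-1], edges_closed[-1])

a = numpy.random.randn(10000)

# numpy.histogram() returns the per-bin counts and the edges it used.
counts, edges = numpy.histogram(a, bins=edges_closed)

# N edges define N - 1 bins; samples outside the edges are not counted.
print(len(edges), len(counts), counts.sum())
```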


There's a lot more you could customize, and there are many different diagram types. The matplotlib gallery page gives a good overview of what's possible.
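As a small taste of other diagram types, here is a sketch that puts a line plot and a scatter plot side by side. The data, titles, and styling are arbitrary, and the Agg backend is selected only so that the example also runs on a machine without a display.

```python
import io
import matplotlib
matplotlib.use("Agg")  # render off-screen; remove to use an interactive backend
import matplotlib.pyplot as plot
import numpy

x = numpy.linspace(0, 2 * numpy.pi, 100)

# Two diagrams side by side in one figure.
fig, (left, right) = plot.subplots(1, 2, figsize=(8, 3))

left.plot(x, numpy.sin(x), label="sin(x)")
left.plot(x, numpy.cos(x), linestyle="--", label="cos(x)")
left.set_title("Line plot")
left.legend()

right.scatter(numpy.random.randn(200), numpy.random.randn(200), s=10, alpha=0.5)
right.set_title("Scatter plot")

fig.tight_layout()

# Written to an in-memory buffer here; pass a file name to write to disk.
buffer = io.BytesIO()
fig.savefig(buffer, format="png")
```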