## Plotting Time Series Data with Matplotlib

It’s been a while since my last article on Matplotlib. Today we’re going to plot time series data for visualizing web page impressions, stock prices and the like over time.

If you haven’t already, install Matplotlib (package `python-matplotlib` on Debian-based systems) and fire up a Python interpreter. For the rest of this article, we’ll need the following imports:

```
>>> import numpy as np
>>> import matplotlib.pyplot as plt
>>> import matplotlib.dates as mdates
```

Usually, when plotting a diagram, the process is something like this: Create two arrays of the same length, one for the x axis and one for the y axis. Plotting time series data works the same way, but the data points on one axis (usually the x axis) are times or dates.

To get us started quickly, I have prepared sample data to play with:

```
2012-01-23    147
2012-01-24    157
2012-01-25    156
...
2012-03-09    184
```

The first column is a date in ISO format and the second column is the number of page impressions on that particular day. To work with this data, we read it from file creating two one-dimensional arrays `days` and `impressions` (we would get one two-dimensional array if it weren’t for the `unpack` parameter):

```
>>> days, impressions = np.loadtxt("page-impressions.csv", unpack=True,
...         converters={0: mdates.strpdate2num('%Y-%m-%d')})
```

What’s interesting here is the `converters` parameter. The `loadtxt()` function expects floating point data, so we have to register a converter that turns the date strings in column 0 into floating point numbers. Matplotlib represents dates and times as floats counting days from a fixed epoch (January 1st, year 0001 in the version used here; Matplotlib 3.3 and later default to 1970-01-01, so the values you see may differ), so this is no problem for us. The `mdates.strpdate2num()` function is a factory function that returns a converter for the specified format. The format string uses the same conversion directives as `strftime()`.
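On newer setups this exact call may not work: `strpdate2num()` was deprecated and later removed from Matplotlib, and under Python 3 some NumPy versions pass bytes rather than strings to converters. Here is a minimal sketch of a replacement, assuming a hand-written converter (the name `iso_date_to_num` is made up for this example) built on `datetime.strptime()` and `mdates.date2num()`. It reads three inline rows instead of the real file so that it runs on its own:

```python
import io
from datetime import datetime

import numpy as np
import matplotlib.dates as mdates


def iso_date_to_num(text):
    """Convert a 'YYYY-MM-DD' field to a Matplotlib date number.

    Hypothetical helper for this sketch, not part of Matplotlib.
    """
    # Depending on the NumPy version, converters receive bytes or str.
    if isinstance(text, bytes):
        text = text.decode("ascii")
    return mdates.date2num(datetime.strptime(text.strip(), "%Y-%m-%d"))


# Three rows in the same layout as the sample file, kept inline so the
# sketch does not depend on page-impressions.csv being present.
sample = io.StringIO(
    "2012-01-23    147\n"
    "2012-01-24    157\n"
    "2012-01-25    156\n"
)

days, impressions = np.loadtxt(sample, unpack=True,
                               converters={0: iso_date_to_num})

# Convert the first float back to a date to check the round trip.
print(mdates.num2date(days[0]).date(), impressions[0])
```

To read the actual file, pass `"page-impressions.csv"` in place of `sample`.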

Let’s have a look at the result:

```
>>> days[0:2]
array([ 734525.,  734526.])
```

The first array element represents 2012-01-23, the second 2012-01-24, and so on. We could easily convert those numbers back to dates using `mdates.num2date()` if we wanted to. In fact, this is what we’ll need later to label our x axis.
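The numbers in the output above happen to match Python's own proleptic Gregorian ordinals, so the mapping can be checked with the standard library alone. This is only a sketch of the arithmetic for the year-0001 convention shown here. It is not how Matplotlib is implemented, and with a different epoch setting `mdates.num2date()` is the function to rely on:

```python
from datetime import date

# First value of `days` in the output above (year-0001 convention).
day_number = 734525

# Ordinal 1 is 0001-01-01, so the ordinal maps directly to a calendar date.
print(date.fromordinal(day_number))    # 2012-01-23

# And the other direction, for the second row of the sample data.
print(date(2012, 1, 24).toordinal())   # 734526
```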

Now let’s plot the data using Matplotlib’s plot_date() function. We use `days` as x values and `impressions` as y values and don’t touch the default settings:

```
>>> plt.plot_date(x=days, y=impressions)
>>> plt.show()
```

The diagram isn’t really impressive, but note how Matplotlib automatically scales the axes and adds date labels to the x axis, converting the floating point numbers back to strings.
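If the automatic labels do not suit your data, the locator and formatter classes in `matplotlib.dates` give explicit control over where ticks go and how they are written. Here is a small sketch, assuming only the four rows visible in the sample data above. It passes `datetime.date` objects straight to `plot()`, which Matplotlib also accepts:

```python
from datetime import date

import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Only the four rows shown in the sample listing; the rest of the
# file is elided in the article and not reproduced here.
days = [date(2012, 1, 23), date(2012, 1, 24),
        date(2012, 1, 25), date(2012, 3, 9)]
impressions = [147, 157, 156, 184]

fig, ax = plt.subplots()
ax.plot(days, impressions, "r-")
ax.set_ylabel("Page impressions")

# Let Matplotlib choose the tick positions, but spell out the label format.
ax.xaxis.set_major_locator(mdates.AutoDateLocator())
ax.xaxis.set_major_formatter(mdates.DateFormatter("%Y-%m-%d"))

# Rotate and right-align the date labels so they do not overlap.
fig.autofmt_xdate()
```

Calling `plt.show()` afterwards displays the figure.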

To make the diagram easier to read, we’ll change the blue dots to a red line ("r-"), add some text and a grid:

```
>>> plt.plot_date(x=days, y=impressions, fmt="r-")
>>> plt.title("Page impressions on example.com")
>>> plt.ylabel("Page impressions")
>>> plt.grid(True)
>>> plt.show()
```

That’s better. In a future article, we’ll use a bar chart that looks a lot better for small data sets. Until then, here’s the complete script for easy copy and pasting:

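```
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# Read dates (column 0) and page impressions (column 1) into two arrays.
days, impressions = np.loadtxt("page-impressions.csv", unpack=True,
        converters={0: mdates.strpdate2num('%Y-%m-%d')})

# Plot the impressions over time as a red line.
plt.plot_date(x=days, y=impressions, fmt="r-")
plt.title("Page impressions on example.com")
plt.ylabel("Page impressions")
plt.grid(True)
plt.show()
```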

### 10 Responses to Plotting Time Series Data with Matplotlib

Thanks for the writeup and info. I had some real issues, but they turned out to be mostly date and CSV formatting problems when exporting from Excel.

I had previously been pulling data out of a larger set (50+ column spreadsheet) and trying to plot values against dates – but could not get them plotted correctly.

I don’t have much experience with the “converter” terminology in the np.loadtxt command, and I’ll look it up when I have some time, but does the ‘0: ’ section mean that it is converting (or trying to convert) all elements of the text file?

Thanks!

2. mafr says:

Glad you like it! The “0: func” part only converts the first column, loadtxt()’s column numbering is zero-based.

3. George says:

Fabulous little tutorial, really easy to understand.

4. Geoff says:

Is “strptime2num” a python 3 function ? I (using google) can’t find any mention of it anywhere ???

• Geoff says:

Ooops, my own error. I should have been looking for “strpdate2num”….

5. Casey says:

Thanks for the tutorial. It is the first I have seen to plot dates on a time series plot rather than numbers. Trying to apply it to my own needs, I have trouble getting a .csv file to format like yours did and also don’t have any experience with the “converter” terminology. Is there a way to do this with two data columns; one with date information and one with some secondary info or do they have to be combined into one column? When I tried, it viewed the second column entry and told me “unconverted data remains:” rather than assigning it to the second variable.

• Matthias says:

I’m not entirely sure I understand, but you can pass the `usecols` parameter to `loadtxt()` to read only the columns you need from your file: `np.loadtxt("filename", unpack=True, converters={0: mdates.strpdate2num('%Y-%m-%d')}, usecols=(0, 1))`.

This would select the first and second columns (0 and 1) with column 0 being a timestamp that is converted.

6. RL says:

How about an animated plot in a subplot? I managed to draw a ‘single’ plot with a real-time graph update, but subplots are just eluding me. Say you get quotes off the web every minute and then plot the stock prices in one subplot and the RSI in another one just below it. (Newbie to both Python and Matplotlib. Started about a week ago.)

Regards

7. Alessandro says:

Useful post, thank you!

However, when using Python3.X it does not work (probably due to some issues related to the new handling of strings in Python3.X).
I tested the script with some modifications, inspired by the following post and it works!