Data Visualization and Plotting

One of the strongest features of Python is its plotting capability. From early on, a plotting library modelled on MATLAB (hence, matplotlib) has been available. Over time, the library has developed into something that may well have more features than MATLAB's plotting. At the very least, its flexibility is more visible and easily accessible, which makes editing figures easy.

Over the years, independent plotting libraries have also been developed in Python, such as Bokeh, Seaborn, ggplot, etc. A brief intro to the popular plotting libraries is here. We introduce only two of these libraries. The first part of this chapter introduces plotting standard figures using matplotlib. Readers are encouraged to check the comprehensive set of matplotlib examples, each of which comes with the source code used to plot the figure.

2021 Lecture Notes and Tutorial

Presentation Slides

Data-Ink Ratio Animation

Basic matplotlib tutorial

Advanced Plotting

Cartopy

Steps that worked for cartopy installation (Sujan):

conda create --name cartopy python=3.8
conda activate cartopy
conda install -c conda-forge cartopy
conda install ipython
ipython

[and copy the code into ipython]

Plot Styles

In terms of plot styles, matplotlib offers several popular 'looks'. In essence, it allows you to change the design and colors by inserting a command at the beginning of your script or in an interactive session.
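As a minimal sketch of that command (the style names 'ggplot' and 'default' are picked here only for illustration), the available styles can be listed and then applied either globally or for a single block of code:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in an interactive session
from matplotlib import pyplot as plt

# Names of the styles shipped with the installed matplotlib version
style_names = plt.style.available
print(style_names)

# Apply a style globally: every figure created afterwards uses it
plt.style.use("ggplot")

# Or apply a style only inside a with-block, leaving the global settings untouched
with plt.style.context("default"):
    fig = plt.figure(figsize=(3, 4))
    plt.plot([0, 1, 2], [0, 1, 4])
    plt.close(fig)
```

Calling plt.style.use at the top of a script is the simplest option; the context manager is handy when only one figure should look different.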

Plotting a simple figure

Lineplots

Read the data in the data folder using:

import xarray as xr
ds = xr.open_dataset("data_site/FR-Pue.HH.2003.nc")
ds = ds.isel(latitude=0,longitude=0)

First, a figure object can be defined. figsize is the figure size as a (width, height) tuple, in inches.

from matplotlib import pyplot as plt
plt.figure(figsize=(3,4))
plt.plot(ds['LE'])
plt.show()

There are several keyword arguments, such as color, linestyle and so on, that control the appearance of the line object. They are listed here. The line and marker styles in matplotlib are shown in the table below.

Linestyle   Symbol   Marker   Name
Solid       -        o        Circle
Dashed      --       v        Triangle_down
Dotted      :        ^        Triangle_up
                     <        Triangle_left
                     >        Triangle_right
                     s        Square
                     h        Hexagon
                     +        Plus
                     x        X
                     d        Diamond
                     p        Pentagon
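The symbols in the table can be passed either as keyword arguments or as a compact format string. The following sketch uses synthetic data (not the site data read above) to show both forms:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in an interactive session
import numpy as np
from matplotlib import pyplot as plt

x = np.linspace(0, 2 * np.pi, 20)  # synthetic data, for illustration only

fig = plt.figure(figsize=(4, 3))
# Keyword form: dashed red line with circle markers
line1, = plt.plot(x, np.sin(x), linestyle='--', marker='o', color='red', linewidth=1)
# Format-string shorthand: dotted line (:), square markers (s), black (k)
line2, = plt.plot(x, np.cos(x), ':sk')
# Markers only, without a connecting line
line3, = plt.plot(x, 0.5 * np.sin(x), linestyle='none', marker='d', color='green')
```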

Setting basic figure elements

The horizontal-axis label, vertical-axis label, and figure title can all be set easily:

plt.xlabel('time')
plt.ylabel('Latent Heat Flux',color='k',fontsize=10)
plt.title('One Figure')

Similarly, the axis limits can be set using xlim() and ylim():

plt.xlim(0,1000)
plt.ylim(0,300)

Note that the x-axis is not formatted properly as time.

So, let's try plotting the same thing again with time as the x-axis.

from matplotlib import pyplot as plt
plt.figure(figsize=(3,4))
plt.plot_date(ds['time'],ds['LE'])
plt.show()

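Try changing the options of the plot_date function, such as linestyle and marker. Note that plot_date is deprecated in recent matplotlib releases; plt.plot accepts datetime values on the x-axis directly.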
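A time axis can also be drawn with plt.plot and formatted explicitly. The sketch below is one possible approach, using a synthetic half-hourly series as a stand-in for ds['time'] and ds['LE']; the weekly tick spacing and the '%d %b' label format are arbitrary choices:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in an interactive session
import numpy as np
from matplotlib import pyplot as plt
from matplotlib import dates as mdates

# Synthetic half-hourly series covering 30 days (stand-in for the site data)
n_steps = 30 * 48
time = np.datetime64('2003-06-01T00:00') + np.arange(n_steps) * np.timedelta64(30, 'm')
le = 100 + 80 * np.sin(2 * np.pi * np.arange(n_steps) / 48)  # one cycle per day

fig, ax = plt.subplots(figsize=(5, 3))
ax.plot(time, le, linestyle='-', linewidth=0.5, color='tab:blue')
ax.xaxis.set_major_locator(mdates.DayLocator(interval=7))    # place major ticks 7 days apart
ax.xaxis.set_major_formatter(mdates.DateFormatter('%d %b'))  # e.g. "08 Jun"
fig.autofmt_xdate()                                          # rotate the date labels
ax.set_xlabel('time')
ax.set_ylabel('Latent Heat Flux')
```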

Multiple plots in a figure

Matplotlib has several methods to make subplots within a figure. Here are some quick examples of using the 'mainstream' subplots.

selVars='LE CO2 NETRAD'.split()
nrows=len(selVars) # one row per variable
ncols=1
plt.figure(figsize=(3,4))
for spI,_var in enumerate(selVars,start=1):
    dat=ds[_var]
    plt.subplot(nrows,ncols,spI) # subplot indices start at 1
    plt.plot_date(ds['time'],dat)
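An alternative to calling plt.subplot repeatedly is plt.subplots, which creates the figure and all axes in one call. The sketch below uses synthetic series in place of the three variables, so the numbers themselves carry no meaning:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in an interactive session
import numpy as np
from matplotlib import pyplot as plt

# Synthetic stand-ins for the three variables
rng = np.random.default_rng(0)
steps = np.arange(200)
series = {name: rng.normal(size=steps.size).cumsum() for name in ['LE', 'CO2', 'NETRAD']}

# One row per variable, sharing the x-axis
fig, axes = plt.subplots(nrows=len(series), ncols=1, figsize=(4, 6), sharex=True)
for ax, (name, values) in zip(axes, series.items()):
    ax.plot(steps, values, linewidth=0.8)
    ax.set_ylabel(name)
axes[-1].set_xlabel('time step')
fig.tight_layout()  # reduce overlap between the panels
```

Working with the returned axes objects avoids keeping track of subplot indices by hand.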

Adding text to a figure

One can easily add text (as strings) to a figure at any given coordinate using:

plt.text(x,y,Str,**kwargs)

plt.text(0.1,0.5,'the first text',fontsize=12,color='red',rotation=45,va='bottom')
plt.text(0.95,0.95,'the second text',fontsize=12,color='green',ha='right',transform=plt.gca().transAxes)
plt.figtext(0.5,0.5,'the third text',fontsize=12,color='blue')

The color and font size can be changed. For color, use color= with a color name such as 'red' or a hexadecimal color code such as "#0000FF". For font size, use fontsize=number (where number > 0).
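The three calls above place text in three different coordinate systems: data coordinates (the default), axes coordinates (via transform) and figure coordinates (figtext). A small sketch with made-up positions and strings:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in an interactive session
from matplotlib import pyplot as plt

fig, ax = plt.subplots(figsize=(4, 3))
ax.plot([0, 10], [0, 100])

# Data coordinates: (2, 50) refers to x=2, y=50 on the plotted axes
t_data = ax.text(2, 50, 'data coordinates', color='red', fontsize=12)

# Axes coordinates: (0, 0) is the lower-left and (1, 1) the upper-right corner of the axes
t_axes = ax.text(0.95, 0.95, 'axes coordinates', color='#0000FF', fontsize=8,
                 ha='right', va='top', transform=ax.transAxes)

# Figure coordinates: fractions of the whole figure
t_fig = fig.text(0.5, 0.01, 'figure coordinates', ha='center', fontsize=8)
```

Axes coordinates are convenient for labels that should stay in the same corner regardless of the axis limits.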

Also, grid lines can be turned on by using

plt.grid(which='major',axis='x',ls=':',lw=0.5)

To set the y-axis scale to logarithmic:

plt.yscale('log')

Doing Temporal Aggregation

ds_d=ds.groupby('time.month').mean(dim='time')

selVars='LE CO2 NETRAD'.split()
nrows=2
ncols=2
plt.figure(figsize=(3,4))
for _var in selVars:
    spI=selVars.index(_var)+1
    dat=ds_d[_var]
    plt.subplot(nrows,ncols,spI)
    plt.plot(ds_d['month'],dat)
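The groupby call above averages all time steps that fall in the same calendar month, which gives a mean seasonal cycle with 12 values. A sketch on a synthetic half-hourly dataset (the dataset and variable values are invented for illustration) shows this next to a daily aggregation with resample:

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic half-hourly data for one year, standing in for the site file
time = pd.date_range('2003-01-01', '2003-12-31 23:30', freq='30min')
rng = np.random.default_rng(0)
ds_syn = xr.Dataset({'LE': ('time', rng.normal(100, 20, time.size))},
                    coords={'time': time})

# Mean per calendar month: the 'time' dimension is replaced by 'month' (1..12)
ds_monthly = ds_syn.groupby('time.month').mean(dim='time')

# Mean per day: the result keeps a 'time' dimension with one value per day
ds_daily = ds_syn.resample(time='1D').mean()

print(ds_monthly['LE'].sizes, ds_daily['LE'].sizes)
```

groupby collapses all years into one value per month, whereas resample keeps the chronological order and only coarsens the time step.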

Scatter Plots

Let's read the data and import the modules first:

import numpy as np
from matplotlib import pyplot as plt
dat1=ds['LE']
dat2=ds['H']
dat3=ds['NETRAD']

Once the data is read, we can open a figure object and start adding things to it.

plt.figure(figsize=(3,4))
plt.scatter(dat1,dat2,color='blue',edgecolor=None)
plt.scatter(dat1,dat3,marker='d',color='red',alpha=0.4,linewidth=0.7)
plt.xlabel(r'LE ($MJ\ m^{-2}\ d^{-1}$)') # raw string keeps the LaTeX backslashes intact
plt.ylabel(r'Rn or H ($\frac{MJ}{m^{2}d}$)',color='k',fontsize=10)
plt.grid(which='major',axis='both',ls=':',lw=0.5)
plt.title('A scatter')

scatter uses slightly different names for colors. The color of the marker face and of the line around it can be set separately using facecolor and edgecolor, respectively. It also allows changing the transparency using the alpha argument. Note that the width of the line around the markers is set by linewidths (linewidth is also accepted, as in the second scatter call above).

plt.legend(('H','Rn'),loc='best') # labels follow the order of the scatter calls: H first, then Rn
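The face color, edge color and edge width of scatter markers can be illustrated with a short sketch on random data. It also passes label= to each scatter call, which is one way to keep legend entries tied to the right data set:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in an interactive session
import numpy as np
from matplotlib import pyplot as plt

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)          # synthetic data, for illustration only
y = x + rng.normal(0, 1, 50)

fig, ax = plt.subplots(figsize=(4, 3))
# Hollow markers: no face color, blue edge, edge width 0.7
sc_hollow = ax.scatter(x, y, s=30, facecolors='none', edgecolors='blue',
                       linewidths=0.7, label='hollow')
# Filled, semi-transparent diamonds with a thin black edge
sc_filled = ax.scatter(x, y + 3, s=30, c='red', marker='d', edgecolors='k',
                       linewidths=0.3, alpha=0.4, label='filled')
leg = ax.legend(loc='best')  # labels are taken from the label= arguments
```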

Playing with the Elements

Until now, it's been a dull and standard plotting library. A figure comprises several objects, which can be obtained through various methods and then modified. This makes customization of a figure extremely fun. Here are some examples of what can be done.

  • The Ugly lines: The boxes around figures are stored as spines, held in a dictionary-like object keyed by position ('top', 'bottom', 'left', 'right'), each with its own properties. In the rem_axLine function of plotTools, you can see that the linewidth of some of the spines has been set to zero.
import plotTools as pt
pt.rem_axLine()
  • Getting the limits of the axis from the figure: use the gca() method of pyplot to get the current axes, then read its x and y limits.
ymin,ymax=plt.gca().get_ylim()
xmin,xmax=plt.gca().get_xlim()
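  • Let's draw that 1:1 line. It runs between two points whose x and y values are equal, so the smallest lower limit and the largest upper limit are used for both coordinates.
lo,hi=min(xmin,ymin),max(xmax,ymax)
plt.plot([lo,hi],[lo,hi],color='k',lw=0.1,zorder=0)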
  • A legendary legend: Here is an example of how flexible a legend object can be. It has a tonne of options and methods, and getting it right sometimes becomes a matter of manual calibration.
leg=plt.legend(('Runoff','ET'),loc=(0.05,0.914),markerscale=0.5,scatterpoints=4,ncol=2,fancybox=True,handlelength=3.5,handletextpad=0.8,borderpad=0.1,labelspacing=0.1,columnspacing=0.25)
leg.get_frame().set_linewidth(0)
leg.get_frame().set_facecolor('firebrick')
leg.legendPatch.set_alpha(0.25)
texts = leg.get_texts()
cc=['blue','red'] # one color per legend entry (assumed values)
for tI,t in enumerate(texts):
    t.set_color(cc[tI])
plt.setp(texts,fontsize=7.83) # set the font size of all legend texts at once
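The spine, axis-limit and 1:1 line steps above can be sketched without the plotTools helper. The example below is only one way to do it, using random data and set_visible instead of a zero linewidth:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in an interactive session
import numpy as np
from matplotlib import pyplot as plt

rng = np.random.default_rng(1)
x = rng.uniform(0, 10, 100)         # synthetic data, for illustration only
y = x + rng.normal(0, 1, 100)

fig, ax = plt.subplots(figsize=(4, 4))
ax.scatter(x, y, s=10)

# Hide the top and right spines
for side in ('top', 'right'):
    ax.spines[side].set_visible(False)

# Read the current limits and build a common range for both axes
xmin, xmax = ax.get_xlim()
ymin, ymax = ax.get_ylim()
lo, hi = min(xmin, ymin), max(xmax, ymax)

# 1:1 line from (lo, lo) to (hi, hi), drawn behind the points
ax.plot([lo, hi], [lo, hi], color='k', lw=0.5, zorder=0)
ax.set_xlim(lo, hi)
ax.set_ylim(lo, hi)
```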

Map Map Map!

This section explains the procedure to draw a map using basemap and matplotlib.

Global Data

Let's read the data that we will use to make the map. The data is stored as big-endian plain binary. It consists of float32 values and has an unknown number of time steps, at a spatial resolution of 1$^\circ$.

from matplotlib import pyplot as plt
from mpl_toolkits.basemap import Basemap

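def rem_axLine(rem_list=('top','right'),axlw=0.4):
    """Hide the spines named in rem_list and set the width of the others to axlw."""
    ax=plt.gca()
    for loc, spine in ax.spines.items():
        if loc in rem_list:
            spine.set_position(('outward',0)) # keep the spine at the axes edge (0 points outward)
            spine.set_linewidth(0.) # zero width hides the spine
        else:
            spine.set_linewidth(axlw)
    return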

import xarray as xr

ds_global = xr.open_mfdataset('data_global/*.nc')

fig=plt.figure(figsize=(9,7))
ax1=plt.axes([0.05,0.5,0.7,0.3])
rem_axLine(rem_list=['top','right','left','bottom'])
_map=Basemap( projection ='cyl', llcrnrlon = -180, urcrnrlon = 180,llcrnrlat = -90, urcrnrlat = 90, resolution = 'c')

_map.imshow(ds_global['LE'].mean(axis=0),cmap=plt.cm.jet,interpolation='none',origin='upper')
plt.colorbar(orientation='vertical',shrink=0.5)

ax2=plt.axes([0.05,0.05,0.7,0.3])
ax2.plot(ds_global['LE'].sel(latitude=0,longitude=20,method='nearest'))
rem_axLine()

Once the data is read, a map object should first be created using the basemap module.

from mpl_toolkits.basemap import Basemap
_map=Basemap( projection ='cyl', llcrnrlon = lonmin, urcrnrlon = lonmax,llcrnrlat = latmin, urcrnrlat = latmax, resolution = 'c')
  • Set the projection and resolution of the background map:

    • resolution: specifies the resolution of the map. c, l, i, h, f, or None can be used.
      • c: crude
      • l: low
      • i: intermediate
      • h: high
      • f: full
  • The longitude and latitude for lower left corner and upper right corner can be specified as,

    • llcrnrlon: Longitude of Lower Left hand CoRNeR of the desired map.

    • llcrnrlat: Latitude of Lower Left hand CoRNeR of the desired map.

    • urcrnrlon: Longitude of Upper Right hand CoRNeR of the desired map.

    • urcrnrlat: Latitude of Upper Right hand CoRNeR of the desired map.

    • In the current case, the latitude and longitude of the lower left and upper right corners of the map are set to the following values:

      • latmin=-90
      • lonmin=-180
      • latmax=90
      • lonmax=180
  • To draw coastlines, country boundaries and rivers:

_map.drawcoastlines(color='k',linewidth=0.8)

draws coastlines in black with a linewidth of 0.8.

_map.drawcountries(color='brown', linewidth=0.3)

draws country boundaries.

_map.drawrivers(color='navy', linewidth=0.3)

draws major rivers.

  • To add longitude and latitude labels:
latint=30
lonint=30
parallels =np.arange(latmin+latint,latmax,latint)
_map.drawparallels(parallels,labels=[1,1,0,0],dashes=[1,3],linewidth=.5,color='gray',fontsize=3.33,xoffset=13)
meridians = np.arange(lonmin+lonint,lonmax,lonint)
_map.drawmeridians(meridians,labels=[1,1,1,0],dashes=[1,3],linewidth=.5,color='gray',fontsize=3.33,yoffset=13)
  • arange: Defines an array of latitudes (parallels) and longitudes (meridians) to be plotted over the map. Because arange starts at latmin+latint and excludes the stop value, the parallels in the above example are drawn from 60°S to 60°N every 30°, and the meridians from 150°W to 150°E every 30°.

  • color: Color of parallels (meridians).

  • linewidth: Width of parallels (meridians). If you want to draw only the axis labels and not the parallels (meridians) on the map, linewidth should be 0.

  • labels: List of 4 values (default [0,0,0,0]) that control whether parallels are labelled where they intersect the left, right, top or bottom of the plot. For example, labels=[1,0,0,1] will cause parallels to be labelled where they intersect the left and bottom of the plot, but not the right and top.

  • xoffset: Distance of latitude labels from the vertical axis. yoffset: Distance of longitude labels from the horizontal axis.
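Since np.arange excludes its stop value, it is worth checking which parallels and meridians the arrays actually contain. The sketch below reuses the limits and intervals from this section; the variant that includes the poles is an addition for comparison:

```python
import numpy as np

latmin, latmax = -90, 90
lonmin, lonmax = -180, 180
latint = 30
lonint = 30

# Same construction as in the map example: the outermost lines are left out
parallels = np.arange(latmin + latint, latmax, latint)
meridians = np.arange(lonmin + lonint, lonmax, lonint)
print(parallels)  # [-60 -30   0  30  60]
print(meridians[0], meridians[-1], meridians.size)  # -150 150 11

# Variant that also includes the poles
parallels_with_poles = np.arange(latmin, latmax + latint, latint)
print(parallels_with_poles)  # [-90 -60 -30   0  30  60  90]
```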

In the example program, the lines and ticks around the map are also removed by

import plotTools as pt
pt.rem_axLine(['right','bottom','left','top'])
pt.rem_ticks()

Now the data are plotted over the map object as:

import numpy as np
from matplotlib import pyplot as plt
fig=plt.figure(figsize=(9,7))
ax1=plt.subplot(211)
_map.imshow(ds_global['LE'].mean(axis=0),cmap=plt.cm.jet,interpolation='none',origin='upper',vmin=0,vmax=200)
plt.colorbar(orientation='vertical',shrink=0.5)

ax2=plt.axes([0.18,0.1,0.45,0.4])
# data is assumed to be a 3-D (time, lat, lon) array of the plotted variable
# global mean per time step, ignoring negative values
data_gm=np.array([np.ma.masked_less(_data,0).mean() for _data in data])
plt.plot(data_gm)
# mean seasonal cycle; assumes monthly data covering whole years
data_gm_msc=data_gm.reshape(-1,12).mean(0)
pt.rem_axLine()
ax3=plt.axes([0.72,0.1,0.13,0.4])
plt.plot(data_gm_msc)
pt.rem_axLine()
plt.show()

A subplot can be combined with axes in a figure. In this case, the global mean of runoff and its mean seasonal cycle are plotted in axes ax2 and ax3, respectively.

Customizing a Colorbar

  • To specify the orientation of the colorbar:
plt.colorbar()

The default orientation is a vertical colorbar on the right side of the main plot.

plt.colorbar(orientation='horizontal')

makes a horizontal colorbar below the main plot.

  • To specify the fraction of the total plot area occupied by the colorbar:
plt.colorbar()

The default fraction is 0.15 (see Fig. [f6-3-a]).

plt.colorbar(fraction=0.5)

50% of the plot area is used by the colorbar (see Fig. [f6-3-b]).

  • To specify the ratio of length to width of the colorbar:
plt.colorbar(aspect=20)

length:width = 20:1.
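The colorbar options above can be combined in a single call. The following sketch uses a random field and arbitrary option values; shrink and pad are further colorbar keywords, included here for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in an interactive session
import numpy as np
from matplotlib import pyplot as plt

rng = np.random.default_rng(0)
field = rng.random((18, 36))  # synthetic 2-D field, for illustration only

fig, ax = plt.subplots(figsize=(5, 4))
im = ax.imshow(field, cmap='viridis', vmin=0, vmax=1)

# Horizontal colorbar using 10% of the axes area, with length:width = 30:1
cbar = fig.colorbar(im, ax=ax, orientation='horizontal',
                    fraction=0.1, aspect=30, shrink=0.8, pad=0.1)
cbar.set_label('value (arbitrary units)')
```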

Various other colormaps are available in matplotlib. Fig. [f6-5] shows some commonly used colormaps and their names. More details of the options for colorbar can be found here.

Fig. [f6-5]: Some commonly used colormaps

For a list of all the colormaps available in matplotlib, click here.
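The registered colormap names can also be queried from within Python. A minimal sketch, assuming a reasonably recent matplotlib installation (the exact list depends on the installed version):

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; drop this line in an interactive session
from matplotlib import pyplot as plt

# Names of all registered colormaps; reversed versions end in "_r"
cmap_names = plt.colormaps()
print(len(cmap_names))
print([name for name in cmap_names if not name.endswith('_r')][:10])
```

Any of these names can be passed as the cmap argument, for example cmap='viridis'.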