http://xarray.pydata.org/en/stable/_static/dataset-diagram-logo.png

Xarray 3: Datasets and Plotting


Overview

  1. The Xarray Dataset

  2. Open and select a variable from the ERA-5 dataset

  3. Convert geopotential to height

  4. Create a contour plot

Prerequisites

Concepts

Importance

Notes

Xarray Lessons 1-2

Necessary

  • Time to learn: 15 minutes

Imports

import xarray as xr
import numpy as np
from metpy.units import units
import metpy.calc as mpcalc
import cartopy.crs as ccrs
import cartopy.feature as cfeature
import matplotlib.pyplot as plt

Work with an Xarray Dataset

An Xarray Dataset consists of one or more DataArrays. Let’s open the same NetCDF file as in the first Xarray notebook, but as a Dataset.

ds = xr.open_dataset('/spare11/atm533/data/2012103000_z500_era5.nc')

Let’s see what this Dataset looks like.

ds
<xarray.Dataset>
Dimensions:    (longitude: 1440, latitude: 721, time: 1)
Coordinates:
  * longitude  (longitude) float32 0.0 0.25 0.5 0.75 ... 359.0 359.2 359.5 359.8
  * latitude   (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
  * time       (time) datetime64[ns] 2012-10-30
Data variables:
    z          (time, latitude, longitude) float32 ...
Attributes:
    Conventions:  CF-1.6
    history:      2020-09-29 00:40:14 GMT by grib_to_netcdf-2.16.0: /opt/ecmw...

It’s represented somewhat similarly to a DataArray, but now we have support for multiple data variables. In this case,

  1. There is just one data variable: z (Geopotential)

  2. The coordinate variables are longitude, latitude, and time

  3. The data variable has three dimensions (time x latitude x longitude).

  4. Although time is a dimension, there is only one time value on that coordinate.

  5. The Dataset itself has attributes. The attributes of the coordinate and data variables are separately accessible.

Select a data variable from the Dataset.

Z = ds['z'] # Similar to selecting a Series in Pandas

We can programmatically retrieve various attributes of the data variable that we browsed interactively above.

Z
<xarray.DataArray 'z' (time: 1, latitude: 721, longitude: 1440)>
[1038240 values with dtype=float32]
Coordinates:
  * longitude  (longitude) float32 0.0 0.25 0.5 0.75 ... 359.0 359.2 359.5 359.8
  * latitude   (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
  * time       (time) datetime64[ns] 2012-10-30
Attributes:
    units:          m**2 s**-2
    long_name:      Geopotential
    standard_name:  geopotential
Z.shape
(1, 721, 1440)
Z.dims
('time', 'latitude', 'longitude')

We see that Z is a 3-dimensional DataArray, with time as the first dimension.

Z.units
'm**2 s**-2'

Convert geopotential to height

Create a contour plot of gridded data

We will use Cartopy and Matplotlib to plot our 500 hPa height contours. Matplotlib has two contour methods:

  1. Line contours: ax.contour

  2. Filled contours: ax.contourf

We will draw contour lines in meters, with a contour interval of 60 m.

As we’ve done before, let’s first define some variables relevant to Cartopy.

cLon = -80
cLat = 35
lonW = -100
lonE = -60
latS = 20
latN = 50
proj_map = ccrs.LambertConformal(central_longitude=cLon, central_latitude=cLat)
proj_data = ccrs.PlateCarree() # Our data is lat-lon; thus its native projection is Plate Carree.
res = '50m'

Now define the range of our contour values and a contour interval. 60 m is standard for 500 hPa.

minVal = 4680
maxVal = 6060
cint = 60
cintervals = np.arange(minVal, maxVal, cint)
cintervals
array([4680, 4740, 4800, 4860, 4920, 4980, 5040, 5100, 5160, 5220, 5280,
       5340, 5400, 5460, 5520, 5580, 5640, 5700, 5760, 5820, 5880, 5940,
       6000])

Matplotlib’s contour methods require three arrays to be passed to them … x- and y- arrays (longitude and latitude in our case), and a 2-d array (corresponding to x- and y-) of our data variable. So we need to extract the latitude and longitude coordinate variables from our DataArray. We’ll also extract the third coordinate value, time.

lats = ds.latitude 
lons = ds.longitude
times = ds.time

We’re set to make our map. After creating our figure, setting the bounds of our map domain, and adding cartographic features, we will plot the contours. We’ll assign the output of the contour method to an object, which will then be used to label the contour lines.

fig = plt.figure(figsize=(18,12))
ax = plt.subplot(1,1,1,projection=proj_map)
ax.set_extent ([lonW,lonE,latS,latN])
ax.add_feature(cfeature.COASTLINE.with_scale(res))
ax.add_feature(cfeature.STATES.with_scale(res))
CL = ax.contour(lons,lats,HGHT,cintervals,transform=proj_data,linewidths=1.25,colors='green')
ax.clabel(CL, inline_spacing=0.2, fontsize=11, fmt='%.0f');
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[15], line 6
      4 ax.add_feature(cfeature.COASTLINE.with_scale(res))
      5 ax.add_feature(cfeature.STATES.with_scale(res))
----> 6 CL = ax.contour(lons,lats,HGHT,cintervals,transform=proj_data,linewidths=1.25,colors='green')
      7 ax.clabel(CL, inline_spacing=0.2, fontsize=11, fmt='%.0f');

File /knight/mamba_aug23/envs/aug23_env/lib/python3.11/site-packages/cartopy/mpl/geoaxes.py:318, in _add_transform.<locals>.wrapper(self, *args, **kwargs)
    313     raise ValueError(f'Invalid transform: Spherical {func.__name__} '
    314                      'is not supported - consider using '
    315                      'PlateCarree/RotatedPole.')
    317 kwargs['transform'] = transform
--> 318 return func(self, *args, **kwargs)

File /knight/mamba_aug23/envs/aug23_env/lib/python3.11/site-packages/cartopy/mpl/geoaxes.py:362, in _add_transform_first.<locals>.wrapper(self, *args, **kwargs)
    360     # Use the new points as the input arguments
    361     args = (x, y, z) + args[3:]
--> 362 return func(self, *args, **kwargs)

File /knight/mamba_aug23/envs/aug23_env/lib/python3.11/site-packages/cartopy/mpl/geoaxes.py:1614, in GeoAxes.contour(self, *args, **kwargs)
   1592 @_add_transform
   1593 @_add_transform_first
   1594 def contour(self, *args, **kwargs):
   1595     """
   1596     Add the "transform" keyword to :func:`~matplotlib.pyplot.contour`.
   1597 
   (...)
   1612 
   1613     """
-> 1614     result = super().contour(*args, **kwargs)
   1616     # We need to compute the dataLim correctly for contours.
   1617     bboxes = [col.get_datalim(self.transData)
   1618               for col in result.collections
   1619               if col.get_paths()]

File /knight/mamba_aug23/envs/aug23_env/lib/python3.11/site-packages/matplotlib/__init__.py:1442, in _preprocess_data.<locals>.inner(ax, data, *args, **kwargs)
   1439 @functools.wraps(func)
   1440 def inner(ax, *args, data=None, **kwargs):
   1441     if data is None:
-> 1442         return func(ax, *map(sanitize_sequence, args), **kwargs)
   1444     bound = new_sig.bind(ax, *args, **kwargs)
   1445     auto_label = (bound.arguments.get(label_namer)
   1446                   or bound.kwargs.get(label_namer))

File /knight/mamba_aug23/envs/aug23_env/lib/python3.11/site-packages/matplotlib/axes/_axes.py:6451, in Axes.contour(self, *args, **kwargs)
   6442 """
   6443 Plot contour lines.
   6444 
   (...)
   6448 %(contour_doc)s
   6449 """
   6450 kwargs['filled'] = False
-> 6451 contours = mcontour.QuadContourSet(self, *args, **kwargs)
   6452 self._request_autoscale_view()
   6453 return contours

File /knight/mamba_aug23/envs/aug23_env/lib/python3.11/site-packages/matplotlib/contour.py:769, in ContourSet.__init__(self, ax, levels, filled, linewidths, linestyles, hatches, alpha, origin, extent, cmap, colors, norm, vmin, vmax, extend, antialiased, nchunk, locator, transform, negative_linestyles, *args, **kwargs)
    765 if self.negative_linestyles is None:
    766     self.negative_linestyles = \
    767         mpl.rcParams['contour.negative_linestyle']
--> 769 kwargs = self._process_args(*args, **kwargs)
    770 self._process_levels()
    772 self._extend_min = self.extend in ['min', 'both']

File /knight/mamba_aug23/envs/aug23_env/lib/python3.11/site-packages/matplotlib/contour.py:1411, in QuadContourSet._process_args(self, corner_mask, algorithm, *args, **kwargs)
   1408         corner_mask = mpl.rcParams['contour.corner_mask']
   1409 self._corner_mask = corner_mask
-> 1411 x, y, z = self._contour_args(args, kwargs)
   1413 contour_generator = contourpy.contour_generator(
   1414     x, y, z, name=self._algorithm, corner_mask=self._corner_mask,
   1415     line_type=contourpy.LineType.SeparateCode,
   1416     fill_type=contourpy.FillType.OuterCode,
   1417     chunk_size=self.nchunk)
   1419 t = self.get_transform()

File /knight/mamba_aug23/envs/aug23_env/lib/python3.11/site-packages/matplotlib/contour.py:1450, in QuadContourSet._contour_args(self, args, kwargs)
   1448 elif nargs <= 4:
   1449     x, y, z_orig, *args = args
-> 1450     x, y, z = self._check_xyz(x, y, z_orig, kwargs)
   1451 else:
   1452     raise _api.nargs_error(fn, takes="from 1 to 4", given=nargs)

File /knight/mamba_aug23/envs/aug23_env/lib/python3.11/site-packages/matplotlib/contour.py:1475, in QuadContourSet._check_xyz(self, x, y, z, kwargs)
   1472 z = ma.asarray(z)
   1474 if z.ndim != 2:
-> 1475     raise TypeError(f"Input z must be 2D, not {z.ndim}D")
   1476 if z.shape[0] < 2 or z.shape[1] < 2:
   1477     raise TypeError(f"Input z must be at least a (2, 2) shaped array, "
   1478                     f"but has shape {z.shape}")

TypeError: Input z must be 2D, not 3D
../../_images/ebd8929796b7b0441f49e39e2d78079f20009a1f0848542b477f3fd32ceb7e7f.png

Didn't work! The end of the traceback reads:

TypeError: Input z must be 2D, not 3D

In other words, our geopotential height array is not two dimensional in x and y ... recall that it also has *time* as its first dimension! Even though there is only one time value in this dimension, we need to eliminate this dimension from Z.

We’ll learn other ways to abstract out dimensions other than x and y (remember, there could be additional dimensions in our Datasets or DataArrays, such as vertical levels or ensemble members) in later notebooks, but for now, we can simply select the first (and only) element from the time dimension.

HGHT = HGHT[0,:,:] # Select the first element of the leading (time) dimension, and all lons and lats.
HGHT
<xarray.DataArray (latitude: 721, longitude: 1440)>
<Quantity([[5212.1777 5212.1777 5212.1777 ... 5212.1777 5212.1777 5212.1777]
 [5209.8657 5209.8657 5209.8823 ... 5209.8154 5209.832  5209.832 ]
 [5207.554  5207.5874 5207.604  ... 5207.4873 5207.504  5207.5376]
 ...
 [4974.5474 4974.5474 4974.564  ... 4974.497  4974.497  4974.514 ]
 [4970.8623 4970.8623 4970.8623 ... 4970.845  4970.845  4970.845 ]
 [4967.161  4967.161  4967.161  ... 4967.161  4967.161  4967.161 ]], 'meter')>
Coordinates:
  * longitude  (longitude) float32 0.0 0.25 0.5 0.75 ... 359.0 359.2 359.5 359.8
  * latitude   (latitude) float32 90.0 89.75 89.5 89.25 ... -89.5 -89.75 -90.0
    time       datetime64[ns] 2012-10-30

Now try again.

fig = plt.figure(figsize=(18,12))
ax = plt.subplot(1,1,1,projection=proj_map)
ax.set_extent ([lonW,lonE,latS,latN])
ax.add_feature(cfeature.COASTLINE.with_scale(res))
ax.add_feature(cfeature.STATES.with_scale(res))
CL = ax.contour(lons,lats,HGHT,cintervals,transform=proj_data,linewidths=1.25,colors='green')
ax.clabel(CL, inline_spacing=0.2, fontsize=11, fmt='%.0f');
../../_images/3ea63e42d632b2604b4708f7de5c2936efb2ab80c441797fbb71d267e6b952b8.png

There we have it! There are definitely some things we can improve about this plot (such as the lack of an informative title, and the fact that our contour labels seem not to be appearing on every contour line), but we’ll get to that soon!


Summary

  • An Xarray Dataset consists of one or more DataArrays.

  • MetPy can perform unit-aware calculations and conversions on Xarray objects.

  • Since Xarray DataArrays often have more than x- and y- dimensions, care must be taken to reduce dimensionality for the purposes of Matplotlib functions such as contour and contourf.

What’s Next?

In the next notebook, we’ll work with multiple Datasets.