{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Xarray 5: Time Series" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview\n", "1. Work with an ERA-5 `Dataset` hosted on NCAR's Research Data Archive\n", "2. Subset the Dataset along its dimensions\n", "3. Perform diagnostic calculations and unit conversions\n", "4. Perform Split-Apply-Combine\n", "5. Create and refine a time series plot of minimum SLP and maximum windspeed over the subsetted region" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prerequisites\n", "\n", "| Concepts | Importance | Notes |\n", "| --- | --- | --- |\n", "| Xarray Lessons 1-4| Necessary | |\n", "\n", "* **Time to learn**: 30 minutes\n", "***" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import xarray as xr\n", "import pandas as pd\n", "import numpy as np\n", "from datetime import datetime as dt\n", "from metpy.units import units\n", "import metpy.calc as mpcalc\n", "import cartopy.crs as ccrs\n", "import cartopy.feature as cfeature\n", "import matplotlib.pyplot as plt\n", "from matplotlib.dates import DateFormatter, AutoDateLocator,HourLocator,DayLocator,MonthLocator\n", "from matplotlib.ticker import (MultipleLocator, AutoMinorLocator)\n", "import requests" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Work with an ERA-5 Dataset hosted on NCAR's [Remote Data Archive](https://rda.ucar.edu)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create OpenDAP URLs pointing to the variables as they are stored in RDA's THREDDS server" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
Note: Single-level fields are stored in one monthly file. Fields on pressure levels are stored in daily files. In this case, we will retrieve SLP and 10-meter winds, so we will be using the sfc directory for all three variables.\n", "\n", "
\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds_urlu = 'https://thredds.rda.ucar.edu/thredds/dodsC/files/g/ds633.0/e5.oper.an.sfc/201210/e5.oper.an.sfc.128_165_10u.ll025sc.2012100100_2012103123.nc'\n", "ds_urlv = 'https://thredds.rda.ucar.edu/thredds/dodsC/files/g/ds633.0/e5.oper.an.sfc/201210/e5.oper.an.sfc.128_166_10v.ll025sc.2012100100_2012103123.nc'\n", "ds_urlp = 'https://thredds.rda.ucar.edu/thredds/dodsC/files/g/ds633.0/e5.oper.an.sfc/201210/e5.oper.an.sfc.128_151_msl.ll025sc.2012100100_2012103123.nc'" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now open the three files as separate `Dataset`s" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dsSLP = xr.open_dataset(ds_urlp)\n", "dsU10 = xr.open_dataset(ds_urlu)\n", "dsV10 = xr.open_dataset(ds_urlv)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Examine one of the `Dataset`s." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "dsU10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create `DataArray` objects for each of the three variables in the `Dataset`s." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "slp = dsSLP['MSL']\n", "u10 = dsU10['VAR_10U']\n", "v10 = dsV10['VAR_10V']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Subset the Dataset along its dimensions." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As we did before, we will perform temporal and areal subsetting." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Take advantage of Pandas' date/time methods. In this case, we will use Pandas [date_range](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.date_range.html)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's define a function whose output we will use to label our plotted time series." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "timeRange = pd.date_range(start=\"2012-10-27 00:00\",end=\"2012-10-31 00:00\",freq='1H')\n", "timeRange" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "def areaBox(lonW,lonE,latS,latN):\n", " '''\n", " Returns a string containing the bounds of the selected region\n", " '''\n", " outputStr = ''\n", " for val in [lonW, lonE]:\n", " if (val < 0):\n", " val *= -1\n", " valStr = str(val) + 'W '\n", " else:\n", " valStr = str(val) + 'E '\n", " outputStr = outputStr + valStr\n", " for val in [latS, latN]:\n", " if (val< 0): \n", " val *= -1\n", " valStr = str(val) + 'S '\n", " else:\n", " valStr = str(val) + 'N '\n", " outputStr = outputStr + valStr\n", " \n", " return outputStr" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Areal extent\n", "lonW = -100\n", "lonE = -60\n", "latS = 20\n", "latN = 50\n", "domainStr = areaBox(lonW,lonE,latS,latN)\n", "latRange = np.arange(latS-5,latN+5,.25) # expand the data range a bit beyond the plot range\n", "lonRange = np.arange((lonW-5+360),(lonE+5+360),.25) # Need to match longitude values to those of the coordinate variable\n", "\n", "# Vertical level specificaton\n", "vlevel = 10\n", "levelStr = str(vlevel)\n", "\n", "# Data variable selection\n", "SLP = dsSLP['MSL'].sel(time=timeRange,latitude=latRange,longitude=lonRange)\n", "U10 = dsU10['VAR_10U'].sel(time=timeRange,latitude=latRange,longitude=lonRange)\n", "V10 = dsV10['VAR_10V'].sel(time=timeRange,latitude=latRange,longitude=lonRange)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Examine a couple of the DataArrays following our subsetting" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "SLP" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "U10" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Xarray has [Split-Apply-Combine](http://xarray.pydata.org/en/stable/groupby.html) functionality similar to Pandas. In this next cell, we are grouping by date/time and then calculating the minimum value of SLP over the other two dimensions." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "SLP.min(dim=['latitude','longitude'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Call Xarray's built-in `plot` method to get a quick look at the time series. Note that we use the Python convention where `_` is an object corresponding to the output of the previous line of code." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "_.plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Perform unit conversions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convert meters per second to knots, and Pascals to hectoPascals. Both are straightforward with MetPy's `convert_units` method." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "SLP = SLP.metpy.convert_units('hPa')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "U10 = U10.metpy.convert_units('kts')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "V10 = V10.metpy.convert_units('kts')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Calculate wind speed from the u- and v- components using MetPy's diagnostic function, `wind_speed`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "WSPD = mpcalc.wind_speed(U10,V10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Perform Split-Apply-Combine" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Compute the minimum SLP, and then the maximum wind speed, for all latitude and longitude points ... for each time." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "SLPmin = SLP.min(dim=['latitude','longitude'])\n", "print(SLPmin)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "WSPDmax = WSPD.max(dim=['latitude','longitude'])" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " \n", "We'll take a deeper dive into Xarray's *Split-Apply-Combine* implementation in the next notebook. \n", "\n", "Here, we've effectively *reduced* the data array down to one dimension (time)!\n", "\n", "
\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Create and refine a time series plot of minimum SLP and maximum windspeed over the subsetted region." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a meaningful title string." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tl1 = \"ERA-5 maximum \" + levelStr + \" m windspeed (kts) and minimum SLP (hPa)\"\n", "tl2 = str('Valid within: '+ domainStr)\n", "title_line = (tl1 + '\\n' + tl2 + '\\n')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Improve the look of the time-series figure by using Seaborn." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import seaborn as sns\n", "sns.set()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Plot two y-axes; one for SLP, the other for windspeed" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "scrolled": true }, "outputs": [], "source": [ "fig = plt.figure(figsize=(18,12))\n", "ax1 = plt.subplot(1,1,1)\n", "color = 'tab:red'\n", "ax1.plot(SLP.time,SLPmin,color=color,label = 'Min SLP')\n", "\n", "ax1.set_title (title_line,fontsize=14)\n", "\n", "ax2 = ax1.twinx() # instantiate a second axes that shares the same x-axis\n", "color = 'tab:blue'\n", "ax2.plot(WSPD.time,WSPDmax,color=color,label = 'Max Wind');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " \n", "What's the meaning of `tab:` in our `color` definition? It's one of many [color table specifications](https://matplotlib.org/stable/gallery/color/colormap_reference.html) available in Matplotlib; in this case, *tab* refers to a [color table](https://matplotlib.org/stable/tutorials/colors/colors.html) derived from a commercial dataviz package, [Tableau](https://www.tableau.com).\n", "\n", "
\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### That's not a bad first plot, but much can be done to improve it. Here we use several Matplotlib `Axes` methods to customize the plot. " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
Often the best way to learn about these methods (and others) is to browse the relevant documentation at https://matplotlib.org and play around with them. You may be the type who always wants to keep making it look better, but avoid the temptation to keep trying for perfection.\n", "
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig = plt.figure(figsize=(18,12),dpi=150)\n", "ax1 = plt.subplot(1,1,1)\n", "\n", "# First Axes: SLP\n", "color = 'tab:red'\n", "ax1.plot(SLP.time,SLPmin,color=color,label = 'Min SLP')\n", "ax1.grid(color='black',linewidth=0.5)\n", "\n", "ax1.xaxis.set_major_locator(HourLocator(interval=6))\n", "ax1.xaxis.set_minor_locator(HourLocator(interval=1))\n", "dateFmt = DateFormatter('%H')\n", "ax1.xaxis.set_major_formatter(dateFmt)\n", "ax1.set_xlabel('Date/Time (UTC)')\n", "ax1.set_xlim(SLP.time[0],SLP.time[-1])\n", "\n", "ax1.set_ylabel('SLP (hPa)',color=color, fontsize=14)\n", "ax1.yaxis.set_minor_locator(MultipleLocator(5))\n", "ax1.tick_params(which='minor', length=6)\n", "ax1.tick_params(which='major',axis='y', labelcolor=color,direction='out', length=6, width=2, colors=color,\n", " grid_color=color, grid_alpha=0.5)\n", "ax1.tick_params(which='minor', axis = 'y', length=4, color=color)\n", "\n", "ax1.set_title (title_line,fontsize=14)\n", "\n", "# Second Axes: Windspeed\n", "ax2 = ax1.twinx() # instantiate a second axes that shares the same x-axis\n", "color = 'tab:blue'\n", "ax2.plot(WSPD.time,WSPDmax,color='tab:blue',label = 'Max Wind')\n", "# Do not draw grid lines (makes plot more readable)\n", "ax2.grid(visible=None)\n", "\n", "ax2.set_ylabel('Windspeed (kts)', color=color, fontsize=14) # we already handled the x-label with ax1\n", "ax2.tick_params(axis='y', labelcolor=color,direction='out', length=6, width=2, colors=color,\n", " grid_color=color, grid_alpha=0.5) # grid-related specs will be ignored since we are not plotting grid lines for this axes\n", "\n", "fig.legend(loc='upper right',frameon=True,bbox_to_anchor=(0.9,1.1), bbox_transform=ax1.transAxes,fontsize=16,shadow=True,edgecolor='purple')\n", "fig.autofmt_xdate(rotation=45);" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Save our figure to our current directory." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig.savefig('ERA5_Time_Series.png')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Summary\n", "* Xarray's Pandas-like support for time arrays allows for informative time-series plots.\n", "* Xarray's implements a split-apply-combine framework that is also Pandas-like.\n", "\n", "### What's Next?\n", "In the next notebook, we'll perform *Split-Apply-Combine* on an Xarray sea-surface temperature dataset.\n", "\n", "## Resources and References\n", "1. [Xarray Time Series](http://xarray.pydata.org/en/stable/user-guide/time-series.html)\n", "1. [Matplotlib's Date and Time library](https://matplotlib.org/stable/api/dates_api.html)\n", "1. [Matplotlib's tick locator and formatter library](https://matplotlib.org/stable/api/dates_api.html)\n", "1. [Matplotlib: two axis scales](https://matplotlib.org/gallery/subplots_axes_and_figures/two_scales.html)\n", "1. [Matplotlib: multiple axis legend I](https://stackoverflow.com/questions/14344063/single-legend-for-multiple-axes)\n", "1. [Matplotlib: multiple axis legend II](https://stackoverflow.com/questions/5484922/secondary-axis-with-twinx-how-to-add-to-legend)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 August 2023 Environment", "language": "python", "name": "aug23" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.4" } }, "nbformat": 4, "nbformat_minor": 4 }