{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## 03_Xarray: Multiple Files" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview\n", "1. Work with multiple Xarray `Dataset`s\n", "2. Subset the Dataset along its dimensions\n", "3. Perform unit conversions\n", "4. Create a well-labeled multi-parameter contour plot of gridded CFSR reanalysis data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import xarray as xr\n", "import pandas as pd\n", "import numpy as np\n", "from datetime import datetime as dt\n", "from metpy.units import units\n", "import metpy.calc as mpcalc\n", "import cartopy.crs as ccrs\n", "import cartopy.feature as cfeature\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Work with two Xarray `Dataset`s" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "year = 1993\n", "ds_slp = xr.open_dataset(f'/cfsr/data/{year}/pmsl.{year}.0p5.anl.nc')\n", "ds_g = xr.open_dataset(f'/cfsr/data/{year}/g.{year}.0p5.anl.nc')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Take a look at the sizes of these datasets." ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Size of SLP dataset: 1.517948804 GB\n", "Size of geopotential height dataset: 48.573865732 GB\n" ] } ], "source": [ "print (f'Size of SLP dataset: {ds_slp.nbytes / 1e9} GB')\n", "print (f'Size of geopotential height dataset: {ds_g.nbytes / 1e9} GB')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The resulting Datasets are on the order of 50 GB in size altogether, but we aren't noticing any latency thus far when accessing this Dataset. Xarray employs what's called [lazy loading](http://xarray.pydata.org/en/stable/io.html). Key point:\n", "
\n", " \"Data is always loaded lazily from netCDF files. You can manipulate, slice and subset Dataset and DataArray objects, and no array values are loaded into memory until you try to perform some sort of actual computation.\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Subset the Datasets along their dimensions" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We noticed in the previous notebook that our contour labels were not appearing with every contour line. This is because we passed the entire horizontal extent (all latitudes and longitudes) to the `ax.contour` method. Since our intent is to plot only over a regional subset, we will use the `sel` method on the latitude and longitude dimensions as well as time and isobaric surface.\n", "\n", "We'll retrieve two data variables, geopotential and sea-level pressure, from the Dataset.\n", "\n", "We'll also use Datetime and string methods to more dynamically assign various dimensional specifications, as well as aid in figure-labeling later on." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [], "source": [ "# Areal extent\n", "lonW = -100\n", "lonE = -60\n", "latS = 25\n", "latN = 55\n", "cLat, cLon = (latS + latN)/2, (lonW + lonE)/2\n", "\n", "expand = 1\n", "latRange = np.arange(latS - expand,latN + expand,.5) # expand the data range a bit beyond the plot range\n", "lonRange = np.arange((lonW - expand),(lonE + expand),.5) # Need to match longitude values to those of the coordinate variable\n", "\n", "# Vertical level specificaton\n", "plevel = 500\n", "levelStr = str(plevel)\n", "\n", "# Date/Time specification\n", "Year = 1993\n", "Month = 3\n", "Day = 14\n", "Hour = 12\n", "Minute = 0\n", "dateTime = dt(Year,Month,Day, Hour, Minute)\n", "timeStr = dateTime.strftime(\"%Y-%m-%d %H%M UTC\")\n", "\n", "# Data variable selection\n", "\n", "Z = ds_g['g'].sel(time=dateTime,lev=plevel,lat=latRange,lon=lonRange)\n", "SLP = ds_slp['pmsl'].sel(time=dateTime,lat=latRange,lon=lonRange) # of course, no isobaric surface for SLP\n", "\n" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'500'" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "levelStr" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'1993-03-14 1200 UTC'" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "timeStr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Let's look at some of the attributes" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(64, 84)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Z.shape" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('lat', 'lon')" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Z.dims" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As a result of selecting just a single time and isobaric surface, both of those dimensions have been abstracted out of the DataArray." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'gpm'" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "Z.units" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "(64, 84)" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "SLP.shape" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "('lat', 'lon')" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "SLP.dims" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "'Pa'" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "SLP.units" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Define our subsetted coordinate arrays of lat and lon. Pull them from either of the two DataArrays. We'll need to pass these into the contouring functions later on." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [], "source": [ "lats = Z.lat\n", "lons = Z.lon" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<xarray.DataArray 'lat' (lat: 64)>\n", "array([24. , 24.5, 25. , 25.5, 26. , 26.5, 27. , 27.5, 28. , 28.5, 29. , 29.5,\n", " 30. , 30.5, 31. , 31.5, 32. , 32.5, 33. , 33.5, 34. , 34.5, 35. , 35.5,\n", " 36. , 36.5, 37. , 37.5, 38. , 38.5, 39. , 39.5, 40. , 40.5, 41. , 41.5,\n", " 42. , 42.5, 43. , 43.5, 44. , 44.5, 45. , 45.5, 46. , 46.5, 47. , 47.5,\n", " 48. , 48.5, 49. , 49.5, 50. , 50.5, 51. , 51.5, 52. , 52.5, 53. , 53.5,\n", " 54. , 54.5, 55. , 55.5], dtype=float32)\n", "Coordinates:\n", " time datetime64[ns] 1993-03-14T12:00:00\n", " * lat (lat) float32 24.0 24.5 25.0 25.5 26.0 ... 53.5 54.0 54.5 55.0 55.5\n", " lev float32 500.0\n", "Attributes:\n", " actual_range: [-90 90]\n", " delta_y: 0.5\n", " coordinate_defines: center\n", " mapping: cylindrical_equidistant_projection_grid\n", " grid_resolution: 0.5_degrees\n", " long_name: latitude\n", " units: degrees_north
<xarray.DataArray 'lon' (lon: 84)>\n", "array([-101. , -100.5, -100. , -99.5, -99. , -98.5, -98. , -97.5, -97. ,\n", " -96.5, -96. , -95.5, -95. , -94.5, -94. , -93.5, -93. , -92.5,\n", " -92. , -91.5, -91. , -90.5, -90. , -89.5, -89. , -88.5, -88. ,\n", " -87.5, -87. , -86.5, -86. , -85.5, -85. , -84.5, -84. , -83.5,\n", " -83. , -82.5, -82. , -81.5, -81. , -80.5, -80. , -79.5, -79. ,\n", " -78.5, -78. , -77.5, -77. , -76.5, -76. , -75.5, -75. , -74.5,\n", " -74. , -73.5, -73. , -72.5, -72. , -71.5, -71. , -70.5, -70. ,\n", " -69.5, -69. , -68.5, -68. , -67.5, -67. , -66.5, -66. , -65.5,\n", " -65. , -64.5, -64. , -63.5, -63. , -62.5, -62. , -61.5, -61. ,\n", " -60.5, -60. , -59.5], dtype=float32)\n", "Coordinates:\n", " time datetime64[ns] 1993-03-14T12:00:00\n", " * lon (lon) float32 -101.0 -100.5 -100.0 -99.5 ... -60.5 -60.0 -59.5\n", " lev float32 500.0\n", "Attributes:\n", " actual_range: [-180. 179.5]\n", " delta_x: 0.5\n", " coordinate_defines: center\n", " mapping: cylindrical_equidistant_projection_grid\n", " grid_resolution: 0.5_degrees\n", " long_name: longitude\n", " units: degrees_east
<xarray.DataArray 'g' (lat: 64, lon: 84)>\n", "<Quantity([[578.7279 578.53784 578.35785 ... 589.29584 589.3879 589.4259 ]\n", " [578.09985 577.9659 577.46185 ... 589.2239 589.3619 589.44183]\n", " [577.4139 577.3139 576.54584 ... 589.1059 589.2799 589.38983]\n", " ...\n", " [519.66187 518.9299 518.1839 ... 514.55585 515.2779 516.09985]\n", " [518.65186 517.93787 517.2099 ... 513.29584 514.0199 514.6259 ]\n", " [517.60785 516.9279 516.2239 ... 511.42587 512.1679 512.9239 ]], 'decameter')>\n", "Coordinates:\n", " time datetime64[ns] 1993-03-14T12:00:00\n", " * lat (lat) float32 24.0 24.5 25.0 25.5 26.0 ... 53.5 54.0 54.5 55.0 55.5\n", " * lon (lon) float32 -101.0 -100.5 -100.0 -99.5 ... -60.5 -60.0 -59.5\n", " lev float32 500.0\n", "Attributes:\n", " level_type: isobaric_level (hPa)\n", " long_name: height