{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "## 04_Xarray: Overlays from a cloud-served ERA5 archive" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview\n", "1. Work with a cloud-served ERA5 archive\n", "2. Subset the Dataset along its dimensions\n", "3. Perform unit conversions\n", "4. Create a well-labeled multi-parameter contour plot of gridded ERA5 data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import xarray as xr\n", "import pandas as pd\n", "import numpy as np\n", "from datetime import datetime as dt\n", "from metpy.units import units\n", "import metpy.calc as mpcalc\n", "import cartopy.crs as ccrs\n", "import cartopy.feature as cfeature\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Select the region, time, and (if applicable) vertical level(s) of interest." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Areal extent\n", "\n", "lonW = -100\n", "lonE = -60\n", "latS = 25\n", "latN = 55\n", "cLat, cLon = (latS + latN)/2, (lonW + lonE)/2\n", "\n", "# Recall that in ERA5, longitudes run between 0 and 360, not -180 and 180\n", "if (lonW < 0 ):\n", " lonW = lonW + 360\n", "if (lonE < 0 ):\n", " lonE = lonE + 360\n", " \n", "expand = 1\n", "latRange = np.arange(latS - expand,latN + expand,.25) # expand the data range a bit beyond the plot range\n", "lonRange = np.arange((lonW - expand),(lonE + expand),.25) # Need to match longitude values to those of the coordinate variable\n", "\n", "# Vertical level specification\n", "plevel = 500\n", "levelStr = str(plevel)\n", "\n", "# Date/Time specification\n", "Year = 1998\n", "Month = 5\n", "Day = 31\n", "Hour = 18\n", "Minute = 0\n", "dateTime = dt(Year,Month,Day, Hour, Minute)\n", "timeStr = dateTime.strftime(\"%Y-%m-%d %H%M UTC\")" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Work with a cloud-served ERA5 archive\n", "\n", "A team at [Google Research & Cloud](https://research.google/) is making parts of the [ECMWF Reanalysis version 5](https://www.ecmwf.int/en/forecasts/dataset/ecmwf-reanalysis-v5) (aka **ERA5**) accessible in an [Analysis Ready, Cloud Optimized](https://www.frontiersin.org/articles/10.3389/fclim.2021.782909/full) (aka **ARCO**) format.\n", "\n", "Access the [ERA5 ARCO](https://weatherbench2.readthedocs.io/en/latest/) catalog.\n", "\n", "The cloud-served ERA5 archive runs from January 1, 1959 through January 10, 2023. For later dates, we'll instead load a local archive." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### If the requested date is later than January 10, 2023, read in the dataset from a locally stored archive."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# The cloud-served archive ends on January 10, 2023\n", "endDate = dt(2023,1,10)\n", "if (dateTime <= endDate):\n", "\n", " # Open the cloud-hosted Zarr store lazily; no data values are loaded yet\n", " cloud_source = True\n", " ds = xr.open_dataset(\n", " 'gs://weatherbench2/datasets/era5/1959-2023_01_10-wb13-6h-1440x721.zarr', \n", " chunks={'time': 48},\n", " consolidated=True,\n", " engine='zarr'\n", ")\n", "\n", "else: \n", " # Otherwise, open all NetCDF files in the local archive as a single Dataset\n", " import glob, os\n", " cloud_source = False\n", " input_directory = '/free/ktyle/era5'\n", " files = glob.glob(os.path.join(input_directory,'*.nc'))\n", " ds = xr.open_mfdataset(files)\n", "\n", "# Report the total (uncompressed) size of the Dataset\n", "print(f'size: {ds.nbytes / (1024 ** 4):.1f} TB')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The cloud-hosted dataset is **big** (40+ TB)! But Xarray reads only enough of it to show us the dimensions, coordinate and data variables, and other metadata. This is called *lazy loading*, as opposed to *eager loading*, which we will do only after we have subset the dataset over time, lat-lon, and vertical level." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Examine the dataset" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "ds" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "