{ "cells": [ { "cell_type": "markdown", "id": "5e5cd21c-cab2-483a-8988-511cbe9821a5", "metadata": {}, "source": [ "# Shapefiles 2: Clip a raster within a shapefile" ] }, { "cell_type": "markdown", "id": "63c244d7-4a41-49d2-8b98-6aeeedd9fba8", "metadata": {}, "source": [ "### This notebook plots a raster (gridded) dataset and clips it to a vector shapefile." ] }, { "cell_type": "markdown", "id": "8719466e-453a-4407-b66f-bad48391d3a3", "metadata": {}, "source": [ "## Overview:\n", "1. Use **Xarray** to read in a raster file; in this case, the most recent NCEP Real-time surface analysis (RTMA)\n", "1. Use **MetPy**'s `parse_cf` method\n", "1. Read in the Natural Earth state/province shapefile, georeferenced with **Geopandas**\n", "1. Use **Matplotlib** and **Cartopy** methods to clip the raster field to fit exactly within the shapefile" ] }, { "cell_type": "markdown", "id": "18c3e4b3-1224-466f-84f7-830af95a7e0e", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": null, "id": "5d933080-670a-478e-926c-f9ca79a400fe", "metadata": { "tags": [] }, "outputs": [], "source": [ "import cartopy.crs as ccrs\n", "import cartopy.feature as cfeature\n", "from cartopy.mpl.patch import geos_to_path\n", "import cartopy.io.shapereader as shpreader\n", "from matplotlib import pyplot as plt\n", "import matplotlib.path as mpath\n", "import numpy as np\n", "import xarray as xr\n", "import os\n", "from datetime import datetime, timedelta\n", "os.environ['USE_PYGEOS'] = '0' # Eliminate warning message regarding PyGeos and Shapely with Geopandas \n", "import geopandas as gpd" ] }, { "cell_type": "markdown", "id": "f9a84629-6744-4695-a919-d1d018aa1a3c", "metadata": {}, "source": [ "Select the regional bounds to subset the map (and, eventually, the data)" ] }, { "cell_type": "code", "execution_count": null, "id": "72c8bdc7-9204-4dd8-b38a-77a4c19ecf6f", "metadata": { "tags": [] }, "outputs": [], "source": [ "mapExtent = [-80,-71.3,40.2,45]" ] }, { "cell_type": "markdown", "id": "fd313a6f-2315-4a6c-af44-eb2823c6ae6e", "metadata": {}, "source": [ "Specify the output map projection (Lambert Conformal, centered on New York State)" ] }, { "cell_type": "code", "execution_count": null, "id": "8804977c-dec9-478a-aa8e-4e03f9e56836", "metadata": { "tags": [] }, "outputs": [], "source": [ "lon1 = -75\n", "lat1 = 42.5\n", "slat = 42.5\n", "\n", "proj_map= ccrs.LambertConformal(central_longitude=lon1,\n", " central_latitude=lat1,\n", " standard_parallels=[slat,slat])" ] }, { "cell_type": "markdown", "id": "ffa86f78-3d48-4b2a-a0c9-0e9d7c43ea6d", "metadata": {}, "source": [ "## Access real-time NWP data from Unidata's THREDDS server" ] }, { "cell_type": "markdown", "id": "e85d1aff-cc72-408c-92a1-e8cde667a331", "metadata": {}, "source": [ "Parse the current time and choose a recent hour when we might expect the model run to be complete. There is normally a ~90 minute lag, but we'll give it a bit more than that." ] }, { "cell_type": "code", "execution_count": null, "id": "f8d0673d-9d98-4ad4-9e33-0d6c48372d7a", "metadata": {}, "outputs": [], "source": [ "now = datetime.now()\n", "year = now.year\n", "month = now.month\n", "day = now.day\n", "hour = now.hour\n", "minute = now.minute\n", "print (year, month, day, hour,minute)" ] }, { "cell_type": "code", "execution_count": null, "id": "5707f6f9-c61c-4893-9111-520f54b74dbe", "metadata": {}, "outputs": [], "source": [ "# The previous hour's data is typically available shortly after the half-hour, but give it a little extra time \n", "if (minute < 45): \n", " hrDelta = 3\n", "else:\n", " hrDelta = 2\n", "runTime = now - timedelta(hours=hrDelta)\n", "runTimeStr = runTime.strftime('%Y%m%d %H00 UTC')\n", "modelDate = runTime.strftime('%Y%m%d')\n", "modelHour = runTime.strftime('%H')\n", "modelMin = runTime.strftime('%M')\n", "modelDay = runTime.strftime('%D')\n", "titleTime = modelDay + ' ' + modelHour + '00 UTC'\n", "print (modelDay)\n", "print (modelHour)\n", "print (runTimeStr)\n", "print(titleTime)" ] }, { "cell_type": "code", "execution_count": null, "id": "be52794e-8199-434a-9661-736cc18f94a2", "metadata": { "tags": [] }, "outputs": [], "source": [ "runTime.strftime('%B')" ] }, { "cell_type": "markdown", "id": "1b0f2924-a748-463f-a41f-b042ee5b6bf1", "metadata": {}, "source": [ "Open the most recent NCEP real-time mesoscale analysis (RTMA) from [Unidata's THREDDS server](https://thredds.ucar.edu/thredds/catalog/grib/NCEP/RTMA/CONUS_2p5km/catalog.html)" ] }, { "cell_type": "code", "execution_count": null, "id": "a9bd8a7e-d93c-4be4-b960-4e765aea1eb2", "metadata": { "tags": [] }, "outputs": [], "source": [ "ds = xr.open_dataset(f'https://thredds.ucar.edu/thredds/dodsC/grib/NCEP/RTMA/CONUS_2p5km/RTMA_CONUS_2p5km_{modelDate}_{modelHour}00.grib2')" ] }, { "cell_type": "code", "execution_count": null, "id": "cb0b44aa-b795-4a96-bebc-cea0368a017e", "metadata": { "tags": [] }, "outputs": [], "source": [ "ds" ] }, { "cell_type": "markdown", "id": "c8788b36-25ff-4ef6-9cd0-ee076598aace", "metadata": {}, "source": [ "Leverage MetPy's `parse_cf` function, which among other things will make it easy to extract the dataset's map projection." ] }, { "cell_type": "code", "execution_count": null, "id": "43bcfd6f-e85d-4e60-bce4-4a4abea1d338", "metadata": { "tags": [] }, "outputs": [], "source": [ "ds = ds.metpy.parse_cf()" ] }, { "cell_type": "code", "execution_count": null, "id": "3d276d87-0dd5-4ae6-8b91-de223a512a75", "metadata": { "tags": [] }, "outputs": [], "source": [ "ds" ] }, { "cell_type": "markdown", "id": "82cd9a44-2f62-49f9-842d-9d5c552e067a", "metadata": {}, "source": [ "Note that there now exists a `metpy_crs` coordinate variable, which we'll soon use to determine and use the `Dataset`'s map projection." ] }, { "cell_type": "markdown", "id": "a34c40f5-a535-4ccd-9439-dcd16bc920eb", "metadata": {}, "source": [ "Select one of the variables from the `Dataset`; in this case, we'll choose 2-meter temperature." ] }, { "cell_type": "code", "execution_count": null, "id": "0e3bd78a-ecb0-45d0-809d-7a4003f21fd8", "metadata": { "tags": [] }, "outputs": [], "source": [ "var = ds.Temperature_Analysis_height_above_ground" ] }, { "cell_type": "code", "execution_count": null, "id": "e0b854f8-1eaf-4272-8720-03dc9667ec00", "metadata": { "tags": [] }, "outputs": [], "source": [ "var" ] }, { "cell_type": "markdown", "id": "af7858a4-78c9-4811-a407-55e64a2dd552", "metadata": {}, "source": [ "Note that this `DataArray` has 3 dimensions; the time dimension has just one element. Let's eliminate that time dimension using the `.squeeze` function." ] }, { "cell_type": "code", "execution_count": null, "id": "41845840-5c4b-4e34-82fc-086339166c43", "metadata": { "tags": [] }, "outputs": [], "source": [ "var = var.squeeze()" ] }, { "cell_type": "code", "execution_count": null, "id": "d6045752-93a2-4ae9-998a-815ed14b7a2b", "metadata": { "tags": [] }, "outputs": [], "source": [ "var" ] }, { "cell_type": "markdown", "id": "a38a2a70-2ae6-4a10-b00d-80eb1442d913", "metadata": {}, "source": [ "Now we can assign the `Dataset`'s map projection." ] }, { "cell_type": "code", "execution_count": null, "id": "b69cf0b7-8da0-40ab-b14d-867a76987b43", "metadata": { "tags": [] }, "outputs": [], "source": [ "proj_data = var.metpy.cartopy_crs" ] }, { "cell_type": "code", "execution_count": null, "id": "4a76e746-b50b-4b43-aba0-0d6f320dede0", "metadata": { "tags": [] }, "outputs": [], "source": [ "proj_data" ] }, { "cell_type": "markdown", "id": "6e3677f0-a5ad-4b36-94b2-f15fee1ec7c1", "metadata": {}, "source": [ "Read in the 10m state/province shapefile" ] }, { "cell_type": "code", "execution_count": null, "id": "5d099432-b3c6-4bfb-8a5b-8026ee5011c2", "metadata": {}, "outputs": [], "source": [ "res = '10m'\n", "category = 'cultural'\n", "name = 'admin_1_states_provinces'\n", "states = shpreader.natural_earth(res, category, name)" ] }, { "cell_type": "markdown", "id": "ce1c2ad0-68cf-4dfc-8ec0-285cc9164b23", "metadata": {}, "source": [ "Read in the shapefile as a Geopandas dataframe" ] }, { "cell_type": "code", "execution_count": null, "id": "3071a9b7-eec7-4ecd-88fa-23895fc7094c", "metadata": { "tags": [] }, "outputs": [], "source": [ "gdf = gpd.read_file(states)" ] }, { "cell_type": "markdown", "id": "a09757cf-e3c6-4e43-8a58-881857a36bcf", "metadata": {}, "source": [ "Subset the Geopandas dataframe to just NYS and reset the index" ] }, { "cell_type": "code", "execution_count": null, "id": "05fef2c4-6d1d-416f-b76c-0b7ea67fcc91", "metadata": { "tags": [] }, "outputs": [], "source": [ "gdf = gdf[gdf['name_en'] == 'New York'].reset_index(drop=True)" ] }, { "cell_type": "code", "execution_count": null, "id": "d5c24671-bd69-4353-af27-9c7746b1bf70", "metadata": { "tags": [] }, "outputs": [], "source": [ "gdf" ] }, { "cell_type": "markdown", "id": "e9de3b4f-f53e-471d-b64a-15025ee47082", "metadata": {}, "source": [ "Express the `geometry` in the desired map projection and create a new column for the output geometry" ] }, { "cell_type": "code", "execution_count": null, "id": "ea061e33-5672-4f2f-9d31-0faef3a36c46", "metadata": { "tags": [] }, "outputs": [], "source": [ "nystate_LCC = gdf.geometry.to_crs(proj_map)" ] }, { "cell_type": "code", "execution_count": null, "id": "32dd6bf2-e90e-472f-bf15-83ea78803076", "metadata": { "tags": [] }, "outputs": [], "source": [ "gdf['geometryOutput'] = gdf.geometry.to_crs(proj_map)" ] }, { "cell_type": "code", "execution_count": null, "id": "65d87c8f-295d-4aa9-89c5-72524f432ce1", "metadata": { "tags": [] }, "outputs": [], "source": [ "gdf" ] }, { "cell_type": "markdown", "id": "e80ab994-9158-4d36-ae8a-f769a52dce6e", "metadata": {}, "source": [ "Express the output map projection geometry as a Matplotlib `path`. We pass in a Shapely geometry object from the first (and in this case, only) row to Cartopy's `geos_to_path` function." ] }, { "cell_type": "code", "execution_count": null, "id": "b07bb9e4-2018-4c21-b52f-999f0edbdf98", "metadata": { "tags": [] }, "outputs": [], "source": [ "path = geos_to_path(gdf.loc[0,'geometryOutput'])" ] }, { "cell_type": "code", "execution_count": null, "id": "30ce7a80-69a2-415f-9006-9afded214a6c", "metadata": { "tags": [] }, "outputs": [], "source": [ "var" ] }, { "cell_type": "code", "execution_count": null, "id": "275aeb52-046b-4a20-b481-8ea77ffc216d", "metadata": { "tags": [] }, "outputs": [], "source": [ "fig, ax1 = plt.subplots(nrows=1, figsize=(15,10), subplot_kw={\"projection\": proj_map})\n", "\n", "x, y = var.x, var.y\n", "\n", "mesh = ax1.pcolormesh(x, y, var, transform=proj_data)\n", "# Make a colorbar\n", "cbar = fig.colorbar(mesh,shrink=0.5)\n", "cbar.set_label(r'2m Temperature ($^\\circ$C)', size='large')" ] }, { "cell_type": "markdown", "id": "030f992c-6148-4cc9-af09-51a27aad318d", "metadata": {}, "source": [ "Now, let's use some Matplotlib magic to clip the raster data so it only is displayed within the NYS 10m shapefile." ] }, { "cell_type": "code", "execution_count": null, "id": "7b3f4ae9-e347-43ba-9ff6-80df6ecfc0a0", "metadata": { "tags": [] }, "outputs": [], "source": [ "poly_path = path\n", "poly_compound_path = mpath.Path.make_compound_path(*poly_path)\n", "poly_transform = proj_map._as_mpl_transform(ax1)\n", "mesh.set_clip_path(poly_compound_path, poly_transform)" ] }, { "cell_type": "code", "execution_count": null, "id": "6ece5b30-31fb-498d-90a6-b57fdc4f1cb4", "metadata": { "tags": [] }, "outputs": [], "source": [ "res='10m'\n", "\n", "fig, ax1 = plt.subplots(nrows=1, figsize=(15,10), subplot_kw={\"projection\": proj_map})\n", "\n", "ax1.set_extent(mapExtent,crs=ccrs.PlateCarree())\n", "\n", "mesh = ax1.pcolormesh(x, y, var, transform=proj_data)\n", "# Make a colorbar\n", "cbar = fig.colorbar(mesh,shrink=0.5)\n", "cbar.set_label(f'Temperature ($^\\circ$C)', size='large')\n", "ax1.coastlines()\n", "\n", "# Get the path of the polygons\n", "#poly_path = nypath\n", "poly_path = path\n", "poly_compound_path = mpath.Path.make_compound_path(*poly_path)\n", "poly_transform = proj_map._as_mpl_transform(ax1)\n", "mesh.set_clip_path(poly_compound_path, poly_transform)\n", "ax1.set_title(f'RTMA 2m Temperature ($^\\circ$C), {titleTime}');" ] }, { "cell_type": "markdown", "id": "31f4cacd-04ea-485f-9c4f-db77d49fc225", "metadata": {}, "source": [ "---\n", "### Things to try:\n", "1. Overlay additional data (point-based, contour-based, etc.) on the final map\n", "1. Use the data variable's metadata to dynamically create the figure title and/or colorbar label.\n", "1. When you create the Xarray dataset, use `sel` to restrict the spatial extent of the data retrieval. \n", "\n", "## Resources and References\n", "1. [Cartopy GitHub Issue regarding how to clip](https://github.com/SciTools/cartopy/issues/2251)\n", "1. [Metpy's Xarray accessors](https://unidata.github.io/MetPy/latest/api/generated/metpy.xarray.html)\n", "1. [Cartopy's Matplotlib-Shapely utilities ](https://scitools.org.uk/cartopy/docs/latest/reference/matplotlib.html#module-cartopy.mpl.patch)\n", "1. [NCEP's RTMA](https://www.nco.ncep.noaa.gov/pmb/products/rtma/)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 August 2023 Environment", "language": "python", "name": "aug23" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.11.4" } }, "nbformat": 4, "nbformat_minor": 5 }