{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", "\n", "# Pandas 2: NYS Mesonet Map\n", "---" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "## Overview\n", "In this notebook, we'll use Cartopy, Matplotlib, and Pandas (with a little help from [MetPy](https://unidata.github.io/MetPy)) to read in, manipulate, and visualize data from the [New York State Mesonet](https://www2.nysmesonet.org).\n", "\n", "We'll focus on a particular time on the evening of 1 September 2021, when the remnants of Hurricane Ida were impacting the greater New York City region." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prerequisites\n", "\n", "| Concepts | Importance | Notes |\n", "| --- | --- | --- |\n", "| Matplotlib | Necessary | |\n", "| Cartopy | Necessary | |\n", "| Pandas | Necessary | Intro |\n", "| MetPy | Helpful | |\n", "\n", "* **Time to learn**: 30 minutes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "___\n", "## Imports" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd\n", "from cartopy import crs as ccrs\n", "from cartopy import feature as cfeature\n", "from metpy.calc import wind_components\n", "from metpy.units import units\n", "from metpy.plots import StationPlot" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create Pandas `DataFrame` objects based on two csv files; one contains the site locations, while the other contains the hourly data." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nysm_sites = pd.read_csv('/spare11/atm533/data/nysm_sites.csv')\n", "nysm_data = pd.read_csv('/spare11/atm533/data/nysm_data_2021090202.csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create `Series` objects for several columns from the two `DataFrames`." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, remind ourselves of the column names for these two `DataFrames`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nysm_sites.columns" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nysm_data.columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create several `Series` objects for particular columns of interest. Pay attention to which `DataFrame` provides the particular `Series`!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "stid = nysm_sites['stid']\n", "lats = nysm_sites['lat']\n", "lons = nysm_sites['lon']\n", "\n", "time = nysm_data['time']\n", "tmpc = nysm_data['temp_2m [degC]']\n", "rh = nysm_data['relative_humidity [percent]']\n", "pres = nysm_data['station_pressure [mbar]']\n", "wspd = nysm_data['max_wind_speed_prop [m/s]']\n", "drct = nysm_data['wind_direction_prop [degrees]']\n", "pinc = nysm_data['precip_incremental [mm]']\n", "ptot = nysm_data['precip_local [mm]']\n", "pint = nysm_data['precip_max_intensity [mm/min]']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Examine one or more of these `Series`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tmpc" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Exercise: Read in at least one additional Series object.
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Write your code below\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Tip: Each of these Series objects contains data stored as NumPy arrays. As a result, we can take advantage of broadcasting to perform operations on all array elements without needing to construct a Python for loop.
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convert the temperature and wind speed arrays to Fahrenheit and knots, respectively." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tmpf = tmpc * 1.8 + 32\n", "wspk = wspd * 1.94384" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Examine the new `Series`. Note that every element of the array has been calculated using the arithmetic above ... in just one line of code per `Series`!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tmpf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Note: Notice that the metadata did not change to reflect the change in units! That is something we'd have to change manually, via the Series' name attribute:
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tmpf.name = 'temp_2m [degF]'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tmpf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, use a statistical *method* available to `Series`: in this case, `max`, which returns the maximum value among all elements of the array." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "wspk.max()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Our goal is to make a map of NYSM observations that includes wind velocity. The convention is to plot wind velocity using wind barbs. The **MetPy** library allows us not only to make such a map, but also to perform a variety of meteorologically relevant calculations and diagnostics. Here, we will use one such calculation, which determines the two scalar components of wind velocity (*u* and *v*) from wind speed and direction. We will use MetPy's `wind_components` function.\n", "\n", "This function requires us to do the following:\n", "1. Extract the NumPy arrays from the wind speed and direction `Series` objects via the `values` attribute\n", "2. Attach units to these arrays using MetPy's `units` registry" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can accomplish everything in this single line of code:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "u, v = wind_components(wspk.values * units.knots, drct.values * units.degree)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Take a look at one of the output components:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "u" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Tip: This is a special type of object: not only does it have values (its magnitude), but it also has units attached. This is a very nice property, since it makes calculations that require awareness of units much easier!
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now, let's plot several of the meteorological values on a map. We will use **Matplotlib** and **Cartopy**, as well as **MetPy**'s `StationPlot` class." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plot the map, centered over NYS, and add some geographic features and the mesonet data.\n", "***\n", "### For a meteorological surface station plot, let's take advantage of the `MetPy` package and its `StationPlot` class.\n", "\n", "#### Be patient: this may take a minute or so to plot if you choose the highest resolution for the shapefiles!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Set the domain for defining the plot region.\n", "latN = 45.2\n", "latS = 40.2\n", "lonW = -80.0\n", "lonE = -72.0\n", "cLat = (latN + latS)/2\n", "cLon = (lonW + lonE )/2\n", "\n", "res = '50m'\n", "proj = ccrs.LambertConformal(central_longitude=cLon, central_latitude=cLat)\n", "\n", "fig = plt.figure(figsize=(18,12),dpi=150) # Increase the dots per inch from default 100 to make plot easier to read\n", "ax = fig.add_subplot(1,1,1,projection=proj)\n", "ax.set_extent ([lonW,lonE,latS,latN])\n", "ax.add_feature (cfeature.LAND.with_scale(res))\n", "ax.add_feature (cfeature.OCEAN.with_scale(res))\n", "ax.add_feature(cfeature.COASTLINE.with_scale(res))\n", "ax.add_feature (cfeature.LAKES.with_scale(res))\n", "ax.add_feature (cfeature.STATES.with_scale(res));\n", "\n", "# Create a station plot pointing to an Axes to draw on as well as the location of points\n", "stationplot = StationPlot(ax, lons, lats, transform=ccrs.PlateCarree(),\n", " fontsize=8)\n", "\n", "stationplot.plot_parameter('NW', tmpf, color='red')\n", "stationplot.plot_parameter('SW', rh, color='green')\n", "stationplot.plot_parameter('NE', pres, color='purple')\n", "stationplot.plot_barb(u, v,zorder=2) # zorder value set so wind barbs will display over lake features\n", "ax.set_title 
('Temperature ($^\\circ$F), RH (%), Station Pressure (hPa), Peak 5-min Wind (kts) 0200 UTC 02 September 2021');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What if we wanted to plot sea-level pressure (SLP) instead of station pressure? In this case, we can apply what's called a **reduction to sea-level pressure** formula. This formula requires station elevation (accounting for sensor height) in meters, temperature in Kelvin, and station pressure in hectopascals. We assume that the sensor at each NYSM station sits 0.5 meters above ground level." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Tip: MetPy does not yet have a function that reduces station pressure to SLP, so we will do a units-unaware calculation.
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "elev = nysm_sites['elevation']\n", "sensorHeight = .5\n", "# Reduce station pressure to SLP. Source: https://www.sandhurstweather.org.uk/barometric.pdf \n", "slp = pres/np.exp(-1*(elev+sensorHeight)/((tmpc+273.15) * 29.263))" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "slp" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Make a new map, substituting SLP for station pressure. We will also follow the station-plot convention of showing only the last three digits of SLP, expressed in tenths of a hectopascal." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Examples: 1018.4 hPa would be plotted as 184, while 977.2 hPa would be plotted as 772.
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig = plt.figure(figsize=(18,12),dpi=150)\n", "ax = fig.add_subplot(1,1,1,projection=proj)\n", "ax.set_extent ([lonW,lonE,latS,latN])\n", "ax.add_feature (cfeature.LAND.with_scale(res))\n", "ax.add_feature (cfeature.OCEAN.with_scale(res))\n", "ax.add_feature(cfeature.COASTLINE.with_scale(res))\n", "ax.add_feature (cfeature.LAKES.with_scale(res))\n", "ax.add_feature (cfeature.STATES.with_scale(res));\n", "\n", "# Create a station plot pointing to an Axes to draw on as well as the location of points\n", "stationplot = StationPlot(ax, lons, lats, transform=ccrs.PlateCarree(),\n", " fontsize=8)\n", "\n", "stationplot.plot_parameter('NW', tmpf, color='red')\n", "stationplot.plot_parameter('SW', rh, color='green')\n", "# A more complex example uses a custom formatter to control how the sea-level pressure\n", "# values are plotted. This uses the standard trailing 3-digits of the pressure value\n", "# in tenths of millibars.\n", "stationplot.plot_parameter('NE', slp, color='purple', formatter=lambda v: format(10 * v, '.0f')[-3:])\n", "stationplot.plot_barb(u, v,zorder=10)\n", "ax.set_title ('Temperature ($^\\circ$F), RH (%), SLP (hPa), Peak 5-min Wind (kts) 0200 UTC 02 September 2021');" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " For further thought: Think about how you might calculate and then display dewpoint instead of relative humidity." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Summary\n", "* Multiple `Series` can be constructed from multiple `DataFrames`\n", "* A single mathematical operation can be applied (*broadcast*) to all array elements in a `Series`\n", "* The **MetPy** library provides tools to assign physical units to numerical arrays and perform units-aware calculations\n", "* **MetPy**'s `StationPlot` class offers a customized use of **Matplotlib**'s **Pyplot** library to plot several meteorologically relevant parameters centered about several geo-referenced points.\n", "### What's Next?\n", "In the next notebook, we will explore several ways in Pandas to reference particular rows, columns, and/or row-column pairs in a Pandas `DataFrame`.\n", "## Resources and References\n", "1. [Hurricane Ida](https://en.wikipedia.org/wiki/Hurricane_Ida)\n", "1. [MetPy's *calc* library](https://unidata.github.io/MetPy/latest/api/generated/metpy.calc.html)\n", "1. [MetPy's *units* library](https://unidata.github.io/MetPy/latest/api/generated/metpy.units.html)\n", "1. [Sea-level pressure reduction formula (source: Sandhurst Weather site)](https://www.sandhurstweather.org.uk/barometric.pdf)\n", "1. [MetPy's *StationPlot* class](https://unidata.github.io/MetPy/latest/api/generated/metpy.plots.StationPlot.html#metpy.plots.StationPlot)" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 August 2022 Environment", "language": "python", "name": "aug22" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.5" } }, "nbformat": 4, "nbformat_minor": 4 }