pandas Logo

Pandas 2: NYS Mesonet Map


Overview

In this notebook, we’ll use Cartopy, Matplotlib, and Pandas (with a little help from MetPy) to read in, manipulate, and visualize data from the New York State Mesonet.

We’ll focus on a particular time on the evening of 1 September 2021, when the remnants of Hurricane Ida were impacting the greater New York City region.

Prerequisites

Concepts

Importance

Notes

Matplotlib

Necessary

Cartopy

Necessary

Pandas

Necessary

Intro

MetPy

Helpful

  • Time to learn: 30 minutes


Imports

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from cartopy import crs as ccrs
from cartopy import feature as cfeature
from metpy.calc import wind_components
from metpy.units import units
from metpy.plots import StationPlot
ERROR 1: PROJ: proj_create_from_database: Open of /knight/anaconda_aug22/envs/aug22_env/share/proj failed

Create Pandas DataFrame objects based on two csv files; one contains the site locations, while the other contains the hourly data.

nysm_sites = pd.read_csv('/spare11/atm533/data/nysm_sites.csv')
nysm_data = pd.read_csv('/spare11/atm533/data/nysm_data_2021090202.csv')

Create Series objects for several columns from the two DataFrames.

First, remind ourselves of the column names for these two DataFrames.

nysm_sites.columns
Index(['stid', 'number', 'name', 'lat', 'lon', 'elevation', 'county',
       'nearest_city', 'state', 'distance_from_town [km]',
       'direction_from_town [degrees]', 'climate_division',
       'climate_division_name', 'wfo', 'commissioned', 'decommissioned'],
      dtype='object')
nysm_data.columns
Index(['station', 'time', 'temp_2m [degC]', 'temp_9m [degC]',
       'relative_humidity [percent]', 'precip_incremental [mm]',
       'precip_local [mm]', 'precip_max_intensity [mm/min]',
       'avg_wind_speed_prop [m/s]', 'max_wind_speed_prop [m/s]',
       'wind_speed_stddev_prop [m/s]', 'wind_direction_prop [degrees]',
       'wind_direction_stddev_prop [degrees]', 'avg_wind_speed_sonic [m/s]',
       'max_wind_speed_sonic [m/s]', 'wind_speed_stddev_sonic [m/s]',
       'wind_direction_sonic [degrees]',
       'wind_direction_stddev_sonic [degrees]', 'solar_insolation [W/m^2]',
       'station_pressure [mbar]', 'snow_depth [cm]', 'frozen_soil_05cm [bit]',
       'frozen_soil_25cm [bit]', 'frozen_soil_50cm [bit]',
       'soil_temp_05cm [degC]', 'soil_temp_25cm [degC]',
       'soil_temp_50cm [degC]', 'soil_moisture_05cm [m^3/m^3]',
       'soil_moisture_25cm [m^3/m^3]', 'soil_moisture_50cm [m^3/m^3]'],
      dtype='object')

Create several Series objects for particular columns of interest. Pay attention to which DataFrame provides the particular Series!

stid = nysm_sites['stid']
lats = nysm_sites['lat']
lons = nysm_sites['lon']

time = nysm_data['time']
tmpc = nysm_data['temp_2m [degC]']
rh   = nysm_data['relative_humidity [percent]']
pres = nysm_data['station_pressure [mbar]']
wspd = nysm_data['max_wind_speed_prop [m/s]']
drct = nysm_data['wind_direction_prop [degrees]']
pinc = nysm_data['precip_incremental [mm]']
ptot = nysm_data['precip_local [mm]']
pint = nysm_data['precip_max_intensity [mm/min]']

Examine one or more of these Series.

tmpc
0      13.3
1      14.0
2      14.8
3      16.0
4      14.3
       ... 
121    12.9
122    13.8
123    15.7
124    14.0
125    12.0
Name: temp_2m [degC], Length: 126, dtype: float64
Exercise: Read in at least one additional Series object.
# Write your code below
Tip: Each of these Series objects contain data stored as NumPy arrays. As a result, we can take advantage of broadcasting to perform operations on all array elements without needing to construct a Python for loop.

Convert the temperature and wind speed arrays to Fahrenheit and knots, respectively.

tmpf = tmpc * 1.8 + 32
wspk = wspd * 1.94384

Examine the new Series. Note that every element of the array has been calculated using the arithemtic above … in just one line of code per Series!

tmpf
0      55.94
1      57.20
2      58.64
3      60.80
4      57.74
       ...  
121    55.22
122    56.84
123    60.26
124    57.20
125    53.60
Name: temp_2m [degC], Length: 126, dtype: float64
Note:Notice that the metadata did not change to reflect the change in units! That is something we'd have to change manually, via the Series' name attribute:
tmpf.name = 'temp_2m [degF]'
tmpf
0      55.94
1      57.20
2      58.64
3      60.80
4      57.74
       ...  
121    55.22
122    56.84
123    60.26
124    57.20
125    53.60
Name: temp_2m [degF], Length: 126, dtype: float64

Next, use another statistical method available to Series: in this case, the maximum value among all elements of the array.

wspk.max()
32.07336

Our goal is to make a map of NYSM observations, which includes the wind velocity. The convention is to plot wind velocity using wind barbs. The MetPy library allows us to not only make such a map, but perform a variety of meteorologically-relevant calculations and diagnostics. Here, we will use such a calculation, which will determine the two scalar components of wind velocity (u and v), from wind speed and direction. We will use MetPy’s wind_components method.

This method requires us to do the following:

  1. Extract the Numpy arrays from the windspeed and direction Series objects via the values attribute

  2. Attach units to these arrays using MetPy’s units class

We can accomplish everything in this single line of code:

u, v = wind_components(wspk.values * units.knots, drct.values * units.degree)

Take a look at one of the output components:

u
Magnitude
[3.1076105228416284 0.3525481021992311 3.4248560097185075
-5.339764832363268 -5.7443875305457315 -5.801459022127413 nan
4.771890871857093 -0.5086636259164198 3.206670599922817 -0.0
-8.943260505536115 1.95628967615404 -6.535723866844814
-0.36623781065982225 -2.58818237808924 0.5475465615634355
-3.0096477282717324 0.10847633913822438 1.430751121651063
-1.624319147377007 3.5030611214955707 2.053848538588327
-2.3039181462446576 -0.13567807534037896 2.6040035995945137
2.0846949624373523 -6.152438500957249 2.943330012294412
-1.5344742991722606 -12.061449425530395 3.142792887679791
0.30796285583005983 1.1472303609178371 1.9505078741235846
-0.19676317722848524 -0.13909121148910164 -3.611386593889946
-0.7046869315013287 -0.1729456328115827 -0.5654353993317839
-4.446340314513128 4.4273221703080115 1.9822426613411341 1.6974896801079
-2.1183875580695726 -4.2976395505753064 -0.23628099157326013
-2.3862522460029973 1.98854353833583 0.8813702554980778 nan
0.24989525344261657 -10.911559604906929 -2.991570527088587
0.22729539438463015 3.88768 -4.746393010199996 -0.4232317117882124
-1.8352719639439985 2.5028790288329006 0.34597909211796485
0.12552133719748235 -0.14107723726273746 -0.7304316495395762
2.5700372854170377 0.1854510787863714 -3.7616937572383375
2.546657073896326 5.492188316489269 1.1425604848162005 -4.123507336624976
0.15259908777492748 2.62864736144324 -3.749342827887713 1.813316792355978
-3.263843776770169 -3.9429710421648596 -0.9643456096891315
2.445338779220784 1.5149693471932009 -0.0 -2.3745957351850238
-1.5152156053857055 -17.93519530246251 -0.013567807534037897
-0.5686538317421636 -1.2932705954534187 -4.387894073654309
-1.4338102750221322 -0.0 -0.0 3.7695693288785574 0.8409514320841175
5.622643202458873 0.24415854043988397 -4.303396329247996
1.2528558893455028 -3.9889946124130082 -19.74236355422735
-6.565317801480137 -4.8595999999999995 5.9488125727839956
-7.524115319186156 4.376878534320092 -7.268223091584154
-2.8752579866571444 0.19673320924354865 2.3933967674478036 -0.0
-1.2794711214198677 5.998305908868459 -3.7338291680239974
1.8853181372475778 -22.611606261293712 1.0125638421759993
7.687292003415959 0.5685468107631874 -9.455400101411191 -0.0
0.3366829641784686 2.7883551552467805 -1.0821209622808545
1.5617669452990754 0.30378984630847744 0.581731471037117]
Unitsknot
Tip: This is a unique type of object: not only does it have values, aka magnitude, but it also has Units attached. This is a very nice property, since it will make calculations that require awareness of units much easier!

Now, let’s plot several of the meteorological values on a map. We will use Matplotlib and Cartopy, as well as MetPy’s StationPlot method.

Plot the map, centered over NYS, with add some geographic features, and the mesonet data.


For a meteorological surface station plot, let’s take advantage of the Metpy package, and its StationPlot method.

Be patient: this may take a minute or so to plot, if you chose the highest resolution for the shapefiles!

# Set the domain for defining the plot region.
latN = 45.2
latS = 40.2
lonW = -80.0
lonE = -72.0
cLat = (latN + latS)/2
cLon = (lonW + lonE )/2

res = '50m'
proj = ccrs.LambertConformal(central_longitude=cLon, central_latitude=cLat)

fig = plt.figure(figsize=(18,12),dpi=150) # Increase the dots per inch from default 100 to make plot easier to read
ax = fig.add_subplot(1,1,1,projection=proj)
ax.set_extent ([lonW,lonE,latS,latN])
ax.add_feature (cfeature.LAND.with_scale(res))
ax.add_feature (cfeature.OCEAN.with_scale(res))
ax.add_feature(cfeature.COASTLINE.with_scale(res))
ax.add_feature (cfeature.LAKES.with_scale(res))
ax.add_feature (cfeature.STATES.with_scale(res));

# Create a station plot pointing to an Axes to draw on as well as the location of points
stationplot = StationPlot(ax, lons, lats, transform=ccrs.PlateCarree(),
                          fontsize=8)

stationplot.plot_parameter('NW', tmpf, color='red')
stationplot.plot_parameter('SW', rh, color='green')
stationplot.plot_parameter('NE', pres, color='purple')
stationplot.plot_barb(u, v,zorder=2) # zorder value set so wind barbs will display over lake features
ax.set_title ('Temperature ($^\circ$F), RH (%), Station Pressure (hPa), Peak 5-min Wind (kts) 0200 UTC 02 September 2021');
../../_images/e62d5c07df32dae140a6fa9ad19f88a5a3885edac8c4b59548132f373a26c1ef.png

What if we wanted to plot sea-level pressure (SLP) instead of station pressure? In this case, we can apply what’s called a reduction to sea-level pressure formula. This formula requires station elevation (accounting for sensor height) in meters, temperature in Kelvin, and station pressure in hectopascals. We assume each NYSM station has its sensor height .5 meters above ground level.

Tip: At present, MetPy does not yet have a function that reduces station pressure to SLP, so we will do a units-unaware calculation.
elev = nysm_sites['elevation']
sensorHeight = .5
# Reduce station pressure to SLP. Source: https://www.sandhurstweather.org.uk/barometric.pdf 
slp = pres/np.exp(-1*(elev+sensorHeight)/((tmpc+273.15) * 29.263))
slp
0      1010.929036
1      1007.734534
2      1012.633239
3      1005.404926
4      1008.947468
          ...     
121    1013.030846
122    1011.493213
123    1010.712159
124    1011.445598
125    1012.594216
Length: 126, dtype: float64

Make a new map, substituting SLP for station pressure. We will also use the convention of the three least-significant digits to represent SLP in hectopascals.

Examples: 1018.4 hPa would be plotted as 184, while 977.2 hPa would be plotted as 772.
fig = plt.figure(figsize=(18,12),dpi=150)
ax = fig.add_subplot(1,1,1,projection=proj)
ax.set_extent ([lonW,lonE,latS,latN])
ax.add_feature (cfeature.LAND.with_scale(res))
ax.add_feature (cfeature.OCEAN.with_scale(res))
ax.add_feature(cfeature.COASTLINE.with_scale(res))
ax.add_feature (cfeature.LAKES.with_scale(res))
ax.add_feature (cfeature.STATES.with_scale(res));

# Create a station plot pointing to an Axes to draw on as well as the location of points
stationplot = StationPlot(ax, lons, lats, transform=ccrs.PlateCarree(),
                          fontsize=8)

stationplot.plot_parameter('NW', tmpf, color='red')
stationplot.plot_parameter('SW', rh, color='green')
# A more complex example uses a custom formatter to control how the sea-level pressure
# values are plotted. This uses the standard trailing 3-digits of the pressure value
# in tenths of millibars.
stationplot.plot_parameter('NE', slp, color='purple', formatter=lambda v: format(10 * v, '.0f')[-3:])
stationplot.plot_barb(u, v,zorder=10)
ax.set_title ('Temperature ($^\circ$F), RH (%), SLP (hPa), Peak 5-min Wind (kts) 0200 UTC 02 September 2021');
../../_images/dd2d01b473592d50fdb5eb3d779a81132d631dd65c28fa0a28d6a2dca9826af4.png
For further thought: Think about how you might calculate and then display dewpoint instead of relative humidity.

Summary

  • Multiple Series can be constructed from multiple DataFrames

  • A single mathematical function can be applied (broadcasted) to all array elements in a Series

  • The MetPy library provides methods to assign physical units to numerical arrays and perform units-aware calculations

  • MetPy’s StationPlot method offers a customized use of Matplotlib’s Pyplot library to plot several meteorologically-relevant parameters centered about several geo-referenced points.

What’s Next?

In the next notebook, we will explore several ways in Pandas to reference particular rows, columns, and/or row-column pairs in a Pandas DataFrame.