Pandas 3: Plotting NYS Mesonet Observations


Overview#

In this notebook, we’ll use Pandas to read in and analyze current data from the New York State Mesonet. We will also use Matplotlib to plot the locations of NYSM sites.

Prerequisites#

Concepts      Importance    Notes
Matplotlib    Necessary
Pandas        Necessary

  • Time to learn: 15 minutes


Imports#

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

Create a Pandas DataFrame object containing the latest set of observations.

nysm_data = pd.read_csv('https://www.atmos.albany.edu/products/nysm/nysm_latest.csv')

Create Series objects for several columns from the DataFrame.

First, remind ourselves of the column names.

nysm_data.columns
Index(['station', 'time', 'temp_2m [degC]', 'temp_9m [degC]',
       'relative_humidity [percent]', 'precip_incremental [mm]',
       'precip_local [mm]', 'precip_max_intensity [mm/min]',
       'avg_wind_speed_prop [m/s]', 'max_wind_speed_prop [m/s]',
       'wind_speed_stddev_prop [m/s]', 'wind_direction_prop [degrees]',
       'wind_direction_stddev_prop [degrees]', 'avg_wind_speed_sonic [m/s]',
       'max_wind_speed_sonic [m/s]', 'wind_speed_stddev_sonic [m/s]',
       'wind_direction_sonic [degrees]',
       'wind_direction_stddev_sonic [degrees]', 'solar_insolation [W/m^2]',
       'station_pressure [mbar]', 'snow_depth [cm]', 'frozen_soil_05cm [bit]',
       'frozen_soil_25cm [bit]', 'frozen_soil_50cm [bit]',
       'soil_temp_05cm [degC]', 'soil_temp_25cm [degC]',
       'soil_temp_50cm [degC]', 'soil_moisture_05cm [m^3/m^3]',
       'soil_moisture_25cm [m^3/m^3]', 'soil_moisture_50cm [m^3/m^3]', 'lat',
       'lon', 'elevation', 'name'],
      dtype='object')
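
With 34 columns, it can help to narrow the list by unit or by name fragment. The sketch below uses a small, made-up DataFrame (the station IDs and values are invented, not real observations) that borrows a few of the column names above. It assumes `DataFrame.filter` with the `like` argument is a convenient way to pick columns whose labels contain a given substring.

```python
import pandas as pd

# Hypothetical stand-in for nysm_data with a few of the real column names;
# the station IDs and values are invented for illustration only.
sample = pd.DataFrame({
    'station': ['AAAA', 'BBBB'],
    'temp_2m [degC]': [11.3, 9.3],
    'temp_9m [degC]': [10.8, 8.5],
    'station_pressure [mbar]': [1001.2, 987.6],
})

# Keep only the columns whose label contains the substring '[degC]'.
temps = sample.filter(like='[degC]')
print(list(temps.columns))

# A list comprehension over .columns gives the matching labels alone.
temp_cols = [col for col in sample.columns if col.startswith('temp_')]
print(temp_cols)
```

Either approach returns the two temperature columns here; on the full dataset, the `[degC]` filter would also pick up the soil temperature columns.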

Create several Series objects for particular columns of interest.

stid = nysm_data['station']
lats = nysm_data['lat']
lons = nysm_data['lon']
time = nysm_data['time']
tmpc = nysm_data['temp_2m [degC]']
tmpc9 = nysm_data['temp_9m [degC]']
rh   = nysm_data['relative_humidity [percent]']
pres = nysm_data['station_pressure [mbar]']
wspd = nysm_data['max_wind_speed_prop [m/s]']
drct = nysm_data['wind_direction_prop [degrees]']
pinc = nysm_data['precip_incremental [mm]']
ptot = nysm_data['precip_local [mm]']
pint = nysm_data['precip_max_intensity [mm/min]']
time
0      2025-03-13 18:40:00
1      2025-03-13 18:40:00
2      2025-03-13 18:40:00
3      2025-03-13 18:40:00
4      2025-03-13 18:40:00
              ...         
122    2025-03-13 18:40:00
123    2025-03-13 18:40:00
124    2025-03-13 18:40:00
125    2025-03-13 18:40:00
126    2025-03-13 18:40:00
Name: time, Length: 127, dtype: object
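
The `dtype: object` in the output above suggests that the times were read in as plain strings rather than as datetime values. As a minimal sketch, assuming timestamps in the same layout as above (the two values below are made up), `pd.to_datetime` converts them so that date arithmetic and the `.dt` accessor become available.

```python
import pandas as pd

# Hypothetical timestamps in the same format as the 'time' column above.
time_str = pd.Series(['2025-03-13 18:40:00', '2025-03-13 18:45:00'], name='time')

# Convert the strings to datetime values; the format string is optional
# but makes the expected layout explicit.
time_dt = pd.to_datetime(time_str, format='%Y-%m-%d %H:%M:%S')
print(time_dt.dtype)

# Datetime Series expose components through the .dt accessor.
print(time_dt.dt.hour.tolist())

# Subtracting two timestamps yields a Timedelta.
print(time_dt.max() - time_dt.min())
```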

Examine one or more of these Series.

tmpc
0      11.3
1       9.3
2      11.6
3      11.2
4      10.3
       ... 
122     5.9
123     8.1
124     9.2
125    12.9
126    13.3
Name: temp_2m [degC], Length: 127, dtype: float64
Exercise: Read in at least one additional Series object.
# Write your code below
Tip: Each of these Series objects contains data stored as a NumPy array. As a result, we can take advantage of vectorization, which performs an operation on all array elements without needing to construct a Python for loop.
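
To make the tip concrete, here is a small sketch that compares an explicit loop with the vectorized form. The Series names and values are hypothetical stand-ins for the Mesonet columns. It also shows where a m/s-to-knots factor of roughly 1.94384 comes from.

```python
import pandas as pd

# Hypothetical 2-m temperatures (degC) and wind speeds (m/s).
temps_c = pd.Series([11.3, 9.3, 11.6], name='temp_2m [degC]')
winds_ms = pd.Series([5.5, 3.7, 2.0], name='max_wind_speed_prop [m/s]')

# Explicit loop: convert one element at a time.
temps_f_loop = []
for t in temps_c:
    temps_f_loop.append(t * 1.8 + 32)

# Vectorized: one expression applied to every element at once.
temps_f_vec = temps_c * 1.8 + 32

# One knot is one nautical mile (1852 m) per hour (3600 s),
# so 1 m/s = 3600 / 1852 kt, or about 1.94384 kt.
MS_TO_KT = 3600 / 1852
winds_kt = winds_ms * MS_TO_KT

print(temps_f_vec.round(2).tolist())
print(winds_kt.round(1).tolist())
```

Both approaches give the same temperatures; the vectorized form is shorter and, on large arrays, typically much faster.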

Convert the temperature and wind speed arrays to Fahrenheit and knots, respectively.

tmpf = tmpc * 1.8 + 32
wspk = wspd * 1.94384

Examine the new Series. Note that every element of the array has been calculated using the arithmetic above … in just one line of code per Series!

tmpf
0      52.34
1      48.74
2      52.88
3      52.16
4      50.54
       ...  
122    42.62
123    46.58
124    48.56
125    55.22
126    55.94
Name: temp_2m [degC], Length: 127, dtype: float64
Note: The metadata did not change to reflect the change in units! That is something we'd have to change manually, via the Series' name attribute.
tmpf.name = 'temp_2m [degF]'
tmpf
0      52.34
1      48.74
2      52.88
3      52.16
4      50.54
       ...  
122    42.62
123    46.58
124    48.56
125    55.22
126    55.94
Name: temp_2m [degF], Length: 127, dtype: float64
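
As an alternative to assigning the name attribute afterwards, the conversion and the relabeling can be done in a single expression. This is only a sketch on made-up values; `temps_c` is a hypothetical stand-in for `tmpc`.

```python
import pandas as pd

# Hypothetical Celsius values standing in for tmpc.
temps_c = pd.Series([11.3, 9.3, 11.6], name='temp_2m [degC]')

# Arithmetic keeps the original name, so rename the result in the same step.
temps_f = (temps_c * 1.8 + 32).rename('temp_2m [degF]')

print(temps_f.name)   # the converted Series carries the new label
print(temps_c.name)   # the original Series keeps its own label
```

Because `rename` returns a new Series, the original Celsius Series keeps its label.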

Next, get the basic statistical properties of one of the Series.

tmpf.describe()
count    125.000000
mean      49.336160
std        4.202479
min       37.760000
25%       45.860000
50%       49.640000
75%       52.160000
max       59.180000
Name: temp_2m [degF], dtype: float64
tmpc9
0      10.8
1       8.5
2      10.9
3      10.1
4       9.6
       ... 
122     6.2
123     8.2
124     8.8
125    12.2
126    12.1
Name: temp_9m [degC], Length: 127, dtype: float64
missing = tmpc9.isna()
nysm_data[missing]
station time temp_2m [degC] temp_9m [degC] relative_humidity [percent] precip_incremental [mm] precip_local [mm] precip_max_intensity [mm/min] avg_wind_speed_prop [m/s] max_wind_speed_prop [m/s] ... soil_temp_05cm [degC] soil_temp_25cm [degC] soil_temp_50cm [degC] soil_moisture_05cm [m^3/m^3] soil_moisture_25cm [m^3/m^3] soil_moisture_50cm [m^3/m^3] lat lon elevation name
38 EAUR 2025-03-13 18:40:00 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN 42.713490 -78.631730 377.0380 East Aurora
66 MANH 2025-03-13 18:40:00 7.2 NaN 58.0 0.0 0.0 0.0 2.6 5.5 ... NaN NaN NaN NaN NaN NaN 40.767544 -73.964482 94.7928 Manhattan
71 NEWC 2025-03-13 18:40:00 NaN NaN NaN NaN NaN NaN NaN NaN ... NaN NaN NaN NaN NaN NaN 43.973290 -74.222730 501.3960 Newcomb
85 QUEE 2025-03-13 18:40:00 8.8 NaN 53.0 0.0 0.0 0.0 2.0 3.7 ... NaN NaN NaN NaN NaN NaN 40.734335 -73.815856 54.5592 Queens

4 rows × 34 columns
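
The `isna` call above returns a boolean Series, and indexing the DataFrame with it keeps only the rows marked True. A minimal sketch on a made-up miniature of the dataset (station IDs and values are invented) shows the same pattern, plus `.loc` to limit which columns are displayed and `~` to invert the mask.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of nysm_data; NaN marks a missing observation.
sample = pd.DataFrame({
    'station': ['AAAA', 'BBBB', 'CCCC'],
    'temp_2m [degC]': [11.3, 7.2, np.nan],
    'temp_9m [degC]': [10.8, np.nan, np.nan],
})

# Boolean mask: True where the 9-m temperature is missing.
missing_9m = sample['temp_9m [degC]'].isna()
print(missing_9m.sum())          # number of True values in the mask

# .loc takes the row mask and a list of columns to show.
subset = sample.loc[missing_9m, ['station', 'temp_2m [degC]']]
print(subset)

# The ~ operator inverts the mask to keep the rows with valid data.
valid = sample[~missing_9m]
print(valid['station'].tolist())
```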

Exercise: Convert the 9-m temperature to Fahrenheit, and examine its corresponding statistical properties. What do you notice in terms of the count? Why is there a difference in counts between the 9 m and 2 m arrays?
# Write your code below

Plot station locations using Matplotlib#

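We use another plotting method of the Axes object … in this case, scatter, which draws a scatter plot. We pass the x- and y-arrays, corresponding to longitudes and latitudes, as the first two arguments, and then set four additional keyword arguments: marker size (s), marker color (c), marker edge color (edgecolor), and transparency (alpha).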

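# Create a 12 x 9 inch figure containing a single Axes
fig = plt.figure(figsize=(12, 9))
ax = fig.add_subplot(1, 1, 1)
ax.set_title('New York State Mesonet Site Locations')
# Plot one small, semi-transparent red marker with a black edge per site
ax.scatter(lons, lats, s=9, c='r', edgecolor='black', alpha=0.75)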
<matplotlib.collections.PathCollection at 0x14da63cd94f0>
[Figure: scatter plot of New York State Mesonet site locations, longitude on the x-axis and latitude on the y-axis]
Exercise: Examine the Matplotlib documentation for the scatter method. Try changing one or more of the four keyword argument values we used above, and try different arguments as well.
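
As one possible direction for this exercise, the sketch below passes an array of values to `c` so that each marker is colored by temperature, and it changes the marker shape. The coordinates and temperatures are made up, and the variable names (`lons_demo`, `lats_demo`, `temps_demo`) are hypothetical; with the real data you might pass `lons`, `lats`, and `tmpc` instead.

```python
import matplotlib
matplotlib.use('Agg')  # non-interactive backend; omit this line in a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical site coordinates and temperatures (not real NYSM data).
lons_demo = pd.Series([-78.6, -74.0, -74.2, -73.8])
lats_demo = pd.Series([42.7, 40.8, 44.0, 40.7])
temps_demo = pd.Series([11.3, 7.2, 5.9, 8.8])

fig, ax = plt.subplots(figsize=(8, 6))
ax.set_title('Demo: sites colored by temperature')

# Passing an array to c maps each value to a color from the colormap.
points = ax.scatter(lons_demo, lats_demo, s=40, c=temps_demo,
                    cmap='coolwarm', marker='^', edgecolor='black')

# The colorbar documents the value-to-color mapping.
fig.colorbar(points, ax=ax, label='Temperature (degC)')
ax.set_xlabel('Longitude (degrees)')
ax.set_ylabel('Latitude (degrees)')
```

Note that a missing temperature would leave that site without a mapped color, so it may be worth filtering out NaN values before plotting the real observations.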

What’s Next?#

We can discern the outline of New York State! But wouldn’t it be nice if we could plot cartographic features, such as physical and/or political borders (e.g., coastlines, national/state/provincial boundaries), as well as georeference the data we are plotting? We’ll cover that next with the Cartopy package!