{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " [Image: Matplotlib, NYSM, and pandas logos]\n", "
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Pandas 3: Plotting NYS Mesonet Observations\n", "---" ] }, { "cell_type": "markdown", "metadata": { "tags": [] }, "source": [ "## Overview\n", "In this notebook, we'll use Pandas to read in and analyze current data from the [New York State Mesonet](https://www2.nysmesonet.org). We will also use Matplotlib to plot the locations of NYSM sites." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Prerequisites\n", "\n", "| Concepts | Importance | Notes |\n", "| --- | --- | --- |\n", "| Matplotlib | Necessary | |\n", "| Pandas | Necessary | |\n", "\n", "* **Time to learn**: 15 minutes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "___\n", "## Imports" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "import matplotlib.pyplot as plt\n", "import numpy as np\n", "import pandas as pd" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a Pandas `DataFrame` object pointing to the latest set of obs." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nysm_data = pd.read_csv('https://www.atmos.albany.edu/products/nysm/nysm_latest.csv')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create `Series` objects for several columns from the `DataFrame`." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, remind ourselves of the column names." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nysm_data.columns" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create several `Series` objects for particular columns of interest. 
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "stid = nysm_data['station']\n", "lats = nysm_data['lat']\n", "lons = nysm_data['lon']\n", "time = nysm_data['time']\n", "tmpc = nysm_data['temp_2m [degC]']\n", "tmpc9 = nysm_data['temp_9m [degC]']\n", "rh = nysm_data['relative_humidity [percent]']\n", "pres = nysm_data['station_pressure [mbar]']\n", "wspd = nysm_data['max_wind_speed_prop [m/s]']\n", "drct = nysm_data['wind_direction_prop [degrees]']\n", "pinc = nysm_data['precip_incremental [mm]']\n", "ptot = nysm_data['precip_local [mm]']\n", "pint = nysm_data['precip_max_intensity [mm/min]']" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "time" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Examine one or more of these `Series`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tmpc" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Exercise: Read in at least one additional Series object.
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Write your code below\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Tip: Each of these Series objects contains data stored as NumPy arrays. As a result, we can take advantage of vectorization, which performs operations on all array elements without needing to construct a Python for loop.
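
As a minimal sketch of what vectorization means here, the block below converts a small, made-up `Series` from Celsius to Kelvin in a single expression and compares the result with an explicit loop. The name `sample_tmpc` and its values are hypothetical and are not taken from the NYSM file.

```python
import numpy as np
import pandas as pd

# Hypothetical 2-m temperatures in degrees Celsius (not real NYSM data)
sample_tmpc = pd.Series([-5.0, 0.0, 12.5, 30.0], name='temp_2m [degC]')

# Vectorized: a single expression operates on every element at once
vectorized = sample_tmpc + 273.15

# Equivalent explicit loop, shown only for comparison
looped = pd.Series([t + 273.15 for t in sample_tmpc], name='temp_2m [degC]')

# The values of the Series are exposed as a NumPy array
values = sample_tmpc.to_numpy()
print(type(values))
print(np.allclose(vectorized, looped))
```

Both approaches give the same numbers; the vectorized form is shorter and avoids a Python-level loop.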
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Convert the temperature and wind speed arrays to Fahrenheit and knots, respectively." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tmpf = tmpc * 1.8 + 32\n", "wspk = wspd * 1.94384" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Examine the new `Series`. Note that every element of the array has been calculated using the arithemtic above ... in just one line of code per `Series`!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tmpf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Note: The metadata did not change to reflect the change in units! That is something we'd have to change manually, via the Series' name attribute.
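
A small sketch of this behavior, using a made-up `Series` (the name `sample_tmpc` and its values are hypothetical): arithmetic carries the original name through unchanged, and `Series.rename` with a scalar argument is one alternative to assigning the `name` attribute directly.

```python
import pandas as pd

# Hypothetical values; the label mirrors the column name used in this notebook
sample_tmpc = pd.Series([0.0, 10.0, 20.0], name='temp_2m [degC]')

# Arithmetic keeps the original name, so the label still says degC
sample_tmpf = sample_tmpc * 1.8 + 32
name_before = sample_tmpf.name
print(name_before)

# rename with a scalar returns a new Series that carries the new name
sample_tmpf = sample_tmpf.rename('temp_2m [degF]')
print(sample_tmpf.name)
```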
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tmpf.name = 'temp_2m [degF]'" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tmpf" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next, get the basic statistical properties of one of the `Series`." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tmpf.describe()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "tmpc9" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "missing = tmpc9.isna()" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nysm_data[missing]" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "nysm_data[missing]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Exercise: Convert the 9-m temperature to Fahrenheit, and examine its corresponding statistical properties. What do you notice in terms of the count? Why is there a difference in counts between the 9 m and 2 m arrays?
" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# Write your code below" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Plot station locations using Matplotlib\n", "We use another plotting method for an `Axes` element ... in this case, a [Scatter](https://matplotlib.org/stable/api/_as_gen/matplotlib.axes.Axes.scatter.html) plot. We pass as arguments into this method the x- and y-arrays, corresponding to longitudes and latitudes, and then set five additional attributes." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "fig = plt.figure(figsize=(12,9))\n", "ax = fig.add_subplot(1,1,1)\n", "ax.set_title ('New York State Mesonet Site Locations')\n", "ax.scatter(lons,lats,s=9,c='r',edgecolor='black',alpha=0.75)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Exercise: Examine the link above to see how the scatter method can be called. Try changing one or more of the argument values we used above, and try different arguments as well.
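
As one possible starting point, the sketch below passes an array to `c` so that each marker is colored by a value, and adds a colorbar. The coordinates and temperatures are randomly generated stand-ins (the names `sample_lons`, `sample_lats`, and `sample_tmpc` are hypothetical), so the plot will not resemble New York State.

```python
import matplotlib.pyplot as plt
import numpy as np

# Hypothetical coordinates and temperatures (not real NYSM data)
rng = np.random.default_rng(0)
sample_lons = rng.uniform(-79.8, -72.0, 50)
sample_lats = rng.uniform(40.5, 45.0, 50)
sample_tmpc = rng.uniform(-10, 25, 50)

fig = plt.figure(figsize=(12, 9))
ax = fig.add_subplot(1, 1, 1)
ax.set_title('Hypothetical Sites Colored by Temperature')

# c takes an array here, so each marker is colored through the colormap
points = ax.scatter(sample_lons, sample_lats, s=40, c=sample_tmpc,
                    cmap='coolwarm', marker='^', edgecolor='black', alpha=0.75)

# The colorbar links marker colors back to the plotted values
fig.colorbar(points, ax=ax, label='Temperature (degC)')
```

Swapping the sample arrays for `lons`, `lats`, and `tmpc` from earlier in the notebook should color the real stations by their 2-m temperature.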
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## What's Next?\n", "\n", "We can discern the outline of New York State! But wouldn't it be nice if we could plot cartographic features, such as physical and/or political borders (e.g., coastlines, national/state/provincial boundaries), as well as *georeference* the data we are plotting? We'll cover that next with the **Cartopy** package!" ] } ], "metadata": { "kernelspec": { "display_name": "Python 3 Jan. 2025 Environment", "language": "python", "name": "jan25" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.12.8" } }, "nbformat": 4, "nbformat_minor": 4 }