{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "
\"pandas
\n", "\n", "# Pandas 8: Multi-index DataFrames\n", "---" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### In many instances, tabular data may best be represented with more than a single row index. For the case of NYSM hourly data files, which have 126 stations each with 13 sets of 5-minute obs, we definitely want to take advantage of using Multi-index Dataframes." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Overview\n", "1. Open an hour's worth of NYSM data using a single row index dataframe\n", "1. Review selection and conditional selection methodology\n", "1. Create a dataframe with station ID and date/time as the two row indices\n", "1. Use selection and conditions on a multi-index dataframe\n", "1. Briefly define tuples as opposed to lists in Python\n", "1. Work with the time index as a Datetime object\n", "\n", "## Prerequisites\n", "\n", "| Concepts | Importance | Notes |\n", "| --- | --- | --- |\n", "| Pandas notebooks 1-8 | Necessary | |\n", "\n", "* **Time to learn**: 15 minutes" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Imports" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd\n", "from datetime import datetime " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Open an hour's worth of NYSM data using a single row index dataframe" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "nysm_data_file = '/spare11/atm533/data/nysm_data_2021090202a.csv'" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
stationtimetemp_2m [degC]temp_9m [degC]relative_humidity [percent]precip_incremental [mm]precip_local [mm]precip_max_intensity [mm/min]avg_wind_speed_prop [m/s]max_wind_speed_prop [m/s]...snow_depth [cm]frozen_soil_05cm [bit]frozen_soil_25cm [bit]frozen_soil_50cm [bit]soil_temp_05cm [degC]soil_temp_25cm [degC]soil_temp_50cm [degC]soil_moisture_05cm [m^3/m^3]soil_moisture_25cm [m^3/m^3]soil_moisture_50cm [m^3/m^3]
0ADDI2021-09-02 02:00:00 UTC13.313.492.50.00.00.02.65.8...NaN0.00.00.019.920.319.90.520.440.44
1ADDI2021-09-02 02:05:00 UTC13.213.492.80.00.00.02.74.3...NaN0.00.00.019.920.319.90.520.440.44
2ADDI2021-09-02 02:10:00 UTC13.113.392.80.00.00.02.13.4...NaN0.00.00.019.920.319.90.520.440.44
\n", "

3 rows × 30 columns

\n", "
" ], "text/plain": [ " station time temp_2m [degC] temp_9m [degC] \\\n", "0 ADDI 2021-09-02 02:00:00 UTC 13.3 13.4 \n", "1 ADDI 2021-09-02 02:05:00 UTC 13.2 13.4 \n", "2 ADDI 2021-09-02 02:10:00 UTC 13.1 13.3 \n", "\n", " relative_humidity [percent] precip_incremental [mm] precip_local [mm] \\\n", "0 92.5 0.0 0.0 \n", "1 92.8 0.0 0.0 \n", "2 92.8 0.0 0.0 \n", "\n", " precip_max_intensity [mm/min] avg_wind_speed_prop [m/s] \\\n", "0 0.0 2.6 \n", "1 0.0 2.7 \n", "2 0.0 2.1 \n", "\n", " max_wind_speed_prop [m/s] ... snow_depth [cm] frozen_soil_05cm [bit] \\\n", "0 5.8 ... NaN 0.0 \n", "1 4.3 ... NaN 0.0 \n", "2 3.4 ... NaN 0.0 \n", "\n", " frozen_soil_25cm [bit] frozen_soil_50cm [bit] soil_temp_05cm [degC] \\\n", "0 0.0 0.0 19.9 \n", "1 0.0 0.0 19.9 \n", "2 0.0 0.0 19.9 \n", "\n", " soil_temp_25cm [degC] soil_temp_50cm [degC] soil_moisture_05cm [m^3/m^3] \\\n", "0 20.3 19.9 0.52 \n", "1 20.3 19.9 0.52 \n", "2 20.3 19.9 0.52 \n", "\n", " soil_moisture_25cm [m^3/m^3] soil_moisture_50cm [m^3/m^3] \n", "0 0.44 0.44 \n", "1 0.44 0.44 \n", "2 0.44 0.44 \n", "\n", "[3 rows x 30 columns]" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df = pd.read_csv(nysm_data_file)\n", "# Look at the first few rows\n", "df.head(3)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Remind ourselves that the default `DataFrame`'s index is a special type of Python object, called a *RangeIndex*." ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "RangeIndex(start=0, stop=1512, step=1)" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.index" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Review selection and conditional selection methodology" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Here we have multiple stations and multiple times. We can select rows/columns, set conditions, and make further selections based on those conditions as we did in the **03_Pandas_IndexSubsetsConditonals** notebook. " ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
stationtimetemp_2m [degC]temp_9m [degC]relative_humidity [percent]precip_incremental [mm]precip_local [mm]precip_max_intensity [mm/min]avg_wind_speed_prop [m/s]max_wind_speed_prop [m/s]...snow_depth [cm]frozen_soil_05cm [bit]frozen_soil_25cm [bit]frozen_soil_50cm [bit]soil_temp_05cm [degC]soil_temp_25cm [degC]soil_temp_50cm [degC]soil_moisture_05cm [m^3/m^3]soil_moisture_25cm [m^3/m^3]soil_moisture_50cm [m^3/m^3]
1332VOOR2021-09-02 02:00:00 UTC14.914.998.90.2611.670.002.23.6...NaN0.00.00.020.221.121.10.250.330.19
1333VOOR2021-09-02 02:05:00 UTC14.814.899.00.2711.940.003.04.6...NaN0.00.00.020.121.121.10.250.330.19
1334VOOR2021-09-02 02:10:00 UTC14.714.798.90.3912.330.113.25.1...NaN0.00.00.020.121.121.10.250.330.19
1335VOOR2021-09-02 02:15:00 UTC14.614.698.90.8013.130.183.65.6...NaN0.00.00.020.121.121.10.260.330.19
1336VOOR2021-09-02 02:20:00 UTC14.614.698.90.8814.010.182.96.4...NaN0.00.00.020.121.121.10.260.330.19
1337VOOR2021-09-02 02:25:00 UTC14.514.598.90.6614.670.152.95.8...NaN0.00.00.020.121.121.10.270.330.19
1338VOOR2021-09-02 02:30:00 UTC14.514.599.00.4515.120.002.54.6...NaN0.00.00.020.121.121.10.270.340.19
1339VOOR2021-09-02 02:35:00 UTC14.414.599.00.5515.670.132.63.8...NaN0.00.00.020.021.121.10.280.340.19
1340VOOR2021-09-02 02:40:00 UTC14.414.499.00.9916.660.252.24.3...NaN0.00.00.019.921.121.10.280.340.19
1341VOOR2021-09-02 02:45:00 UTC14.314.498.91.5118.170.391.73.5...NaN0.00.00.019.921.121.10.290.340.19
1342VOOR2021-09-02 02:50:00 UTC14.214.398.81.3019.470.311.53.4...NaN0.00.00.019.920.921.10.290.350.19
1343VOOR2021-09-02 02:55:00 UTC14.214.398.40.7720.240.192.13.7...NaN0.00.00.019.920.921.10.290.350.19
\n", "

12 rows × 30 columns

\n", "
" ], "text/plain": [ " station time temp_2m [degC] temp_9m [degC] \\\n", "1332 VOOR 2021-09-02 02:00:00 UTC 14.9 14.9 \n", "1333 VOOR 2021-09-02 02:05:00 UTC 14.8 14.8 \n", "1334 VOOR 2021-09-02 02:10:00 UTC 14.7 14.7 \n", "1335 VOOR 2021-09-02 02:15:00 UTC 14.6 14.6 \n", "1336 VOOR 2021-09-02 02:20:00 UTC 14.6 14.6 \n", "1337 VOOR 2021-09-02 02:25:00 UTC 14.5 14.5 \n", "1338 VOOR 2021-09-02 02:30:00 UTC 14.5 14.5 \n", "1339 VOOR 2021-09-02 02:35:00 UTC 14.4 14.5 \n", "1340 VOOR 2021-09-02 02:40:00 UTC 14.4 14.4 \n", "1341 VOOR 2021-09-02 02:45:00 UTC 14.3 14.4 \n", "1342 VOOR 2021-09-02 02:50:00 UTC 14.2 14.3 \n", "1343 VOOR 2021-09-02 02:55:00 UTC 14.2 14.3 \n", "\n", " relative_humidity [percent] precip_incremental [mm] precip_local [mm] \\\n", "1332 98.9 0.26 11.67 \n", "1333 99.0 0.27 11.94 \n", "1334 98.9 0.39 12.33 \n", "1335 98.9 0.80 13.13 \n", "1336 98.9 0.88 14.01 \n", "1337 98.9 0.66 14.67 \n", "1338 99.0 0.45 15.12 \n", "1339 99.0 0.55 15.67 \n", "1340 99.0 0.99 16.66 \n", "1341 98.9 1.51 18.17 \n", "1342 98.8 1.30 19.47 \n", "1343 98.4 0.77 20.24 \n", "\n", " precip_max_intensity [mm/min] avg_wind_speed_prop [m/s] \\\n", "1332 0.00 2.2 \n", "1333 0.00 3.0 \n", "1334 0.11 3.2 \n", "1335 0.18 3.6 \n", "1336 0.18 2.9 \n", "1337 0.15 2.9 \n", "1338 0.00 2.5 \n", "1339 0.13 2.6 \n", "1340 0.25 2.2 \n", "1341 0.39 1.7 \n", "1342 0.31 1.5 \n", "1343 0.19 2.1 \n", "\n", " max_wind_speed_prop [m/s] ... snow_depth [cm] frozen_soil_05cm [bit] \\\n", "1332 3.6 ... NaN 0.0 \n", "1333 4.6 ... NaN 0.0 \n", "1334 5.1 ... NaN 0.0 \n", "1335 5.6 ... NaN 0.0 \n", "1336 6.4 ... NaN 0.0 \n", "1337 5.8 ... NaN 0.0 \n", "1338 4.6 ... NaN 0.0 \n", "1339 3.8 ... NaN 0.0 \n", "1340 4.3 ... NaN 0.0 \n", "1341 3.5 ... NaN 0.0 \n", "1342 3.4 ... NaN 0.0 \n", "1343 3.7 ... NaN 0.0 \n", "\n", " frozen_soil_25cm [bit] frozen_soil_50cm [bit] soil_temp_05cm [degC] \\\n", "1332 0.0 0.0 20.2 \n", "1333 0.0 0.0 20.1 \n", "1334 0.0 0.0 20.1 \n", "1335 0.0 0.0 20.1 \n", "1336 0.0 0.0 20.1 \n", "1337 0.0 0.0 20.1 \n", "1338 0.0 0.0 20.1 \n", "1339 0.0 0.0 20.0 \n", "1340 0.0 0.0 19.9 \n", "1341 0.0 0.0 19.9 \n", "1342 0.0 0.0 19.9 \n", "1343 0.0 0.0 19.9 \n", "\n", " soil_temp_25cm [degC] soil_temp_50cm [degC] \\\n", "1332 21.1 21.1 \n", "1333 21.1 21.1 \n", "1334 21.1 21.1 \n", "1335 21.1 21.1 \n", "1336 21.1 21.1 \n", "1337 21.1 21.1 \n", "1338 21.1 21.1 \n", "1339 21.1 21.1 \n", "1340 21.1 21.1 \n", "1341 21.1 21.1 \n", "1342 20.9 21.1 \n", "1343 20.9 21.1 \n", "\n", " soil_moisture_05cm [m^3/m^3] soil_moisture_25cm [m^3/m^3] \\\n", "1332 0.25 0.33 \n", "1333 0.25 0.33 \n", "1334 0.25 0.33 \n", "1335 0.26 0.33 \n", "1336 0.26 0.33 \n", "1337 0.27 0.33 \n", "1338 0.27 0.34 \n", "1339 0.28 0.34 \n", "1340 0.28 0.34 \n", "1341 0.29 0.34 \n", "1342 0.29 0.35 \n", "1343 0.29 0.35 \n", "\n", " soil_moisture_50cm [m^3/m^3] \n", "1332 0.19 \n", "1333 0.19 \n", "1334 0.19 \n", "1335 0.19 \n", "1336 0.19 \n", "1337 0.19 \n", "1338 0.19 \n", "1339 0.19 \n", "1340 0.19 \n", "1341 0.19 \n", "1342 0.19 \n", "1343 0.19 \n", "\n", "[12 rows x 30 columns]" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df[df['station']=='VOOR'] # Voorheesville, NY" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Create a second dataframe that first contains just VOOR rows, then further subset by choosing one time." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
stationtimetemp_2m [degC]temp_9m [degC]relative_humidity [percent]precip_incremental [mm]precip_local [mm]precip_max_intensity [mm/min]avg_wind_speed_prop [m/s]max_wind_speed_prop [m/s]...snow_depth [cm]frozen_soil_05cm [bit]frozen_soil_25cm [bit]frozen_soil_50cm [bit]soil_temp_05cm [degC]soil_temp_25cm [degC]soil_temp_50cm [degC]soil_moisture_05cm [m^3/m^3]soil_moisture_25cm [m^3/m^3]soil_moisture_50cm [m^3/m^3]
1342VOOR2021-09-02 02:50:00 UTC14.214.398.81.319.470.311.53.4...NaN0.00.00.019.920.921.10.290.350.19
\n", "

1 rows × 30 columns

\n", "
" ], "text/plain": [ " station time temp_2m [degC] temp_9m [degC] \\\n", "1342 VOOR 2021-09-02 02:50:00 UTC 14.2 14.3 \n", "\n", " relative_humidity [percent] precip_incremental [mm] precip_local [mm] \\\n", "1342 98.8 1.3 19.47 \n", "\n", " precip_max_intensity [mm/min] avg_wind_speed_prop [m/s] \\\n", "1342 0.31 1.5 \n", "\n", " max_wind_speed_prop [m/s] ... snow_depth [cm] frozen_soil_05cm [bit] \\\n", "1342 3.4 ... NaN 0.0 \n", "\n", " frozen_soil_25cm [bit] frozen_soil_50cm [bit] soil_temp_05cm [degC] \\\n", "1342 0.0 0.0 19.9 \n", "\n", " soil_temp_25cm [degC] soil_temp_50cm [degC] \\\n", "1342 20.9 21.1 \n", "\n", " soil_moisture_05cm [m^3/m^3] soil_moisture_25cm [m^3/m^3] \\\n", "1342 0.29 0.35 \n", "\n", " soil_moisture_50cm [m^3/m^3] \n", "1342 0.19 \n", "\n", "[1 rows x 30 columns]" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df2 = df[df['station']=='VOOR']\n", "df2[df2['time'] == '2021-09-02 02:50:00 UTC'] # we haven't recast this as a Datetime object ... it's just a string" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, there is a better way! Since our DataFrame consists of one station after another, each associated with a number of discrete times, let's create a **multi-indexed** DataFrame, that has *station* as its outer row index and *time* as its inner." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Create a dataframe with station ID and date/time as the two row indices" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "df.set_index(['station', 'time'], inplace = True)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Note: inplace means that the df object gets re-created with its new indexes. It's convenient, but if at any point we wish to re-run cells beyond this point, it is better to start from the beginning to ensure that the df object conforms to whatever code cell operates on it.
" ] }, { "cell_type": "code", "execution_count": 8, "metadata": { "scrolled": true }, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
temp_2m [degC]temp_9m [degC]relative_humidity [percent]precip_incremental [mm]precip_local [mm]precip_max_intensity [mm/min]avg_wind_speed_prop [m/s]max_wind_speed_prop [m/s]wind_speed_stddev_prop [m/s]wind_direction_prop [degrees]...snow_depth [cm]frozen_soil_05cm [bit]frozen_soil_25cm [bit]frozen_soil_50cm [bit]soil_temp_05cm [degC]soil_temp_25cm [degC]soil_temp_50cm [degC]soil_moisture_05cm [m^3/m^3]soil_moisture_25cm [m^3/m^3]soil_moisture_50cm [m^3/m^3]
stationtime
ADDI2021-09-02 02:00:00 UTC13.313.492.50.00.00.02.65.80.9344.0...NaN0.00.00.019.920.319.90.520.440.44
2021-09-02 02:05:00 UTC13.213.492.80.00.00.02.74.30.6340.0...NaN0.00.00.019.920.319.90.520.440.44
2021-09-02 02:10:00 UTC13.113.392.80.00.00.02.13.40.5336.0...NaN0.00.00.019.920.319.90.520.440.44
2021-09-02 02:15:00 UTC13.013.292.90.00.00.02.44.10.6342.0...NaN0.00.00.019.920.319.90.520.440.44
2021-09-02 02:20:00 UTC12.913.193.00.00.00.02.34.60.9338.0...NaN0.00.00.019.920.319.90.520.440.44
.....................................................................
YORK2021-09-02 02:35:00 UTC11.712.897.10.00.00.00.20.60.2136.0...NaN0.00.00.020.321.721.90.130.240.24
2021-09-02 02:40:00 UTC11.612.597.00.00.00.00.60.80.1131.0...NaN0.00.00.020.321.721.90.130.240.24
2021-09-02 02:45:00 UTC11.612.597.20.00.00.00.10.70.2165.0...NaN0.00.00.020.321.721.90.130.240.24
2021-09-02 02:50:00 UTC11.512.797.30.00.00.00.70.80.1193.0...NaN0.00.00.020.221.721.90.130.240.24
2021-09-02 02:55:00 UTC11.512.497.40.00.00.00.50.70.1168.0...NaN0.00.00.020.121.721.90.130.240.24
\n", "

1512 rows × 28 columns

\n", "
" ], "text/plain": [ " temp_2m [degC] temp_9m [degC] \\\n", "station time \n", "ADDI 2021-09-02 02:00:00 UTC 13.3 13.4 \n", " 2021-09-02 02:05:00 UTC 13.2 13.4 \n", " 2021-09-02 02:10:00 UTC 13.1 13.3 \n", " 2021-09-02 02:15:00 UTC 13.0 13.2 \n", " 2021-09-02 02:20:00 UTC 12.9 13.1 \n", "... ... ... \n", "YORK 2021-09-02 02:35:00 UTC 11.7 12.8 \n", " 2021-09-02 02:40:00 UTC 11.6 12.5 \n", " 2021-09-02 02:45:00 UTC 11.6 12.5 \n", " 2021-09-02 02:50:00 UTC 11.5 12.7 \n", " 2021-09-02 02:55:00 UTC 11.5 12.4 \n", "\n", " relative_humidity [percent] \\\n", "station time \n", "ADDI 2021-09-02 02:00:00 UTC 92.5 \n", " 2021-09-02 02:05:00 UTC 92.8 \n", " 2021-09-02 02:10:00 UTC 92.8 \n", " 2021-09-02 02:15:00 UTC 92.9 \n", " 2021-09-02 02:20:00 UTC 93.0 \n", "... ... \n", "YORK 2021-09-02 02:35:00 UTC 97.1 \n", " 2021-09-02 02:40:00 UTC 97.0 \n", " 2021-09-02 02:45:00 UTC 97.2 \n", " 2021-09-02 02:50:00 UTC 97.3 \n", " 2021-09-02 02:55:00 UTC 97.4 \n", "\n", " precip_incremental [mm] precip_local [mm] \\\n", "station time \n", "ADDI 2021-09-02 02:00:00 UTC 0.0 0.0 \n", " 2021-09-02 02:05:00 UTC 0.0 0.0 \n", " 2021-09-02 02:10:00 UTC 0.0 0.0 \n", " 2021-09-02 02:15:00 UTC 0.0 0.0 \n", " 2021-09-02 02:20:00 UTC 0.0 0.0 \n", "... ... ... \n", "YORK 2021-09-02 02:35:00 UTC 0.0 0.0 \n", " 2021-09-02 02:40:00 UTC 0.0 0.0 \n", " 2021-09-02 02:45:00 UTC 0.0 0.0 \n", " 2021-09-02 02:50:00 UTC 0.0 0.0 \n", " 2021-09-02 02:55:00 UTC 0.0 0.0 \n", "\n", " precip_max_intensity [mm/min] \\\n", "station time \n", "ADDI 2021-09-02 02:00:00 UTC 0.0 \n", " 2021-09-02 02:05:00 UTC 0.0 \n", " 2021-09-02 02:10:00 UTC 0.0 \n", " 2021-09-02 02:15:00 UTC 0.0 \n", " 2021-09-02 02:20:00 UTC 0.0 \n", "... ... \n", "YORK 2021-09-02 02:35:00 UTC 0.0 \n", " 2021-09-02 02:40:00 UTC 0.0 \n", " 2021-09-02 02:45:00 UTC 0.0 \n", " 2021-09-02 02:50:00 UTC 0.0 \n", " 2021-09-02 02:55:00 UTC 0.0 \n", "\n", " avg_wind_speed_prop [m/s] \\\n", "station time \n", "ADDI 2021-09-02 02:00:00 UTC 2.6 \n", " 2021-09-02 02:05:00 UTC 2.7 \n", " 2021-09-02 02:10:00 UTC 2.1 \n", " 2021-09-02 02:15:00 UTC 2.4 \n", " 2021-09-02 02:20:00 UTC 2.3 \n", "... ... \n", "YORK 2021-09-02 02:35:00 UTC 0.2 \n", " 2021-09-02 02:40:00 UTC 0.6 \n", " 2021-09-02 02:45:00 UTC 0.1 \n", " 2021-09-02 02:50:00 UTC 0.7 \n", " 2021-09-02 02:55:00 UTC 0.5 \n", "\n", " max_wind_speed_prop [m/s] \\\n", "station time \n", "ADDI 2021-09-02 02:00:00 UTC 5.8 \n", " 2021-09-02 02:05:00 UTC 4.3 \n", " 2021-09-02 02:10:00 UTC 3.4 \n", " 2021-09-02 02:15:00 UTC 4.1 \n", " 2021-09-02 02:20:00 UTC 4.6 \n", "... ... \n", "YORK 2021-09-02 02:35:00 UTC 0.6 \n", " 2021-09-02 02:40:00 UTC 0.8 \n", " 2021-09-02 02:45:00 UTC 0.7 \n", " 2021-09-02 02:50:00 UTC 0.8 \n", " 2021-09-02 02:55:00 UTC 0.7 \n", "\n", " wind_speed_stddev_prop [m/s] \\\n", "station time \n", "ADDI 2021-09-02 02:00:00 UTC 0.9 \n", " 2021-09-02 02:05:00 UTC 0.6 \n", " 2021-09-02 02:10:00 UTC 0.5 \n", " 2021-09-02 02:15:00 UTC 0.6 \n", " 2021-09-02 02:20:00 UTC 0.9 \n", "... ... \n", "YORK 2021-09-02 02:35:00 UTC 0.2 \n", " 2021-09-02 02:40:00 UTC 0.1 \n", " 2021-09-02 02:45:00 UTC 0.2 \n", " 2021-09-02 02:50:00 UTC 0.1 \n", " 2021-09-02 02:55:00 UTC 0.1 \n", "\n", " wind_direction_prop [degrees] ... \\\n", "station time ... \n", "ADDI 2021-09-02 02:00:00 UTC 344.0 ... \n", " 2021-09-02 02:05:00 UTC 340.0 ... \n", " 2021-09-02 02:10:00 UTC 336.0 ... \n", " 2021-09-02 02:15:00 UTC 342.0 ... \n", " 2021-09-02 02:20:00 UTC 338.0 ... \n", "... ... ... \n", "YORK 2021-09-02 02:35:00 UTC 136.0 ... \n", " 2021-09-02 02:40:00 UTC 131.0 ... \n", " 2021-09-02 02:45:00 UTC 165.0 ... \n", " 2021-09-02 02:50:00 UTC 193.0 ... \n", " 2021-09-02 02:55:00 UTC 168.0 ... \n", "\n", " snow_depth [cm] frozen_soil_05cm [bit] \\\n", "station time \n", "ADDI 2021-09-02 02:00:00 UTC NaN 0.0 \n", " 2021-09-02 02:05:00 UTC NaN 0.0 \n", " 2021-09-02 02:10:00 UTC NaN 0.0 \n", " 2021-09-02 02:15:00 UTC NaN 0.0 \n", " 2021-09-02 02:20:00 UTC NaN 0.0 \n", "... ... ... \n", "YORK 2021-09-02 02:35:00 UTC NaN 0.0 \n", " 2021-09-02 02:40:00 UTC NaN 0.0 \n", " 2021-09-02 02:45:00 UTC NaN 0.0 \n", " 2021-09-02 02:50:00 UTC NaN 0.0 \n", " 2021-09-02 02:55:00 UTC NaN 0.0 \n", "\n", " frozen_soil_25cm [bit] \\\n", "station time \n", "ADDI 2021-09-02 02:00:00 UTC 0.0 \n", " 2021-09-02 02:05:00 UTC 0.0 \n", " 2021-09-02 02:10:00 UTC 0.0 \n", " 2021-09-02 02:15:00 UTC 0.0 \n", " 2021-09-02 02:20:00 UTC 0.0 \n", "... ... \n", "YORK 2021-09-02 02:35:00 UTC 0.0 \n", " 2021-09-02 02:40:00 UTC 0.0 \n", " 2021-09-02 02:45:00 UTC 0.0 \n", " 2021-09-02 02:50:00 UTC 0.0 \n", " 2021-09-02 02:55:00 UTC 0.0 \n", "\n", " frozen_soil_50cm [bit] \\\n", "station time \n", "ADDI 2021-09-02 02:00:00 UTC 0.0 \n", " 2021-09-02 02:05:00 UTC 0.0 \n", " 2021-09-02 02:10:00 UTC 0.0 \n", " 2021-09-02 02:15:00 UTC 0.0 \n", " 2021-09-02 02:20:00 UTC 0.0 \n", "... ... \n", "YORK 2021-09-02 02:35:00 UTC 0.0 \n", " 2021-09-02 02:40:00 UTC 0.0 \n", " 2021-09-02 02:45:00 UTC 0.0 \n", " 2021-09-02 02:50:00 UTC 0.0 \n", " 2021-09-02 02:55:00 UTC 0.0 \n", "\n", " soil_temp_05cm [degC] soil_temp_25cm [degC] \\\n", "station time \n", "ADDI 2021-09-02 02:00:00 UTC 19.9 20.3 \n", " 2021-09-02 02:05:00 UTC 19.9 20.3 \n", " 2021-09-02 02:10:00 UTC 19.9 20.3 \n", " 2021-09-02 02:15:00 UTC 19.9 20.3 \n", " 2021-09-02 02:20:00 UTC 19.9 20.3 \n", "... ... ... \n", "YORK 2021-09-02 02:35:00 UTC 20.3 21.7 \n", " 2021-09-02 02:40:00 UTC 20.3 21.7 \n", " 2021-09-02 02:45:00 UTC 20.3 21.7 \n", " 2021-09-02 02:50:00 UTC 20.2 21.7 \n", " 2021-09-02 02:55:00 UTC 20.1 21.7 \n", "\n", " soil_temp_50cm [degC] \\\n", "station time \n", "ADDI 2021-09-02 02:00:00 UTC 19.9 \n", " 2021-09-02 02:05:00 UTC 19.9 \n", " 2021-09-02 02:10:00 UTC 19.9 \n", " 2021-09-02 02:15:00 UTC 19.9 \n", " 2021-09-02 02:20:00 UTC 19.9 \n", "... ... \n", "YORK 2021-09-02 02:35:00 UTC 21.9 \n", " 2021-09-02 02:40:00 UTC 21.9 \n", " 2021-09-02 02:45:00 UTC 21.9 \n", " 2021-09-02 02:50:00 UTC 21.9 \n", " 2021-09-02 02:55:00 UTC 21.9 \n", "\n", " soil_moisture_05cm [m^3/m^3] \\\n", "station time \n", "ADDI 2021-09-02 02:00:00 UTC 0.52 \n", " 2021-09-02 02:05:00 UTC 0.52 \n", " 2021-09-02 02:10:00 UTC 0.52 \n", " 2021-09-02 02:15:00 UTC 0.52 \n", " 2021-09-02 02:20:00 UTC 0.52 \n", "... ... \n", "YORK 2021-09-02 02:35:00 UTC 0.13 \n", " 2021-09-02 02:40:00 UTC 0.13 \n", " 2021-09-02 02:45:00 UTC 0.13 \n", " 2021-09-02 02:50:00 UTC 0.13 \n", " 2021-09-02 02:55:00 UTC 0.13 \n", "\n", " soil_moisture_25cm [m^3/m^3] \\\n", "station time \n", "ADDI 2021-09-02 02:00:00 UTC 0.44 \n", " 2021-09-02 02:05:00 UTC 0.44 \n", " 2021-09-02 02:10:00 UTC 0.44 \n", " 2021-09-02 02:15:00 UTC 0.44 \n", " 2021-09-02 02:20:00 UTC 0.44 \n", "... ... \n", "YORK 2021-09-02 02:35:00 UTC 0.24 \n", " 2021-09-02 02:40:00 UTC 0.24 \n", " 2021-09-02 02:45:00 UTC 0.24 \n", " 2021-09-02 02:50:00 UTC 0.24 \n", " 2021-09-02 02:55:00 UTC 0.24 \n", "\n", " soil_moisture_50cm [m^3/m^3] \n", "station time \n", "ADDI 2021-09-02 02:00:00 UTC 0.44 \n", " 2021-09-02 02:05:00 UTC 0.44 \n", " 2021-09-02 02:10:00 UTC 0.44 \n", " 2021-09-02 02:15:00 UTC 0.44 \n", " 2021-09-02 02:20:00 UTC 0.44 \n", "... ... \n", "YORK 2021-09-02 02:35:00 UTC 0.24 \n", " 2021-09-02 02:40:00 UTC 0.24 \n", " 2021-09-02 02:45:00 UTC 0.24 \n", " 2021-09-02 02:50:00 UTC 0.24 \n", " 2021-09-02 02:55:00 UTC 0.24 \n", "\n", "[1512 rows x 28 columns]" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "MultiIndex([('ADDI', '2021-09-02 02:00:00 UTC'),\n", " ('ADDI', '2021-09-02 02:05:00 UTC'),\n", " ('ADDI', '2021-09-02 02:10:00 UTC'),\n", " ('ADDI', '2021-09-02 02:15:00 UTC'),\n", " ('ADDI', '2021-09-02 02:20:00 UTC'),\n", " ('ADDI', '2021-09-02 02:25:00 UTC'),\n", " ('ADDI', '2021-09-02 02:30:00 UTC'),\n", " ('ADDI', '2021-09-02 02:35:00 UTC'),\n", " ('ADDI', '2021-09-02 02:40:00 UTC'),\n", " ('ADDI', '2021-09-02 02:45:00 UTC'),\n", " ...\n", " ('YORK', '2021-09-02 02:10:00 UTC'),\n", " ('YORK', '2021-09-02 02:15:00 UTC'),\n", " ('YORK', '2021-09-02 02:20:00 UTC'),\n", " ('YORK', '2021-09-02 02:25:00 UTC'),\n", " ('YORK', '2021-09-02 02:30:00 UTC'),\n", " ('YORK', '2021-09-02 02:35:00 UTC'),\n", " ('YORK', '2021-09-02 02:40:00 UTC'),\n", " ('YORK', '2021-09-02 02:45:00 UTC'),\n", " ('YORK', '2021-09-02 02:50:00 UTC'),\n", " ('YORK', '2021-09-02 02:55:00 UTC')],\n", " names=['station', 'time'], length=1512)" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.index" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The date/time index is currently a string. Let's do this nifty trick to convert it to a `datetime` object." ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [], "source": [ "df.index = df.index.set_levels([df.index.levels[0], pd.to_datetime(df.index.levels[1])])" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "MultiIndex([('ADDI', '2021-09-02 02:00:00+00:00'),\n", " ('ADDI', '2021-09-02 02:05:00+00:00'),\n", " ('ADDI', '2021-09-02 02:10:00+00:00')],\n", " names=['station', 'time'])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "#Examine first three row indices\n", "df.index[:3]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Use selection and conditions on a multi-index dataframe" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Select a column in the usual way." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
temp_2m [degC]temp_9m [degC]relative_humidity [percent]precip_incremental [mm]precip_local [mm]precip_max_intensity [mm/min]avg_wind_speed_prop [m/s]max_wind_speed_prop [m/s]wind_speed_stddev_prop [m/s]wind_direction_prop [degrees]...snow_depth [cm]frozen_soil_05cm [bit]frozen_soil_25cm [bit]frozen_soil_50cm [bit]soil_temp_05cm [degC]soil_temp_25cm [degC]soil_temp_50cm [degC]soil_moisture_05cm [m^3/m^3]soil_moisture_25cm [m^3/m^3]soil_moisture_50cm [m^3/m^3]
time
2021-09-02 02:00:00+00:0014.914.998.90.2611.670.002.23.60.6301.0...NaN0.00.00.020.221.121.10.250.330.19
2021-09-02 02:05:00+00:0014.814.899.00.2711.940.003.04.60.8299.0...NaN0.00.00.020.121.121.10.250.330.19
2021-09-02 02:10:00+00:0014.714.798.90.3912.330.113.25.10.8301.0...NaN0.00.00.020.121.121.10.250.330.19
2021-09-02 02:15:00+00:0014.614.698.90.8013.130.183.65.60.7299.0...NaN0.00.00.020.121.121.10.260.330.19
2021-09-02 02:20:00+00:0014.614.698.90.8814.010.182.96.40.9300.0...NaN0.00.00.020.121.121.10.260.330.19
2021-09-02 02:25:00+00:0014.514.598.90.6614.670.152.95.80.9298.0...NaN0.00.00.020.121.121.10.270.330.19
2021-09-02 02:30:00+00:0014.514.599.00.4515.120.002.54.60.8301.0...NaN0.00.00.020.121.121.10.270.340.19
2021-09-02 02:35:00+00:0014.414.599.00.5515.670.132.63.80.5292.0...NaN0.00.00.020.021.121.10.280.340.19
2021-09-02 02:40:00+00:0014.414.499.00.9916.660.252.24.30.8298.0...NaN0.00.00.019.921.121.10.280.340.19
2021-09-02 02:45:00+00:0014.314.498.91.5118.170.391.73.50.7290.0...NaN0.00.00.019.921.121.10.290.340.19
2021-09-02 02:50:00+00:0014.214.398.81.3019.470.311.53.40.8307.0...NaN0.00.00.019.920.921.10.290.350.19
2021-09-02 02:55:00+00:0014.214.398.40.7720.240.192.13.70.8301.0...NaN0.00.00.019.920.921.10.290.350.19
\n", "

12 rows × 28 columns

\n", "
" ], "text/plain": [ " temp_2m [degC] temp_9m [degC] \\\n", "time \n", "2021-09-02 02:00:00+00:00 14.9 14.9 \n", "2021-09-02 02:05:00+00:00 14.8 14.8 \n", "2021-09-02 02:10:00+00:00 14.7 14.7 \n", "2021-09-02 02:15:00+00:00 14.6 14.6 \n", "2021-09-02 02:20:00+00:00 14.6 14.6 \n", "2021-09-02 02:25:00+00:00 14.5 14.5 \n", "2021-09-02 02:30:00+00:00 14.5 14.5 \n", "2021-09-02 02:35:00+00:00 14.4 14.5 \n", "2021-09-02 02:40:00+00:00 14.4 14.4 \n", "2021-09-02 02:45:00+00:00 14.3 14.4 \n", "2021-09-02 02:50:00+00:00 14.2 14.3 \n", "2021-09-02 02:55:00+00:00 14.2 14.3 \n", "\n", " relative_humidity [percent] \\\n", "time \n", "2021-09-02 02:00:00+00:00 98.9 \n", "2021-09-02 02:05:00+00:00 99.0 \n", "2021-09-02 02:10:00+00:00 98.9 \n", "2021-09-02 02:15:00+00:00 98.9 \n", "2021-09-02 02:20:00+00:00 98.9 \n", "2021-09-02 02:25:00+00:00 98.9 \n", "2021-09-02 02:30:00+00:00 99.0 \n", "2021-09-02 02:35:00+00:00 99.0 \n", "2021-09-02 02:40:00+00:00 99.0 \n", "2021-09-02 02:45:00+00:00 98.9 \n", "2021-09-02 02:50:00+00:00 98.8 \n", "2021-09-02 02:55:00+00:00 98.4 \n", "\n", " precip_incremental [mm] precip_local [mm] \\\n", "time \n", "2021-09-02 02:00:00+00:00 0.26 11.67 \n", "2021-09-02 02:05:00+00:00 0.27 11.94 \n", "2021-09-02 02:10:00+00:00 0.39 12.33 \n", "2021-09-02 02:15:00+00:00 0.80 13.13 \n", "2021-09-02 02:20:00+00:00 0.88 14.01 \n", "2021-09-02 02:25:00+00:00 0.66 14.67 \n", "2021-09-02 02:30:00+00:00 0.45 15.12 \n", "2021-09-02 02:35:00+00:00 0.55 15.67 \n", "2021-09-02 02:40:00+00:00 0.99 16.66 \n", "2021-09-02 02:45:00+00:00 1.51 18.17 \n", "2021-09-02 02:50:00+00:00 1.30 19.47 \n", "2021-09-02 02:55:00+00:00 0.77 20.24 \n", "\n", " precip_max_intensity [mm/min] \\\n", "time \n", "2021-09-02 02:00:00+00:00 0.00 \n", "2021-09-02 02:05:00+00:00 0.00 \n", "2021-09-02 02:10:00+00:00 0.11 \n", "2021-09-02 02:15:00+00:00 0.18 \n", "2021-09-02 02:20:00+00:00 0.18 \n", "2021-09-02 02:25:00+00:00 0.15 \n", "2021-09-02 02:30:00+00:00 0.00 \n", "2021-09-02 02:35:00+00:00 0.13 \n", "2021-09-02 02:40:00+00:00 0.25 \n", "2021-09-02 02:45:00+00:00 0.39 \n", "2021-09-02 02:50:00+00:00 0.31 \n", "2021-09-02 02:55:00+00:00 0.19 \n", "\n", " avg_wind_speed_prop [m/s] \\\n", "time \n", "2021-09-02 02:00:00+00:00 2.2 \n", "2021-09-02 02:05:00+00:00 3.0 \n", "2021-09-02 02:10:00+00:00 3.2 \n", "2021-09-02 02:15:00+00:00 3.6 \n", "2021-09-02 02:20:00+00:00 2.9 \n", "2021-09-02 02:25:00+00:00 2.9 \n", "2021-09-02 02:30:00+00:00 2.5 \n", "2021-09-02 02:35:00+00:00 2.6 \n", "2021-09-02 02:40:00+00:00 2.2 \n", "2021-09-02 02:45:00+00:00 1.7 \n", "2021-09-02 02:50:00+00:00 1.5 \n", "2021-09-02 02:55:00+00:00 2.1 \n", "\n", " max_wind_speed_prop [m/s] \\\n", "time \n", "2021-09-02 02:00:00+00:00 3.6 \n", "2021-09-02 02:05:00+00:00 4.6 \n", "2021-09-02 02:10:00+00:00 5.1 \n", "2021-09-02 02:15:00+00:00 5.6 \n", "2021-09-02 02:20:00+00:00 6.4 \n", "2021-09-02 02:25:00+00:00 5.8 \n", "2021-09-02 02:30:00+00:00 4.6 \n", "2021-09-02 02:35:00+00:00 3.8 \n", "2021-09-02 02:40:00+00:00 4.3 \n", "2021-09-02 02:45:00+00:00 3.5 \n", "2021-09-02 02:50:00+00:00 3.4 \n", "2021-09-02 02:55:00+00:00 3.7 \n", "\n", " wind_speed_stddev_prop [m/s] \\\n", "time \n", "2021-09-02 02:00:00+00:00 0.6 \n", "2021-09-02 02:05:00+00:00 0.8 \n", "2021-09-02 02:10:00+00:00 0.8 \n", "2021-09-02 02:15:00+00:00 0.7 \n", "2021-09-02 02:20:00+00:00 0.9 \n", "2021-09-02 02:25:00+00:00 0.9 \n", "2021-09-02 02:30:00+00:00 0.8 \n", "2021-09-02 02:35:00+00:00 0.5 \n", "2021-09-02 02:40:00+00:00 0.8 \n", "2021-09-02 02:45:00+00:00 0.7 \n", "2021-09-02 02:50:00+00:00 0.8 \n", "2021-09-02 02:55:00+00:00 0.8 \n", "\n", " wind_direction_prop [degrees] ... \\\n", "time ... \n", "2021-09-02 02:00:00+00:00 301.0 ... \n", "2021-09-02 02:05:00+00:00 299.0 ... \n", "2021-09-02 02:10:00+00:00 301.0 ... \n", "2021-09-02 02:15:00+00:00 299.0 ... \n", "2021-09-02 02:20:00+00:00 300.0 ... \n", "2021-09-02 02:25:00+00:00 298.0 ... \n", "2021-09-02 02:30:00+00:00 301.0 ... \n", "2021-09-02 02:35:00+00:00 292.0 ... \n", "2021-09-02 02:40:00+00:00 298.0 ... \n", "2021-09-02 02:45:00+00:00 290.0 ... \n", "2021-09-02 02:50:00+00:00 307.0 ... \n", "2021-09-02 02:55:00+00:00 301.0 ... \n", "\n", " snow_depth [cm] frozen_soil_05cm [bit] \\\n", "time \n", "2021-09-02 02:00:00+00:00 NaN 0.0 \n", "2021-09-02 02:05:00+00:00 NaN 0.0 \n", "2021-09-02 02:10:00+00:00 NaN 0.0 \n", "2021-09-02 02:15:00+00:00 NaN 0.0 \n", "2021-09-02 02:20:00+00:00 NaN 0.0 \n", "2021-09-02 02:25:00+00:00 NaN 0.0 \n", "2021-09-02 02:30:00+00:00 NaN 0.0 \n", "2021-09-02 02:35:00+00:00 NaN 0.0 \n", "2021-09-02 02:40:00+00:00 NaN 0.0 \n", "2021-09-02 02:45:00+00:00 NaN 0.0 \n", "2021-09-02 02:50:00+00:00 NaN 0.0 \n", "2021-09-02 02:55:00+00:00 NaN 0.0 \n", "\n", " frozen_soil_25cm [bit] frozen_soil_50cm [bit] \\\n", "time \n", "2021-09-02 02:00:00+00:00 0.0 0.0 \n", "2021-09-02 02:05:00+00:00 0.0 0.0 \n", "2021-09-02 02:10:00+00:00 0.0 0.0 \n", "2021-09-02 02:15:00+00:00 0.0 0.0 \n", "2021-09-02 02:20:00+00:00 0.0 0.0 \n", "2021-09-02 02:25:00+00:00 0.0 0.0 \n", "2021-09-02 02:30:00+00:00 0.0 0.0 \n", "2021-09-02 02:35:00+00:00 0.0 0.0 \n", "2021-09-02 02:40:00+00:00 0.0 0.0 \n", "2021-09-02 02:45:00+00:00 0.0 0.0 \n", "2021-09-02 02:50:00+00:00 0.0 0.0 \n", "2021-09-02 02:55:00+00:00 0.0 0.0 \n", "\n", " soil_temp_05cm [degC] soil_temp_25cm [degC] \\\n", "time \n", "2021-09-02 02:00:00+00:00 20.2 21.1 \n", "2021-09-02 02:05:00+00:00 20.1 21.1 \n", "2021-09-02 02:10:00+00:00 20.1 21.1 \n", "2021-09-02 02:15:00+00:00 20.1 21.1 \n", "2021-09-02 02:20:00+00:00 20.1 21.1 \n", "2021-09-02 02:25:00+00:00 20.1 21.1 \n", "2021-09-02 02:30:00+00:00 20.1 21.1 \n", "2021-09-02 02:35:00+00:00 20.0 21.1 \n", "2021-09-02 02:40:00+00:00 19.9 21.1 \n", "2021-09-02 02:45:00+00:00 19.9 21.1 \n", "2021-09-02 02:50:00+00:00 19.9 20.9 \n", "2021-09-02 02:55:00+00:00 19.9 20.9 \n", "\n", " soil_temp_50cm [degC] \\\n", "time \n", "2021-09-02 02:00:00+00:00 21.1 \n", "2021-09-02 02:05:00+00:00 21.1 \n", "2021-09-02 02:10:00+00:00 21.1 \n", "2021-09-02 02:15:00+00:00 21.1 \n", "2021-09-02 02:20:00+00:00 21.1 \n", "2021-09-02 02:25:00+00:00 21.1 \n", "2021-09-02 02:30:00+00:00 21.1 \n", "2021-09-02 02:35:00+00:00 21.1 \n", "2021-09-02 02:40:00+00:00 21.1 \n", "2021-09-02 02:45:00+00:00 21.1 \n", "2021-09-02 02:50:00+00:00 21.1 \n", "2021-09-02 02:55:00+00:00 21.1 \n", "\n", " soil_moisture_05cm [m^3/m^3] \\\n", "time \n", "2021-09-02 02:00:00+00:00 0.25 \n", "2021-09-02 02:05:00+00:00 0.25 \n", "2021-09-02 02:10:00+00:00 0.25 \n", "2021-09-02 02:15:00+00:00 0.26 \n", "2021-09-02 02:20:00+00:00 0.26 \n", "2021-09-02 02:25:00+00:00 0.27 \n", "2021-09-02 02:30:00+00:00 0.27 \n", "2021-09-02 02:35:00+00:00 0.28 \n", "2021-09-02 02:40:00+00:00 0.28 \n", "2021-09-02 02:45:00+00:00 0.29 \n", "2021-09-02 02:50:00+00:00 0.29 \n", "2021-09-02 02:55:00+00:00 0.29 \n", "\n", " soil_moisture_25cm [m^3/m^3] \\\n", "time \n", "2021-09-02 02:00:00+00:00 0.33 \n", "2021-09-02 02:05:00+00:00 0.33 \n", "2021-09-02 02:10:00+00:00 0.33 \n", "2021-09-02 02:15:00+00:00 0.33 \n", "2021-09-02 02:20:00+00:00 0.33 \n", "2021-09-02 02:25:00+00:00 0.33 \n", "2021-09-02 02:30:00+00:00 0.34 \n", "2021-09-02 02:35:00+00:00 0.34 \n", "2021-09-02 02:40:00+00:00 0.34 \n", "2021-09-02 02:45:00+00:00 0.34 \n", "2021-09-02 02:50:00+00:00 0.35 \n", "2021-09-02 02:55:00+00:00 0.35 \n", "\n", " soil_moisture_50cm [m^3/m^3] \n", "time \n", "2021-09-02 02:00:00+00:00 0.19 \n", "2021-09-02 02:05:00+00:00 0.19 \n", "2021-09-02 02:10:00+00:00 0.19 \n", "2021-09-02 02:15:00+00:00 0.19 \n", "2021-09-02 02:20:00+00:00 0.19 \n", "2021-09-02 02:25:00+00:00 0.19 \n", "2021-09-02 02:30:00+00:00 0.19 \n", "2021-09-02 02:35:00+00:00 0.19 \n", "2021-09-02 02:40:00+00:00 0.19 \n", "2021-09-02 02:45:00+00:00 0.19 \n", "2021-09-02 02:50:00+00:00 0.19 \n", "2021-09-02 02:55:00+00:00 0.19 \n", "\n", "[12 rows x 28 columns]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc['VOOR']" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Use `loc` to select not only the station id, but also a specific time: we pass in a `tuple` to the `loc` method" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Briefly define **tuples** as opposed to **lists** in Python" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A `tuple` is a core Python class, similar to but distinct from a `list`\n", "1. Tuples are enclosed in parentheses (); lists are enclosed in brackets []\n", "1. Elements in a `tuple` are *immutable* (i.e. can't be changed), but `list` elements can be changed." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(4, 7, 9)\n", "9\n" ] } ], "source": [ "a = (4,7,9) # a tuple\n", "print (a)\n", "print(a[2])\n", "# this next line won't work; comment it out to see:\n", "# a[2] = 8" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[4, 7, 9]\n", "9\n", "[4, 7, 8]\n" ] } ], "source": [ "b = [4, 7, 9]\n", "print (b)\n", "print (b[2])\n", "# this will work:\n", "b[2] = 8\n", "print(b)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Tip: Some Pandas methods accept tuples as their arguments, although lists will usually work too. When in doubt, append a ? to the method call to see the documentation.
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "What can be confusing is that just because something is enclosed in brackets, that doesn't mean it's a Python list object. In terms of Pandas, the DataFrame's `loc` method typically expects a string that's enclosed in brackets." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "
\n", " Tip: Note: we can pass in the requested time as a string ... it will work!
" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "temp_2m [degC] 14.30\n", "temp_9m [degC] 14.40\n", "relative_humidity [percent] 98.90\n", "precip_incremental [mm] 1.51\n", "precip_local [mm] 18.17\n", "precip_max_intensity [mm/min] 0.39\n", "avg_wind_speed_prop [m/s] 1.70\n", "max_wind_speed_prop [m/s] 3.50\n", "wind_speed_stddev_prop [m/s] 0.70\n", "wind_direction_prop [degrees] 290.00\n", "wind_direction_stddev_prop [degrees] 21.00\n", "avg_wind_speed_sonic [m/s] 1.90\n", "max_wind_speed_sonic [m/s] 3.60\n", "wind_speed_stddev_sonic [m/s] 0.70\n", "wind_direction_sonic [degrees] 294.00\n", "wind_direction_stddev_sonic [degrees] 20.00\n", "solar_insolation [W/m^2] 0.00\n", "station_pressure [mbar] 994.87\n", "snow_depth [cm] NaN\n", "frozen_soil_05cm [bit] 0.00\n", "frozen_soil_25cm [bit] 0.00\n", "frozen_soil_50cm [bit] 0.00\n", "soil_temp_05cm [degC] 19.90\n", "soil_temp_25cm [degC] 21.10\n", "soil_temp_50cm [degC] 21.10\n", "soil_moisture_05cm [m^3/m^3] 0.29\n", "soil_moisture_25cm [m^3/m^3] 0.34\n", "soil_moisture_50cm [m^3/m^3] 0.19\n", "Name: (VOOR, 2021-09-02 02:45:00+00:00), dtype: float64" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[('VOOR','2021-09-02 02:45:00 UTC')]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "As above, but also select a particular column:" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "14.3" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[('VOOR','2021-09-02 02:45:00 UTC'),'temp_2m [degC]' ]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Taking advantage of *multi-indexing* allows us to write selection criteria that is more intuitive than how we've previously defined and utilized conditonal statements in our selection criteria (as in our cell earlier in the notebook with nested data frame object names). It also executes quicker!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pass in multiple columns (either as a list or a tuple) ... get back a `DataFrame` whose columns are the ones we selected" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "temp_2m [degC] 14.30\n", "precip_incremental [mm] 1.51\n", "Name: (VOOR, 2021-09-02 02:45:00+00:00), dtype: float64" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[('VOOR','2021-09-02 02:45:00 UTC'),('temp_2m [degC]','precip_incremental [mm]')]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pass in multiple indexes and multiple columns also returns `DataFrame`s:\n", "1. Two outer and one inner index:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
temp_2m [degC]precip_incremental [mm]
stationtime
VOOR2021-09-02 02:45:00+00:0014.31.51
KIND2021-09-02 02:45:00+00:0016.40.75
\n", "
" ], "text/plain": [ " temp_2m [degC] precip_incremental [mm]\n", "station time \n", "VOOR 2021-09-02 02:45:00+00:00 14.3 1.51\n", "KIND 2021-09-02 02:45:00+00:00 16.4 0.75" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[(('VOOR','KIND'),'2021-09-02 02:45:00 UTC'),('temp_2m [degC]','precip_incremental [mm]')]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "2. Two outer and two inner:" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
temp_2m [degC]precip_incremental [mm]
stationtime
VOOR2021-09-02 02:45:00+00:0014.31.51
2021-09-02 02:55:00+00:0014.20.77
KIND2021-09-02 02:45:00+00:0016.40.75
2021-09-02 02:55:00+00:0016.40.56
\n", "
" ], "text/plain": [ " temp_2m [degC] precip_incremental [mm]\n", "station time \n", "VOOR 2021-09-02 02:45:00+00:00 14.3 1.51\n", " 2021-09-02 02:55:00+00:00 14.2 0.77\n", "KIND 2021-09-02 02:45:00+00:00 16.4 0.75\n", " 2021-09-02 02:55:00+00:00 16.4 0.56" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[(('VOOR','KIND'),('2021-09-02 02:45:00 UTC','2021-09-02 02:55:00 UTC')),('temp_2m [degC]','precip_incremental [mm]')]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "3. One outer and two inner:" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
temp_2m [degC]precip_incremental [mm]
stationtime
VOOR2021-09-02 02:45:00+00:0014.31.51
2021-09-02 02:55:00+00:0014.20.77
\n", "
" ], "text/plain": [ " temp_2m [degC] precip_incremental [mm]\n", "station time \n", "VOOR 2021-09-02 02:45:00+00:00 14.3 1.51\n", " 2021-09-02 02:55:00+00:00 14.2 0.77" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[('VOOR',('2021-09-02 02:45:00 UTC','2021-09-02 02:55:00 UTC')),('temp_2m [degC]','precip_incremental [mm]')]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "A more efficient way to get all stations at a particular time is via Pandas' `xs` DataFrame method." ] }, { "cell_type": "code", "execution_count": 21, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
temp_2m [degC]temp_9m [degC]relative_humidity [percent]precip_incremental [mm]precip_local [mm]precip_max_intensity [mm/min]avg_wind_speed_prop [m/s]max_wind_speed_prop [m/s]wind_speed_stddev_prop [m/s]wind_direction_prop [degrees]...snow_depth [cm]frozen_soil_05cm [bit]frozen_soil_25cm [bit]frozen_soil_50cm [bit]soil_temp_05cm [degC]soil_temp_25cm [degC]soil_temp_50cm [degC]soil_moisture_05cm [m^3/m^3]soil_moisture_25cm [m^3/m^3]soil_moisture_50cm [m^3/m^3]
station
ADDI12.813.092.30.000.000.002.35.00.9346.0...NaN0.00.00.019.920.319.90.520.440.44
ANDE14.214.099.70.1613.740.003.16.00.9344.0...NaN0.00.00.019.119.119.20.280.220.14
BATA14.116.277.00.000.000.002.42.60.1309.0...NaN0.00.00.019.921.521.30.250.210.22
BEAC15.815.698.91.0156.830.253.45.50.935.0...NaN0.00.00.017.918.919.50.510.360.37
BELD14.114.293.80.001.160.003.46.51.25.0...NaN0.00.00.019.720.120.20.500.430.41
..................................................................
WFMB13.013.674.20.000.000.000.42.00.4260.0...NaN0.00.00.018.319.919.90.250.180.20
WGAT13.513.480.40.000.000.000.93.30.722.0...NaN0.00.00.018.420.120.70.150.250.08
WHIT16.316.393.20.000.000.001.63.10.528.0...NaN0.00.00.019.120.320.10.280.470.46
WOLC13.616.386.10.000.000.000.60.90.1317.0...NaN0.00.00.021.523.524.20.180.030.07
YORK11.612.597.20.000.000.000.10.70.2165.0...NaN0.00.00.020.321.721.90.130.240.24
\n", "

126 rows × 28 columns

\n", "
" ], "text/plain": [ " temp_2m [degC] temp_9m [degC] relative_humidity [percent] \\\n", "station \n", "ADDI 12.8 13.0 92.3 \n", "ANDE 14.2 14.0 99.7 \n", "BATA 14.1 16.2 77.0 \n", "BEAC 15.8 15.6 98.9 \n", "BELD 14.1 14.2 93.8 \n", "... ... ... ... \n", "WFMB 13.0 13.6 74.2 \n", "WGAT 13.5 13.4 80.4 \n", "WHIT 16.3 16.3 93.2 \n", "WOLC 13.6 16.3 86.1 \n", "YORK 11.6 12.5 97.2 \n", "\n", " precip_incremental [mm] precip_local [mm] \\\n", "station \n", "ADDI 0.00 0.00 \n", "ANDE 0.16 13.74 \n", "BATA 0.00 0.00 \n", "BEAC 1.01 56.83 \n", "BELD 0.00 1.16 \n", "... ... ... \n", "WFMB 0.00 0.00 \n", "WGAT 0.00 0.00 \n", "WHIT 0.00 0.00 \n", "WOLC 0.00 0.00 \n", "YORK 0.00 0.00 \n", "\n", " precip_max_intensity [mm/min] avg_wind_speed_prop [m/s] \\\n", "station \n", "ADDI 0.00 2.3 \n", "ANDE 0.00 3.1 \n", "BATA 0.00 2.4 \n", "BEAC 0.25 3.4 \n", "BELD 0.00 3.4 \n", "... ... ... \n", "WFMB 0.00 0.4 \n", "WGAT 0.00 0.9 \n", "WHIT 0.00 1.6 \n", "WOLC 0.00 0.6 \n", "YORK 0.00 0.1 \n", "\n", " max_wind_speed_prop [m/s] wind_speed_stddev_prop [m/s] \\\n", "station \n", "ADDI 5.0 0.9 \n", "ANDE 6.0 0.9 \n", "BATA 2.6 0.1 \n", "BEAC 5.5 0.9 \n", "BELD 6.5 1.2 \n", "... ... ... \n", "WFMB 2.0 0.4 \n", "WGAT 3.3 0.7 \n", "WHIT 3.1 0.5 \n", "WOLC 0.9 0.1 \n", "YORK 0.7 0.2 \n", "\n", " wind_direction_prop [degrees] ... snow_depth [cm] \\\n", "station ... \n", "ADDI 346.0 ... NaN \n", "ANDE 344.0 ... NaN \n", "BATA 309.0 ... NaN \n", "BEAC 35.0 ... NaN \n", "BELD 5.0 ... NaN \n", "... ... ... ... \n", "WFMB 260.0 ... NaN \n", "WGAT 22.0 ... NaN \n", "WHIT 28.0 ... NaN \n", "WOLC 317.0 ... NaN \n", "YORK 165.0 ... NaN \n", "\n", " frozen_soil_05cm [bit] frozen_soil_25cm [bit] \\\n", "station \n", "ADDI 0.0 0.0 \n", "ANDE 0.0 0.0 \n", "BATA 0.0 0.0 \n", "BEAC 0.0 0.0 \n", "BELD 0.0 0.0 \n", "... ... ... \n", "WFMB 0.0 0.0 \n", "WGAT 0.0 0.0 \n", "WHIT 0.0 0.0 \n", "WOLC 0.0 0.0 \n", "YORK 0.0 0.0 \n", "\n", " frozen_soil_50cm [bit] soil_temp_05cm [degC] soil_temp_25cm [degC] \\\n", "station \n", "ADDI 0.0 19.9 20.3 \n", "ANDE 0.0 19.1 19.1 \n", "BATA 0.0 19.9 21.5 \n", "BEAC 0.0 17.9 18.9 \n", "BELD 0.0 19.7 20.1 \n", "... ... ... ... \n", "WFMB 0.0 18.3 19.9 \n", "WGAT 0.0 18.4 20.1 \n", "WHIT 0.0 19.1 20.3 \n", "WOLC 0.0 21.5 23.5 \n", "YORK 0.0 20.3 21.7 \n", "\n", " soil_temp_50cm [degC] soil_moisture_05cm [m^3/m^3] \\\n", "station \n", "ADDI 19.9 0.52 \n", "ANDE 19.2 0.28 \n", "BATA 21.3 0.25 \n", "BEAC 19.5 0.51 \n", "BELD 20.2 0.50 \n", "... ... ... \n", "WFMB 19.9 0.25 \n", "WGAT 20.7 0.15 \n", "WHIT 20.1 0.28 \n", "WOLC 24.2 0.18 \n", "YORK 21.9 0.13 \n", "\n", " soil_moisture_25cm [m^3/m^3] soil_moisture_50cm [m^3/m^3] \n", "station \n", "ADDI 0.44 0.44 \n", "ANDE 0.22 0.14 \n", "BATA 0.21 0.22 \n", "BEAC 0.36 0.37 \n", "BELD 0.43 0.41 \n", "... ... ... \n", "WFMB 0.18 0.20 \n", "WGAT 0.25 0.08 \n", "WHIT 0.47 0.46 \n", "WOLC 0.03 0.07 \n", "YORK 0.24 0.24 \n", "\n", "[126 rows x 28 columns]" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Pass in one index value and what index it belongs to\n", "df.xs('2021-09-02 02:45:00 UTC', level='time')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Include just a *list* of columns (a *tuple* won't work here)" ] }, { "cell_type": "code", "execution_count": 22, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
temp_2m [degC]temp_9m [degC]
station
ADDI12.813.0
ANDE14.214.0
BATA14.116.2
BEAC15.815.6
BELD14.114.2
.........
WFMB13.013.6
WGAT13.513.4
WHIT16.316.3
WOLC13.616.3
YORK11.612.5
\n", "

126 rows × 2 columns

\n", "
" ], "text/plain": [ " temp_2m [degC] temp_9m [degC]\n", "station \n", "ADDI 12.8 13.0\n", "ANDE 14.2 14.0\n", "BATA 14.1 16.2\n", "BEAC 15.8 15.6\n", "BELD 14.1 14.2\n", "... ... ...\n", "WFMB 13.0 13.6\n", "WGAT 13.5 13.4\n", "WHIT 16.3 16.3\n", "WOLC 13.6 16.3\n", "YORK 11.6 12.5\n", "\n", "[126 rows x 2 columns]" ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.xs('2021-09-02 02:45:00 UTC', level = 'time')[['temp_2m [degC]','temp_9m [degC]']]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Work with the time index as a Datetime object:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pandas' handling of `datetime` objects is incredibly powerful (see https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html) . Below, we select 3 times using the `date_range` method." ] }, { "cell_type": "code", "execution_count": 23, "metadata": {}, "outputs": [], "source": [ "timeRange = pd.date_range('2021-09-02 02:25', periods=3, freq='5min')" ] }, { "cell_type": "code", "execution_count": 24, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/tmp/ipykernel_82816/1874800629.py:1: FutureWarning: Indexing a timezone-aware DatetimeIndex with a timezone-naive datetime is deprecated and will raise KeyError in a future version. Use a timezone-aware object instead.\n", " df.loc[('MANH',timeRange.values),('max_wind_speed_prop [m/s]', 'max_wind_speed_sonic [m/s]')]\n" ] }, { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
max_wind_speed_prop [m/s]max_wind_speed_sonic [m/s]
stationtime
MANH2021-09-02 02:25:00+00:008.210.2
2021-09-02 02:30:00+00:008.710.3
2021-09-02 02:35:00+00:0012.211.5
\n", "
" ], "text/plain": [ " max_wind_speed_prop [m/s] \\\n", "station time \n", "MANH 2021-09-02 02:25:00+00:00 8.2 \n", " 2021-09-02 02:30:00+00:00 8.7 \n", " 2021-09-02 02:35:00+00:00 12.2 \n", "\n", " max_wind_speed_sonic [m/s] \n", "station time \n", "MANH 2021-09-02 02:25:00+00:00 10.2 \n", " 2021-09-02 02:30:00+00:00 10.3 \n", " 2021-09-02 02:35:00+00:00 11.5 " ] }, "execution_count": 24, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.loc[('MANH',timeRange.values),('max_wind_speed_prop [m/s]', 'max_wind_speed_sonic [m/s]')]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "---\n", "## Summary\n", "* One can recast a Pandas `DataFrame` so it leverages multi-indexing\n", "* Selecting and specifying conditions on a multi-index `DataFrame` work just as in a single-index `DataFrame`\n", "* Pandas has a large set of utilities to work with time-series based data. `xs` is one such method.\n", "* In Python, **tuples** are akin to **lists**, but a tuple's elements are *immutable*.\n", "\n", "### What's Next?\n", "We will next explore dataframes with geographic attributes, via the [GeoPandas](https://geopandas.org) package.\n", "## Resources and References\n", "1. [MetPy Monday Episode 96](https://www.youtube.com/watch?v=yQ5IxnZouKo&list=PLQut5OXpV-0ir4IdllSt1iEZKTwFBa7kO&index=87&ab_channel=Unidata)\n", "1. https://www.geeksforgeeks.org/python-difference-between-list-and-tuple/\n", "1. https://stackoverflow.com/questions/8900166/whats-the-difference-between-lists-enclosed-by-square-brackets-and-parentheses\n", "1. [Time series in Pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html)" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 August 2022 Environment", "language": "python", "name": "aug22" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.5" } }, "nbformat": 4, "nbformat_minor": 4 }