{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"
\n",
"\n",
"# Pandas 8: Multi-index DataFrames\n",
"---"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### In many instances, tabular data may best be represented with more than a single row index. For the case of NYSM hourly data files, which have 126 stations each with 13 sets of 5-minute obs, we definitely want to take advantage of using Multi-index Dataframes."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Overview\n",
"1. Open an hour's worth of NYSM data using a single row index dataframe\n",
"1. Review selection and conditional selection methodology\n",
"1. Create a dataframe with station ID and date/time as the two row indices\n",
"1. Use selection and conditions on a multi-index dataframe\n",
"1. Briefly define tuples as opposed to lists in Python\n",
"1. Work with the time index as a Datetime object\n",
"\n",
"## Prerequisites\n",
"\n",
"| Concepts | Importance | Notes |\n",
"| --- | --- | --- |\n",
"| Pandas notebooks 1-8 | Necessary | |\n",
"\n",
"* **Time to learn**: 15 minutes"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Imports"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"from datetime import datetime "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Open an hour's worth of NYSM data using a single row index dataframe"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"nysm_data_file = '/spare11/atm533/data/nysm_data_2021090202a.csv'"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" station | \n",
" time | \n",
" temp_2m [degC] | \n",
" temp_9m [degC] | \n",
" relative_humidity [percent] | \n",
" precip_incremental [mm] | \n",
" precip_local [mm] | \n",
" precip_max_intensity [mm/min] | \n",
" avg_wind_speed_prop [m/s] | \n",
" max_wind_speed_prop [m/s] | \n",
" ... | \n",
" snow_depth [cm] | \n",
" frozen_soil_05cm [bit] | \n",
" frozen_soil_25cm [bit] | \n",
" frozen_soil_50cm [bit] | \n",
" soil_temp_05cm [degC] | \n",
" soil_temp_25cm [degC] | \n",
" soil_temp_50cm [degC] | \n",
" soil_moisture_05cm [m^3/m^3] | \n",
" soil_moisture_25cm [m^3/m^3] | \n",
" soil_moisture_50cm [m^3/m^3] | \n",
"
\n",
" \n",
" \n",
" \n",
" 0 | \n",
" ADDI | \n",
" 2021-09-02 02:00:00 UTC | \n",
" 13.3 | \n",
" 13.4 | \n",
" 92.5 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 2.6 | \n",
" 5.8 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 20.3 | \n",
" 19.9 | \n",
" 0.52 | \n",
" 0.44 | \n",
" 0.44 | \n",
"
\n",
" \n",
" 1 | \n",
" ADDI | \n",
" 2021-09-02 02:05:00 UTC | \n",
" 13.2 | \n",
" 13.4 | \n",
" 92.8 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 2.7 | \n",
" 4.3 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 20.3 | \n",
" 19.9 | \n",
" 0.52 | \n",
" 0.44 | \n",
" 0.44 | \n",
"
\n",
" \n",
" 2 | \n",
" ADDI | \n",
" 2021-09-02 02:10:00 UTC | \n",
" 13.1 | \n",
" 13.3 | \n",
" 92.8 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 2.1 | \n",
" 3.4 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 20.3 | \n",
" 19.9 | \n",
" 0.52 | \n",
" 0.44 | \n",
" 0.44 | \n",
"
\n",
" \n",
"
\n",
"
3 rows × 30 columns
\n",
"
"
],
"text/plain": [
" station time temp_2m [degC] temp_9m [degC] \\\n",
"0 ADDI 2021-09-02 02:00:00 UTC 13.3 13.4 \n",
"1 ADDI 2021-09-02 02:05:00 UTC 13.2 13.4 \n",
"2 ADDI 2021-09-02 02:10:00 UTC 13.1 13.3 \n",
"\n",
" relative_humidity [percent] precip_incremental [mm] precip_local [mm] \\\n",
"0 92.5 0.0 0.0 \n",
"1 92.8 0.0 0.0 \n",
"2 92.8 0.0 0.0 \n",
"\n",
" precip_max_intensity [mm/min] avg_wind_speed_prop [m/s] \\\n",
"0 0.0 2.6 \n",
"1 0.0 2.7 \n",
"2 0.0 2.1 \n",
"\n",
" max_wind_speed_prop [m/s] ... snow_depth [cm] frozen_soil_05cm [bit] \\\n",
"0 5.8 ... NaN 0.0 \n",
"1 4.3 ... NaN 0.0 \n",
"2 3.4 ... NaN 0.0 \n",
"\n",
" frozen_soil_25cm [bit] frozen_soil_50cm [bit] soil_temp_05cm [degC] \\\n",
"0 0.0 0.0 19.9 \n",
"1 0.0 0.0 19.9 \n",
"2 0.0 0.0 19.9 \n",
"\n",
" soil_temp_25cm [degC] soil_temp_50cm [degC] soil_moisture_05cm [m^3/m^3] \\\n",
"0 20.3 19.9 0.52 \n",
"1 20.3 19.9 0.52 \n",
"2 20.3 19.9 0.52 \n",
"\n",
" soil_moisture_25cm [m^3/m^3] soil_moisture_50cm [m^3/m^3] \n",
"0 0.44 0.44 \n",
"1 0.44 0.44 \n",
"2 0.44 0.44 \n",
"\n",
"[3 rows x 30 columns]"
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df = pd.read_csv(nysm_data_file)\n",
"# Look at the first few rows\n",
"df.head(3)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Remind ourselves that the default `DataFrame`'s index is a special type of Python object, called a *RangeIndex*."
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"RangeIndex(start=0, stop=1512, step=1)"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.index"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Review selection and conditional selection methodology"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here we have multiple stations and multiple times. We can select rows/columns, set conditions, and make further selections based on those conditions as we did in the **03_Pandas_IndexSubsetsConditonals** notebook. "
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" station | \n",
" time | \n",
" temp_2m [degC] | \n",
" temp_9m [degC] | \n",
" relative_humidity [percent] | \n",
" precip_incremental [mm] | \n",
" precip_local [mm] | \n",
" precip_max_intensity [mm/min] | \n",
" avg_wind_speed_prop [m/s] | \n",
" max_wind_speed_prop [m/s] | \n",
" ... | \n",
" snow_depth [cm] | \n",
" frozen_soil_05cm [bit] | \n",
" frozen_soil_25cm [bit] | \n",
" frozen_soil_50cm [bit] | \n",
" soil_temp_05cm [degC] | \n",
" soil_temp_25cm [degC] | \n",
" soil_temp_50cm [degC] | \n",
" soil_moisture_05cm [m^3/m^3] | \n",
" soil_moisture_25cm [m^3/m^3] | \n",
" soil_moisture_50cm [m^3/m^3] | \n",
"
\n",
" \n",
" \n",
" \n",
" 1332 | \n",
" VOOR | \n",
" 2021-09-02 02:00:00 UTC | \n",
" 14.9 | \n",
" 14.9 | \n",
" 98.9 | \n",
" 0.26 | \n",
" 11.67 | \n",
" 0.00 | \n",
" 2.2 | \n",
" 3.6 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.2 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.25 | \n",
" 0.33 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 1333 | \n",
" VOOR | \n",
" 2021-09-02 02:05:00 UTC | \n",
" 14.8 | \n",
" 14.8 | \n",
" 99.0 | \n",
" 0.27 | \n",
" 11.94 | \n",
" 0.00 | \n",
" 3.0 | \n",
" 4.6 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.1 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.25 | \n",
" 0.33 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 1334 | \n",
" VOOR | \n",
" 2021-09-02 02:10:00 UTC | \n",
" 14.7 | \n",
" 14.7 | \n",
" 98.9 | \n",
" 0.39 | \n",
" 12.33 | \n",
" 0.11 | \n",
" 3.2 | \n",
" 5.1 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.1 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.25 | \n",
" 0.33 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 1335 | \n",
" VOOR | \n",
" 2021-09-02 02:15:00 UTC | \n",
" 14.6 | \n",
" 14.6 | \n",
" 98.9 | \n",
" 0.80 | \n",
" 13.13 | \n",
" 0.18 | \n",
" 3.6 | \n",
" 5.6 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.1 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.26 | \n",
" 0.33 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 1336 | \n",
" VOOR | \n",
" 2021-09-02 02:20:00 UTC | \n",
" 14.6 | \n",
" 14.6 | \n",
" 98.9 | \n",
" 0.88 | \n",
" 14.01 | \n",
" 0.18 | \n",
" 2.9 | \n",
" 6.4 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.1 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.26 | \n",
" 0.33 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 1337 | \n",
" VOOR | \n",
" 2021-09-02 02:25:00 UTC | \n",
" 14.5 | \n",
" 14.5 | \n",
" 98.9 | \n",
" 0.66 | \n",
" 14.67 | \n",
" 0.15 | \n",
" 2.9 | \n",
" 5.8 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.1 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.27 | \n",
" 0.33 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 1338 | \n",
" VOOR | \n",
" 2021-09-02 02:30:00 UTC | \n",
" 14.5 | \n",
" 14.5 | \n",
" 99.0 | \n",
" 0.45 | \n",
" 15.12 | \n",
" 0.00 | \n",
" 2.5 | \n",
" 4.6 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.1 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.27 | \n",
" 0.34 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 1339 | \n",
" VOOR | \n",
" 2021-09-02 02:35:00 UTC | \n",
" 14.4 | \n",
" 14.5 | \n",
" 99.0 | \n",
" 0.55 | \n",
" 15.67 | \n",
" 0.13 | \n",
" 2.6 | \n",
" 3.8 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.0 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.28 | \n",
" 0.34 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 1340 | \n",
" VOOR | \n",
" 2021-09-02 02:40:00 UTC | \n",
" 14.4 | \n",
" 14.4 | \n",
" 99.0 | \n",
" 0.99 | \n",
" 16.66 | \n",
" 0.25 | \n",
" 2.2 | \n",
" 4.3 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.28 | \n",
" 0.34 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 1341 | \n",
" VOOR | \n",
" 2021-09-02 02:45:00 UTC | \n",
" 14.3 | \n",
" 14.4 | \n",
" 98.9 | \n",
" 1.51 | \n",
" 18.17 | \n",
" 0.39 | \n",
" 1.7 | \n",
" 3.5 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.29 | \n",
" 0.34 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 1342 | \n",
" VOOR | \n",
" 2021-09-02 02:50:00 UTC | \n",
" 14.2 | \n",
" 14.3 | \n",
" 98.8 | \n",
" 1.30 | \n",
" 19.47 | \n",
" 0.31 | \n",
" 1.5 | \n",
" 3.4 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 20.9 | \n",
" 21.1 | \n",
" 0.29 | \n",
" 0.35 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 1343 | \n",
" VOOR | \n",
" 2021-09-02 02:55:00 UTC | \n",
" 14.2 | \n",
" 14.3 | \n",
" 98.4 | \n",
" 0.77 | \n",
" 20.24 | \n",
" 0.19 | \n",
" 2.1 | \n",
" 3.7 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 20.9 | \n",
" 21.1 | \n",
" 0.29 | \n",
" 0.35 | \n",
" 0.19 | \n",
"
\n",
" \n",
"
\n",
"
12 rows × 30 columns
\n",
"
"
],
"text/plain": [
" station time temp_2m [degC] temp_9m [degC] \\\n",
"1332 VOOR 2021-09-02 02:00:00 UTC 14.9 14.9 \n",
"1333 VOOR 2021-09-02 02:05:00 UTC 14.8 14.8 \n",
"1334 VOOR 2021-09-02 02:10:00 UTC 14.7 14.7 \n",
"1335 VOOR 2021-09-02 02:15:00 UTC 14.6 14.6 \n",
"1336 VOOR 2021-09-02 02:20:00 UTC 14.6 14.6 \n",
"1337 VOOR 2021-09-02 02:25:00 UTC 14.5 14.5 \n",
"1338 VOOR 2021-09-02 02:30:00 UTC 14.5 14.5 \n",
"1339 VOOR 2021-09-02 02:35:00 UTC 14.4 14.5 \n",
"1340 VOOR 2021-09-02 02:40:00 UTC 14.4 14.4 \n",
"1341 VOOR 2021-09-02 02:45:00 UTC 14.3 14.4 \n",
"1342 VOOR 2021-09-02 02:50:00 UTC 14.2 14.3 \n",
"1343 VOOR 2021-09-02 02:55:00 UTC 14.2 14.3 \n",
"\n",
" relative_humidity [percent] precip_incremental [mm] precip_local [mm] \\\n",
"1332 98.9 0.26 11.67 \n",
"1333 99.0 0.27 11.94 \n",
"1334 98.9 0.39 12.33 \n",
"1335 98.9 0.80 13.13 \n",
"1336 98.9 0.88 14.01 \n",
"1337 98.9 0.66 14.67 \n",
"1338 99.0 0.45 15.12 \n",
"1339 99.0 0.55 15.67 \n",
"1340 99.0 0.99 16.66 \n",
"1341 98.9 1.51 18.17 \n",
"1342 98.8 1.30 19.47 \n",
"1343 98.4 0.77 20.24 \n",
"\n",
" precip_max_intensity [mm/min] avg_wind_speed_prop [m/s] \\\n",
"1332 0.00 2.2 \n",
"1333 0.00 3.0 \n",
"1334 0.11 3.2 \n",
"1335 0.18 3.6 \n",
"1336 0.18 2.9 \n",
"1337 0.15 2.9 \n",
"1338 0.00 2.5 \n",
"1339 0.13 2.6 \n",
"1340 0.25 2.2 \n",
"1341 0.39 1.7 \n",
"1342 0.31 1.5 \n",
"1343 0.19 2.1 \n",
"\n",
" max_wind_speed_prop [m/s] ... snow_depth [cm] frozen_soil_05cm [bit] \\\n",
"1332 3.6 ... NaN 0.0 \n",
"1333 4.6 ... NaN 0.0 \n",
"1334 5.1 ... NaN 0.0 \n",
"1335 5.6 ... NaN 0.0 \n",
"1336 6.4 ... NaN 0.0 \n",
"1337 5.8 ... NaN 0.0 \n",
"1338 4.6 ... NaN 0.0 \n",
"1339 3.8 ... NaN 0.0 \n",
"1340 4.3 ... NaN 0.0 \n",
"1341 3.5 ... NaN 0.0 \n",
"1342 3.4 ... NaN 0.0 \n",
"1343 3.7 ... NaN 0.0 \n",
"\n",
" frozen_soil_25cm [bit] frozen_soil_50cm [bit] soil_temp_05cm [degC] \\\n",
"1332 0.0 0.0 20.2 \n",
"1333 0.0 0.0 20.1 \n",
"1334 0.0 0.0 20.1 \n",
"1335 0.0 0.0 20.1 \n",
"1336 0.0 0.0 20.1 \n",
"1337 0.0 0.0 20.1 \n",
"1338 0.0 0.0 20.1 \n",
"1339 0.0 0.0 20.0 \n",
"1340 0.0 0.0 19.9 \n",
"1341 0.0 0.0 19.9 \n",
"1342 0.0 0.0 19.9 \n",
"1343 0.0 0.0 19.9 \n",
"\n",
" soil_temp_25cm [degC] soil_temp_50cm [degC] \\\n",
"1332 21.1 21.1 \n",
"1333 21.1 21.1 \n",
"1334 21.1 21.1 \n",
"1335 21.1 21.1 \n",
"1336 21.1 21.1 \n",
"1337 21.1 21.1 \n",
"1338 21.1 21.1 \n",
"1339 21.1 21.1 \n",
"1340 21.1 21.1 \n",
"1341 21.1 21.1 \n",
"1342 20.9 21.1 \n",
"1343 20.9 21.1 \n",
"\n",
" soil_moisture_05cm [m^3/m^3] soil_moisture_25cm [m^3/m^3] \\\n",
"1332 0.25 0.33 \n",
"1333 0.25 0.33 \n",
"1334 0.25 0.33 \n",
"1335 0.26 0.33 \n",
"1336 0.26 0.33 \n",
"1337 0.27 0.33 \n",
"1338 0.27 0.34 \n",
"1339 0.28 0.34 \n",
"1340 0.28 0.34 \n",
"1341 0.29 0.34 \n",
"1342 0.29 0.35 \n",
"1343 0.29 0.35 \n",
"\n",
" soil_moisture_50cm [m^3/m^3] \n",
"1332 0.19 \n",
"1333 0.19 \n",
"1334 0.19 \n",
"1335 0.19 \n",
"1336 0.19 \n",
"1337 0.19 \n",
"1338 0.19 \n",
"1339 0.19 \n",
"1340 0.19 \n",
"1341 0.19 \n",
"1342 0.19 \n",
"1343 0.19 \n",
"\n",
"[12 rows x 30 columns]"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df[df['station']=='VOOR'] # Voorheesville, NY"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Create a second dataframe that first contains just VOOR rows, then further subset by choosing one time."
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" station | \n",
" time | \n",
" temp_2m [degC] | \n",
" temp_9m [degC] | \n",
" relative_humidity [percent] | \n",
" precip_incremental [mm] | \n",
" precip_local [mm] | \n",
" precip_max_intensity [mm/min] | \n",
" avg_wind_speed_prop [m/s] | \n",
" max_wind_speed_prop [m/s] | \n",
" ... | \n",
" snow_depth [cm] | \n",
" frozen_soil_05cm [bit] | \n",
" frozen_soil_25cm [bit] | \n",
" frozen_soil_50cm [bit] | \n",
" soil_temp_05cm [degC] | \n",
" soil_temp_25cm [degC] | \n",
" soil_temp_50cm [degC] | \n",
" soil_moisture_05cm [m^3/m^3] | \n",
" soil_moisture_25cm [m^3/m^3] | \n",
" soil_moisture_50cm [m^3/m^3] | \n",
"
\n",
" \n",
" \n",
" \n",
" 1342 | \n",
" VOOR | \n",
" 2021-09-02 02:50:00 UTC | \n",
" 14.2 | \n",
" 14.3 | \n",
" 98.8 | \n",
" 1.3 | \n",
" 19.47 | \n",
" 0.31 | \n",
" 1.5 | \n",
" 3.4 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 20.9 | \n",
" 21.1 | \n",
" 0.29 | \n",
" 0.35 | \n",
" 0.19 | \n",
"
\n",
" \n",
"
\n",
"
1 rows × 30 columns
\n",
"
"
],
"text/plain": [
" station time temp_2m [degC] temp_9m [degC] \\\n",
"1342 VOOR 2021-09-02 02:50:00 UTC 14.2 14.3 \n",
"\n",
" relative_humidity [percent] precip_incremental [mm] precip_local [mm] \\\n",
"1342 98.8 1.3 19.47 \n",
"\n",
" precip_max_intensity [mm/min] avg_wind_speed_prop [m/s] \\\n",
"1342 0.31 1.5 \n",
"\n",
" max_wind_speed_prop [m/s] ... snow_depth [cm] frozen_soil_05cm [bit] \\\n",
"1342 3.4 ... NaN 0.0 \n",
"\n",
" frozen_soil_25cm [bit] frozen_soil_50cm [bit] soil_temp_05cm [degC] \\\n",
"1342 0.0 0.0 19.9 \n",
"\n",
" soil_temp_25cm [degC] soil_temp_50cm [degC] \\\n",
"1342 20.9 21.1 \n",
"\n",
" soil_moisture_05cm [m^3/m^3] soil_moisture_25cm [m^3/m^3] \\\n",
"1342 0.29 0.35 \n",
"\n",
" soil_moisture_50cm [m^3/m^3] \n",
"1342 0.19 \n",
"\n",
"[1 rows x 30 columns]"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df2 = df[df['station']=='VOOR']\n",
"df2[df2['time'] == '2021-09-02 02:50:00 UTC'] # we haven't recast this as a Datetime object ... it's just a string"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"However, there is a better way! Since our DataFrame consists of one station after another, each associated with a number of discrete times, let's create a **multi-indexed** DataFrame, that has *station* as its outer row index and *time* as its inner."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Create a dataframe with station ID and date/time as the two row indices"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"df.set_index(['station', 'time'], inplace = True)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
" Note: inplace
means that the df object gets re-created with its new indexes. It's convenient, but if at any point we wish to re-run cells beyond this point, it is better to start from the beginning to ensure that the df
object conforms to whatever code cell operates on it.
"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" | \n",
" temp_2m [degC] | \n",
" temp_9m [degC] | \n",
" relative_humidity [percent] | \n",
" precip_incremental [mm] | \n",
" precip_local [mm] | \n",
" precip_max_intensity [mm/min] | \n",
" avg_wind_speed_prop [m/s] | \n",
" max_wind_speed_prop [m/s] | \n",
" wind_speed_stddev_prop [m/s] | \n",
" wind_direction_prop [degrees] | \n",
" ... | \n",
" snow_depth [cm] | \n",
" frozen_soil_05cm [bit] | \n",
" frozen_soil_25cm [bit] | \n",
" frozen_soil_50cm [bit] | \n",
" soil_temp_05cm [degC] | \n",
" soil_temp_25cm [degC] | \n",
" soil_temp_50cm [degC] | \n",
" soil_moisture_05cm [m^3/m^3] | \n",
" soil_moisture_25cm [m^3/m^3] | \n",
" soil_moisture_50cm [m^3/m^3] | \n",
"
\n",
" \n",
" station | \n",
" time | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" ADDI | \n",
" 2021-09-02 02:00:00 UTC | \n",
" 13.3 | \n",
" 13.4 | \n",
" 92.5 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 2.6 | \n",
" 5.8 | \n",
" 0.9 | \n",
" 344.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 20.3 | \n",
" 19.9 | \n",
" 0.52 | \n",
" 0.44 | \n",
" 0.44 | \n",
"
\n",
" \n",
" 2021-09-02 02:05:00 UTC | \n",
" 13.2 | \n",
" 13.4 | \n",
" 92.8 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 2.7 | \n",
" 4.3 | \n",
" 0.6 | \n",
" 340.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 20.3 | \n",
" 19.9 | \n",
" 0.52 | \n",
" 0.44 | \n",
" 0.44 | \n",
"
\n",
" \n",
" 2021-09-02 02:10:00 UTC | \n",
" 13.1 | \n",
" 13.3 | \n",
" 92.8 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 2.1 | \n",
" 3.4 | \n",
" 0.5 | \n",
" 336.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 20.3 | \n",
" 19.9 | \n",
" 0.52 | \n",
" 0.44 | \n",
" 0.44 | \n",
"
\n",
" \n",
" 2021-09-02 02:15:00 UTC | \n",
" 13.0 | \n",
" 13.2 | \n",
" 92.9 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 2.4 | \n",
" 4.1 | \n",
" 0.6 | \n",
" 342.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 20.3 | \n",
" 19.9 | \n",
" 0.52 | \n",
" 0.44 | \n",
" 0.44 | \n",
"
\n",
" \n",
" 2021-09-02 02:20:00 UTC | \n",
" 12.9 | \n",
" 13.1 | \n",
" 93.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 2.3 | \n",
" 4.6 | \n",
" 0.9 | \n",
" 338.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 20.3 | \n",
" 19.9 | \n",
" 0.52 | \n",
" 0.44 | \n",
" 0.44 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" YORK | \n",
" 2021-09-02 02:35:00 UTC | \n",
" 11.7 | \n",
" 12.8 | \n",
" 97.1 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.2 | \n",
" 0.6 | \n",
" 0.2 | \n",
" 136.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.3 | \n",
" 21.7 | \n",
" 21.9 | \n",
" 0.13 | \n",
" 0.24 | \n",
" 0.24 | \n",
"
\n",
" \n",
" 2021-09-02 02:40:00 UTC | \n",
" 11.6 | \n",
" 12.5 | \n",
" 97.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.6 | \n",
" 0.8 | \n",
" 0.1 | \n",
" 131.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.3 | \n",
" 21.7 | \n",
" 21.9 | \n",
" 0.13 | \n",
" 0.24 | \n",
" 0.24 | \n",
"
\n",
" \n",
" 2021-09-02 02:45:00 UTC | \n",
" 11.6 | \n",
" 12.5 | \n",
" 97.2 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.1 | \n",
" 0.7 | \n",
" 0.2 | \n",
" 165.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.3 | \n",
" 21.7 | \n",
" 21.9 | \n",
" 0.13 | \n",
" 0.24 | \n",
" 0.24 | \n",
"
\n",
" \n",
" 2021-09-02 02:50:00 UTC | \n",
" 11.5 | \n",
" 12.7 | \n",
" 97.3 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.7 | \n",
" 0.8 | \n",
" 0.1 | \n",
" 193.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.2 | \n",
" 21.7 | \n",
" 21.9 | \n",
" 0.13 | \n",
" 0.24 | \n",
" 0.24 | \n",
"
\n",
" \n",
" 2021-09-02 02:55:00 UTC | \n",
" 11.5 | \n",
" 12.4 | \n",
" 97.4 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.5 | \n",
" 0.7 | \n",
" 0.1 | \n",
" 168.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.1 | \n",
" 21.7 | \n",
" 21.9 | \n",
" 0.13 | \n",
" 0.24 | \n",
" 0.24 | \n",
"
\n",
" \n",
"
\n",
"
1512 rows × 28 columns
\n",
"
"
],
"text/plain": [
" temp_2m [degC] temp_9m [degC] \\\n",
"station time \n",
"ADDI 2021-09-02 02:00:00 UTC 13.3 13.4 \n",
" 2021-09-02 02:05:00 UTC 13.2 13.4 \n",
" 2021-09-02 02:10:00 UTC 13.1 13.3 \n",
" 2021-09-02 02:15:00 UTC 13.0 13.2 \n",
" 2021-09-02 02:20:00 UTC 12.9 13.1 \n",
"... ... ... \n",
"YORK 2021-09-02 02:35:00 UTC 11.7 12.8 \n",
" 2021-09-02 02:40:00 UTC 11.6 12.5 \n",
" 2021-09-02 02:45:00 UTC 11.6 12.5 \n",
" 2021-09-02 02:50:00 UTC 11.5 12.7 \n",
" 2021-09-02 02:55:00 UTC 11.5 12.4 \n",
"\n",
" relative_humidity [percent] \\\n",
"station time \n",
"ADDI 2021-09-02 02:00:00 UTC 92.5 \n",
" 2021-09-02 02:05:00 UTC 92.8 \n",
" 2021-09-02 02:10:00 UTC 92.8 \n",
" 2021-09-02 02:15:00 UTC 92.9 \n",
" 2021-09-02 02:20:00 UTC 93.0 \n",
"... ... \n",
"YORK 2021-09-02 02:35:00 UTC 97.1 \n",
" 2021-09-02 02:40:00 UTC 97.0 \n",
" 2021-09-02 02:45:00 UTC 97.2 \n",
" 2021-09-02 02:50:00 UTC 97.3 \n",
" 2021-09-02 02:55:00 UTC 97.4 \n",
"\n",
" precip_incremental [mm] precip_local [mm] \\\n",
"station time \n",
"ADDI 2021-09-02 02:00:00 UTC 0.0 0.0 \n",
" 2021-09-02 02:05:00 UTC 0.0 0.0 \n",
" 2021-09-02 02:10:00 UTC 0.0 0.0 \n",
" 2021-09-02 02:15:00 UTC 0.0 0.0 \n",
" 2021-09-02 02:20:00 UTC 0.0 0.0 \n",
"... ... ... \n",
"YORK 2021-09-02 02:35:00 UTC 0.0 0.0 \n",
" 2021-09-02 02:40:00 UTC 0.0 0.0 \n",
" 2021-09-02 02:45:00 UTC 0.0 0.0 \n",
" 2021-09-02 02:50:00 UTC 0.0 0.0 \n",
" 2021-09-02 02:55:00 UTC 0.0 0.0 \n",
"\n",
" precip_max_intensity [mm/min] \\\n",
"station time \n",
"ADDI 2021-09-02 02:00:00 UTC 0.0 \n",
" 2021-09-02 02:05:00 UTC 0.0 \n",
" 2021-09-02 02:10:00 UTC 0.0 \n",
" 2021-09-02 02:15:00 UTC 0.0 \n",
" 2021-09-02 02:20:00 UTC 0.0 \n",
"... ... \n",
"YORK 2021-09-02 02:35:00 UTC 0.0 \n",
" 2021-09-02 02:40:00 UTC 0.0 \n",
" 2021-09-02 02:45:00 UTC 0.0 \n",
" 2021-09-02 02:50:00 UTC 0.0 \n",
" 2021-09-02 02:55:00 UTC 0.0 \n",
"\n",
" avg_wind_speed_prop [m/s] \\\n",
"station time \n",
"ADDI 2021-09-02 02:00:00 UTC 2.6 \n",
" 2021-09-02 02:05:00 UTC 2.7 \n",
" 2021-09-02 02:10:00 UTC 2.1 \n",
" 2021-09-02 02:15:00 UTC 2.4 \n",
" 2021-09-02 02:20:00 UTC 2.3 \n",
"... ... \n",
"YORK 2021-09-02 02:35:00 UTC 0.2 \n",
" 2021-09-02 02:40:00 UTC 0.6 \n",
" 2021-09-02 02:45:00 UTC 0.1 \n",
" 2021-09-02 02:50:00 UTC 0.7 \n",
" 2021-09-02 02:55:00 UTC 0.5 \n",
"\n",
" max_wind_speed_prop [m/s] \\\n",
"station time \n",
"ADDI 2021-09-02 02:00:00 UTC 5.8 \n",
" 2021-09-02 02:05:00 UTC 4.3 \n",
" 2021-09-02 02:10:00 UTC 3.4 \n",
" 2021-09-02 02:15:00 UTC 4.1 \n",
" 2021-09-02 02:20:00 UTC 4.6 \n",
"... ... \n",
"YORK 2021-09-02 02:35:00 UTC 0.6 \n",
" 2021-09-02 02:40:00 UTC 0.8 \n",
" 2021-09-02 02:45:00 UTC 0.7 \n",
" 2021-09-02 02:50:00 UTC 0.8 \n",
" 2021-09-02 02:55:00 UTC 0.7 \n",
"\n",
" wind_speed_stddev_prop [m/s] \\\n",
"station time \n",
"ADDI 2021-09-02 02:00:00 UTC 0.9 \n",
" 2021-09-02 02:05:00 UTC 0.6 \n",
" 2021-09-02 02:10:00 UTC 0.5 \n",
" 2021-09-02 02:15:00 UTC 0.6 \n",
" 2021-09-02 02:20:00 UTC 0.9 \n",
"... ... \n",
"YORK 2021-09-02 02:35:00 UTC 0.2 \n",
" 2021-09-02 02:40:00 UTC 0.1 \n",
" 2021-09-02 02:45:00 UTC 0.2 \n",
" 2021-09-02 02:50:00 UTC 0.1 \n",
" 2021-09-02 02:55:00 UTC 0.1 \n",
"\n",
" wind_direction_prop [degrees] ... \\\n",
"station time ... \n",
"ADDI 2021-09-02 02:00:00 UTC 344.0 ... \n",
" 2021-09-02 02:05:00 UTC 340.0 ... \n",
" 2021-09-02 02:10:00 UTC 336.0 ... \n",
" 2021-09-02 02:15:00 UTC 342.0 ... \n",
" 2021-09-02 02:20:00 UTC 338.0 ... \n",
"... ... ... \n",
"YORK 2021-09-02 02:35:00 UTC 136.0 ... \n",
" 2021-09-02 02:40:00 UTC 131.0 ... \n",
" 2021-09-02 02:45:00 UTC 165.0 ... \n",
" 2021-09-02 02:50:00 UTC 193.0 ... \n",
" 2021-09-02 02:55:00 UTC 168.0 ... \n",
"\n",
" snow_depth [cm] frozen_soil_05cm [bit] \\\n",
"station time \n",
"ADDI 2021-09-02 02:00:00 UTC NaN 0.0 \n",
" 2021-09-02 02:05:00 UTC NaN 0.0 \n",
" 2021-09-02 02:10:00 UTC NaN 0.0 \n",
" 2021-09-02 02:15:00 UTC NaN 0.0 \n",
" 2021-09-02 02:20:00 UTC NaN 0.0 \n",
"... ... ... \n",
"YORK 2021-09-02 02:35:00 UTC NaN 0.0 \n",
" 2021-09-02 02:40:00 UTC NaN 0.0 \n",
" 2021-09-02 02:45:00 UTC NaN 0.0 \n",
" 2021-09-02 02:50:00 UTC NaN 0.0 \n",
" 2021-09-02 02:55:00 UTC NaN 0.0 \n",
"\n",
" frozen_soil_25cm [bit] \\\n",
"station time \n",
"ADDI 2021-09-02 02:00:00 UTC 0.0 \n",
" 2021-09-02 02:05:00 UTC 0.0 \n",
" 2021-09-02 02:10:00 UTC 0.0 \n",
" 2021-09-02 02:15:00 UTC 0.0 \n",
" 2021-09-02 02:20:00 UTC 0.0 \n",
"... ... \n",
"YORK 2021-09-02 02:35:00 UTC 0.0 \n",
" 2021-09-02 02:40:00 UTC 0.0 \n",
" 2021-09-02 02:45:00 UTC 0.0 \n",
" 2021-09-02 02:50:00 UTC 0.0 \n",
" 2021-09-02 02:55:00 UTC 0.0 \n",
"\n",
" frozen_soil_50cm [bit] \\\n",
"station time \n",
"ADDI 2021-09-02 02:00:00 UTC 0.0 \n",
" 2021-09-02 02:05:00 UTC 0.0 \n",
" 2021-09-02 02:10:00 UTC 0.0 \n",
" 2021-09-02 02:15:00 UTC 0.0 \n",
" 2021-09-02 02:20:00 UTC 0.0 \n",
"... ... \n",
"YORK 2021-09-02 02:35:00 UTC 0.0 \n",
" 2021-09-02 02:40:00 UTC 0.0 \n",
" 2021-09-02 02:45:00 UTC 0.0 \n",
" 2021-09-02 02:50:00 UTC 0.0 \n",
" 2021-09-02 02:55:00 UTC 0.0 \n",
"\n",
" soil_temp_05cm [degC] soil_temp_25cm [degC] \\\n",
"station time \n",
"ADDI 2021-09-02 02:00:00 UTC 19.9 20.3 \n",
" 2021-09-02 02:05:00 UTC 19.9 20.3 \n",
" 2021-09-02 02:10:00 UTC 19.9 20.3 \n",
" 2021-09-02 02:15:00 UTC 19.9 20.3 \n",
" 2021-09-02 02:20:00 UTC 19.9 20.3 \n",
"... ... ... \n",
"YORK 2021-09-02 02:35:00 UTC 20.3 21.7 \n",
" 2021-09-02 02:40:00 UTC 20.3 21.7 \n",
" 2021-09-02 02:45:00 UTC 20.3 21.7 \n",
" 2021-09-02 02:50:00 UTC 20.2 21.7 \n",
" 2021-09-02 02:55:00 UTC 20.1 21.7 \n",
"\n",
" soil_temp_50cm [degC] \\\n",
"station time \n",
"ADDI 2021-09-02 02:00:00 UTC 19.9 \n",
" 2021-09-02 02:05:00 UTC 19.9 \n",
" 2021-09-02 02:10:00 UTC 19.9 \n",
" 2021-09-02 02:15:00 UTC 19.9 \n",
" 2021-09-02 02:20:00 UTC 19.9 \n",
"... ... \n",
"YORK 2021-09-02 02:35:00 UTC 21.9 \n",
" 2021-09-02 02:40:00 UTC 21.9 \n",
" 2021-09-02 02:45:00 UTC 21.9 \n",
" 2021-09-02 02:50:00 UTC 21.9 \n",
" 2021-09-02 02:55:00 UTC 21.9 \n",
"\n",
" soil_moisture_05cm [m^3/m^3] \\\n",
"station time \n",
"ADDI 2021-09-02 02:00:00 UTC 0.52 \n",
" 2021-09-02 02:05:00 UTC 0.52 \n",
" 2021-09-02 02:10:00 UTC 0.52 \n",
" 2021-09-02 02:15:00 UTC 0.52 \n",
" 2021-09-02 02:20:00 UTC 0.52 \n",
"... ... \n",
"YORK 2021-09-02 02:35:00 UTC 0.13 \n",
" 2021-09-02 02:40:00 UTC 0.13 \n",
" 2021-09-02 02:45:00 UTC 0.13 \n",
" 2021-09-02 02:50:00 UTC 0.13 \n",
" 2021-09-02 02:55:00 UTC 0.13 \n",
"\n",
" soil_moisture_25cm [m^3/m^3] \\\n",
"station time \n",
"ADDI 2021-09-02 02:00:00 UTC 0.44 \n",
" 2021-09-02 02:05:00 UTC 0.44 \n",
" 2021-09-02 02:10:00 UTC 0.44 \n",
" 2021-09-02 02:15:00 UTC 0.44 \n",
" 2021-09-02 02:20:00 UTC 0.44 \n",
"... ... \n",
"YORK 2021-09-02 02:35:00 UTC 0.24 \n",
" 2021-09-02 02:40:00 UTC 0.24 \n",
" 2021-09-02 02:45:00 UTC 0.24 \n",
" 2021-09-02 02:50:00 UTC 0.24 \n",
" 2021-09-02 02:55:00 UTC 0.24 \n",
"\n",
" soil_moisture_50cm [m^3/m^3] \n",
"station time \n",
"ADDI 2021-09-02 02:00:00 UTC 0.44 \n",
" 2021-09-02 02:05:00 UTC 0.44 \n",
" 2021-09-02 02:10:00 UTC 0.44 \n",
" 2021-09-02 02:15:00 UTC 0.44 \n",
" 2021-09-02 02:20:00 UTC 0.44 \n",
"... ... \n",
"YORK 2021-09-02 02:35:00 UTC 0.24 \n",
" 2021-09-02 02:40:00 UTC 0.24 \n",
" 2021-09-02 02:45:00 UTC 0.24 \n",
" 2021-09-02 02:50:00 UTC 0.24 \n",
" 2021-09-02 02:55:00 UTC 0.24 \n",
"\n",
"[1512 rows x 28 columns]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"MultiIndex([('ADDI', '2021-09-02 02:00:00 UTC'),\n",
" ('ADDI', '2021-09-02 02:05:00 UTC'),\n",
" ('ADDI', '2021-09-02 02:10:00 UTC'),\n",
" ('ADDI', '2021-09-02 02:15:00 UTC'),\n",
" ('ADDI', '2021-09-02 02:20:00 UTC'),\n",
" ('ADDI', '2021-09-02 02:25:00 UTC'),\n",
" ('ADDI', '2021-09-02 02:30:00 UTC'),\n",
" ('ADDI', '2021-09-02 02:35:00 UTC'),\n",
" ('ADDI', '2021-09-02 02:40:00 UTC'),\n",
" ('ADDI', '2021-09-02 02:45:00 UTC'),\n",
" ...\n",
" ('YORK', '2021-09-02 02:10:00 UTC'),\n",
" ('YORK', '2021-09-02 02:15:00 UTC'),\n",
" ('YORK', '2021-09-02 02:20:00 UTC'),\n",
" ('YORK', '2021-09-02 02:25:00 UTC'),\n",
" ('YORK', '2021-09-02 02:30:00 UTC'),\n",
" ('YORK', '2021-09-02 02:35:00 UTC'),\n",
" ('YORK', '2021-09-02 02:40:00 UTC'),\n",
" ('YORK', '2021-09-02 02:45:00 UTC'),\n",
" ('YORK', '2021-09-02 02:50:00 UTC'),\n",
" ('YORK', '2021-09-02 02:55:00 UTC')],\n",
" names=['station', 'time'], length=1512)"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.index"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The date/time index is currently a string. Let's do this nifty trick to convert it to a `datetime` object."
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [],
"source": [
"df.index = df.index.set_levels([df.index.levels[0], pd.to_datetime(df.index.levels[1])])"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"MultiIndex([('ADDI', '2021-09-02 02:00:00+00:00'),\n",
" ('ADDI', '2021-09-02 02:05:00+00:00'),\n",
" ('ADDI', '2021-09-02 02:10:00+00:00')],\n",
" names=['station', 'time'])"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Examine first three row indices\n",
"df.index[:3]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Use selection and conditions on a multi-index dataframe"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Select a column in the usual way."
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" temp_2m [degC] | \n",
" temp_9m [degC] | \n",
" relative_humidity [percent] | \n",
" precip_incremental [mm] | \n",
" precip_local [mm] | \n",
" precip_max_intensity [mm/min] | \n",
" avg_wind_speed_prop [m/s] | \n",
" max_wind_speed_prop [m/s] | \n",
" wind_speed_stddev_prop [m/s] | \n",
" wind_direction_prop [degrees] | \n",
" ... | \n",
" snow_depth [cm] | \n",
" frozen_soil_05cm [bit] | \n",
" frozen_soil_25cm [bit] | \n",
" frozen_soil_50cm [bit] | \n",
" soil_temp_05cm [degC] | \n",
" soil_temp_25cm [degC] | \n",
" soil_temp_50cm [degC] | \n",
" soil_moisture_05cm [m^3/m^3] | \n",
" soil_moisture_25cm [m^3/m^3] | \n",
" soil_moisture_50cm [m^3/m^3] | \n",
"
\n",
" \n",
" time | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" 2021-09-02 02:00:00+00:00 | \n",
" 14.9 | \n",
" 14.9 | \n",
" 98.9 | \n",
" 0.26 | \n",
" 11.67 | \n",
" 0.00 | \n",
" 2.2 | \n",
" 3.6 | \n",
" 0.6 | \n",
" 301.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.2 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.25 | \n",
" 0.33 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 2021-09-02 02:05:00+00:00 | \n",
" 14.8 | \n",
" 14.8 | \n",
" 99.0 | \n",
" 0.27 | \n",
" 11.94 | \n",
" 0.00 | \n",
" 3.0 | \n",
" 4.6 | \n",
" 0.8 | \n",
" 299.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.1 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.25 | \n",
" 0.33 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 2021-09-02 02:10:00+00:00 | \n",
" 14.7 | \n",
" 14.7 | \n",
" 98.9 | \n",
" 0.39 | \n",
" 12.33 | \n",
" 0.11 | \n",
" 3.2 | \n",
" 5.1 | \n",
" 0.8 | \n",
" 301.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.1 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.25 | \n",
" 0.33 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 2021-09-02 02:15:00+00:00 | \n",
" 14.6 | \n",
" 14.6 | \n",
" 98.9 | \n",
" 0.80 | \n",
" 13.13 | \n",
" 0.18 | \n",
" 3.6 | \n",
" 5.6 | \n",
" 0.7 | \n",
" 299.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.1 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.26 | \n",
" 0.33 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 2021-09-02 02:20:00+00:00 | \n",
" 14.6 | \n",
" 14.6 | \n",
" 98.9 | \n",
" 0.88 | \n",
" 14.01 | \n",
" 0.18 | \n",
" 2.9 | \n",
" 6.4 | \n",
" 0.9 | \n",
" 300.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.1 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.26 | \n",
" 0.33 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 2021-09-02 02:25:00+00:00 | \n",
" 14.5 | \n",
" 14.5 | \n",
" 98.9 | \n",
" 0.66 | \n",
" 14.67 | \n",
" 0.15 | \n",
" 2.9 | \n",
" 5.8 | \n",
" 0.9 | \n",
" 298.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.1 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.27 | \n",
" 0.33 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 2021-09-02 02:30:00+00:00 | \n",
" 14.5 | \n",
" 14.5 | \n",
" 99.0 | \n",
" 0.45 | \n",
" 15.12 | \n",
" 0.00 | \n",
" 2.5 | \n",
" 4.6 | \n",
" 0.8 | \n",
" 301.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.1 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.27 | \n",
" 0.34 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 2021-09-02 02:35:00+00:00 | \n",
" 14.4 | \n",
" 14.5 | \n",
" 99.0 | \n",
" 0.55 | \n",
" 15.67 | \n",
" 0.13 | \n",
" 2.6 | \n",
" 3.8 | \n",
" 0.5 | \n",
" 292.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.0 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.28 | \n",
" 0.34 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 2021-09-02 02:40:00+00:00 | \n",
" 14.4 | \n",
" 14.4 | \n",
" 99.0 | \n",
" 0.99 | \n",
" 16.66 | \n",
" 0.25 | \n",
" 2.2 | \n",
" 4.3 | \n",
" 0.8 | \n",
" 298.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.28 | \n",
" 0.34 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 2021-09-02 02:45:00+00:00 | \n",
" 14.3 | \n",
" 14.4 | \n",
" 98.9 | \n",
" 1.51 | \n",
" 18.17 | \n",
" 0.39 | \n",
" 1.7 | \n",
" 3.5 | \n",
" 0.7 | \n",
" 290.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 21.1 | \n",
" 21.1 | \n",
" 0.29 | \n",
" 0.34 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 2021-09-02 02:50:00+00:00 | \n",
" 14.2 | \n",
" 14.3 | \n",
" 98.8 | \n",
" 1.30 | \n",
" 19.47 | \n",
" 0.31 | \n",
" 1.5 | \n",
" 3.4 | \n",
" 0.8 | \n",
" 307.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 20.9 | \n",
" 21.1 | \n",
" 0.29 | \n",
" 0.35 | \n",
" 0.19 | \n",
"
\n",
" \n",
" 2021-09-02 02:55:00+00:00 | \n",
" 14.2 | \n",
" 14.3 | \n",
" 98.4 | \n",
" 0.77 | \n",
" 20.24 | \n",
" 0.19 | \n",
" 2.1 | \n",
" 3.7 | \n",
" 0.8 | \n",
" 301.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 20.9 | \n",
" 21.1 | \n",
" 0.29 | \n",
" 0.35 | \n",
" 0.19 | \n",
"
\n",
" \n",
"
\n",
"
12 rows × 28 columns
\n",
"
"
],
"text/plain": [
" temp_2m [degC] temp_9m [degC] \\\n",
"time \n",
"2021-09-02 02:00:00+00:00 14.9 14.9 \n",
"2021-09-02 02:05:00+00:00 14.8 14.8 \n",
"2021-09-02 02:10:00+00:00 14.7 14.7 \n",
"2021-09-02 02:15:00+00:00 14.6 14.6 \n",
"2021-09-02 02:20:00+00:00 14.6 14.6 \n",
"2021-09-02 02:25:00+00:00 14.5 14.5 \n",
"2021-09-02 02:30:00+00:00 14.5 14.5 \n",
"2021-09-02 02:35:00+00:00 14.4 14.5 \n",
"2021-09-02 02:40:00+00:00 14.4 14.4 \n",
"2021-09-02 02:45:00+00:00 14.3 14.4 \n",
"2021-09-02 02:50:00+00:00 14.2 14.3 \n",
"2021-09-02 02:55:00+00:00 14.2 14.3 \n",
"\n",
" relative_humidity [percent] \\\n",
"time \n",
"2021-09-02 02:00:00+00:00 98.9 \n",
"2021-09-02 02:05:00+00:00 99.0 \n",
"2021-09-02 02:10:00+00:00 98.9 \n",
"2021-09-02 02:15:00+00:00 98.9 \n",
"2021-09-02 02:20:00+00:00 98.9 \n",
"2021-09-02 02:25:00+00:00 98.9 \n",
"2021-09-02 02:30:00+00:00 99.0 \n",
"2021-09-02 02:35:00+00:00 99.0 \n",
"2021-09-02 02:40:00+00:00 99.0 \n",
"2021-09-02 02:45:00+00:00 98.9 \n",
"2021-09-02 02:50:00+00:00 98.8 \n",
"2021-09-02 02:55:00+00:00 98.4 \n",
"\n",
" precip_incremental [mm] precip_local [mm] \\\n",
"time \n",
"2021-09-02 02:00:00+00:00 0.26 11.67 \n",
"2021-09-02 02:05:00+00:00 0.27 11.94 \n",
"2021-09-02 02:10:00+00:00 0.39 12.33 \n",
"2021-09-02 02:15:00+00:00 0.80 13.13 \n",
"2021-09-02 02:20:00+00:00 0.88 14.01 \n",
"2021-09-02 02:25:00+00:00 0.66 14.67 \n",
"2021-09-02 02:30:00+00:00 0.45 15.12 \n",
"2021-09-02 02:35:00+00:00 0.55 15.67 \n",
"2021-09-02 02:40:00+00:00 0.99 16.66 \n",
"2021-09-02 02:45:00+00:00 1.51 18.17 \n",
"2021-09-02 02:50:00+00:00 1.30 19.47 \n",
"2021-09-02 02:55:00+00:00 0.77 20.24 \n",
"\n",
" precip_max_intensity [mm/min] \\\n",
"time \n",
"2021-09-02 02:00:00+00:00 0.00 \n",
"2021-09-02 02:05:00+00:00 0.00 \n",
"2021-09-02 02:10:00+00:00 0.11 \n",
"2021-09-02 02:15:00+00:00 0.18 \n",
"2021-09-02 02:20:00+00:00 0.18 \n",
"2021-09-02 02:25:00+00:00 0.15 \n",
"2021-09-02 02:30:00+00:00 0.00 \n",
"2021-09-02 02:35:00+00:00 0.13 \n",
"2021-09-02 02:40:00+00:00 0.25 \n",
"2021-09-02 02:45:00+00:00 0.39 \n",
"2021-09-02 02:50:00+00:00 0.31 \n",
"2021-09-02 02:55:00+00:00 0.19 \n",
"\n",
" avg_wind_speed_prop [m/s] \\\n",
"time \n",
"2021-09-02 02:00:00+00:00 2.2 \n",
"2021-09-02 02:05:00+00:00 3.0 \n",
"2021-09-02 02:10:00+00:00 3.2 \n",
"2021-09-02 02:15:00+00:00 3.6 \n",
"2021-09-02 02:20:00+00:00 2.9 \n",
"2021-09-02 02:25:00+00:00 2.9 \n",
"2021-09-02 02:30:00+00:00 2.5 \n",
"2021-09-02 02:35:00+00:00 2.6 \n",
"2021-09-02 02:40:00+00:00 2.2 \n",
"2021-09-02 02:45:00+00:00 1.7 \n",
"2021-09-02 02:50:00+00:00 1.5 \n",
"2021-09-02 02:55:00+00:00 2.1 \n",
"\n",
" max_wind_speed_prop [m/s] \\\n",
"time \n",
"2021-09-02 02:00:00+00:00 3.6 \n",
"2021-09-02 02:05:00+00:00 4.6 \n",
"2021-09-02 02:10:00+00:00 5.1 \n",
"2021-09-02 02:15:00+00:00 5.6 \n",
"2021-09-02 02:20:00+00:00 6.4 \n",
"2021-09-02 02:25:00+00:00 5.8 \n",
"2021-09-02 02:30:00+00:00 4.6 \n",
"2021-09-02 02:35:00+00:00 3.8 \n",
"2021-09-02 02:40:00+00:00 4.3 \n",
"2021-09-02 02:45:00+00:00 3.5 \n",
"2021-09-02 02:50:00+00:00 3.4 \n",
"2021-09-02 02:55:00+00:00 3.7 \n",
"\n",
" wind_speed_stddev_prop [m/s] \\\n",
"time \n",
"2021-09-02 02:00:00+00:00 0.6 \n",
"2021-09-02 02:05:00+00:00 0.8 \n",
"2021-09-02 02:10:00+00:00 0.8 \n",
"2021-09-02 02:15:00+00:00 0.7 \n",
"2021-09-02 02:20:00+00:00 0.9 \n",
"2021-09-02 02:25:00+00:00 0.9 \n",
"2021-09-02 02:30:00+00:00 0.8 \n",
"2021-09-02 02:35:00+00:00 0.5 \n",
"2021-09-02 02:40:00+00:00 0.8 \n",
"2021-09-02 02:45:00+00:00 0.7 \n",
"2021-09-02 02:50:00+00:00 0.8 \n",
"2021-09-02 02:55:00+00:00 0.8 \n",
"\n",
" wind_direction_prop [degrees] ... \\\n",
"time ... \n",
"2021-09-02 02:00:00+00:00 301.0 ... \n",
"2021-09-02 02:05:00+00:00 299.0 ... \n",
"2021-09-02 02:10:00+00:00 301.0 ... \n",
"2021-09-02 02:15:00+00:00 299.0 ... \n",
"2021-09-02 02:20:00+00:00 300.0 ... \n",
"2021-09-02 02:25:00+00:00 298.0 ... \n",
"2021-09-02 02:30:00+00:00 301.0 ... \n",
"2021-09-02 02:35:00+00:00 292.0 ... \n",
"2021-09-02 02:40:00+00:00 298.0 ... \n",
"2021-09-02 02:45:00+00:00 290.0 ... \n",
"2021-09-02 02:50:00+00:00 307.0 ... \n",
"2021-09-02 02:55:00+00:00 301.0 ... \n",
"\n",
" snow_depth [cm] frozen_soil_05cm [bit] \\\n",
"time \n",
"2021-09-02 02:00:00+00:00 NaN 0.0 \n",
"2021-09-02 02:05:00+00:00 NaN 0.0 \n",
"2021-09-02 02:10:00+00:00 NaN 0.0 \n",
"2021-09-02 02:15:00+00:00 NaN 0.0 \n",
"2021-09-02 02:20:00+00:00 NaN 0.0 \n",
"2021-09-02 02:25:00+00:00 NaN 0.0 \n",
"2021-09-02 02:30:00+00:00 NaN 0.0 \n",
"2021-09-02 02:35:00+00:00 NaN 0.0 \n",
"2021-09-02 02:40:00+00:00 NaN 0.0 \n",
"2021-09-02 02:45:00+00:00 NaN 0.0 \n",
"2021-09-02 02:50:00+00:00 NaN 0.0 \n",
"2021-09-02 02:55:00+00:00 NaN 0.0 \n",
"\n",
" frozen_soil_25cm [bit] frozen_soil_50cm [bit] \\\n",
"time \n",
"2021-09-02 02:00:00+00:00 0.0 0.0 \n",
"2021-09-02 02:05:00+00:00 0.0 0.0 \n",
"2021-09-02 02:10:00+00:00 0.0 0.0 \n",
"2021-09-02 02:15:00+00:00 0.0 0.0 \n",
"2021-09-02 02:20:00+00:00 0.0 0.0 \n",
"2021-09-02 02:25:00+00:00 0.0 0.0 \n",
"2021-09-02 02:30:00+00:00 0.0 0.0 \n",
"2021-09-02 02:35:00+00:00 0.0 0.0 \n",
"2021-09-02 02:40:00+00:00 0.0 0.0 \n",
"2021-09-02 02:45:00+00:00 0.0 0.0 \n",
"2021-09-02 02:50:00+00:00 0.0 0.0 \n",
"2021-09-02 02:55:00+00:00 0.0 0.0 \n",
"\n",
" soil_temp_05cm [degC] soil_temp_25cm [degC] \\\n",
"time \n",
"2021-09-02 02:00:00+00:00 20.2 21.1 \n",
"2021-09-02 02:05:00+00:00 20.1 21.1 \n",
"2021-09-02 02:10:00+00:00 20.1 21.1 \n",
"2021-09-02 02:15:00+00:00 20.1 21.1 \n",
"2021-09-02 02:20:00+00:00 20.1 21.1 \n",
"2021-09-02 02:25:00+00:00 20.1 21.1 \n",
"2021-09-02 02:30:00+00:00 20.1 21.1 \n",
"2021-09-02 02:35:00+00:00 20.0 21.1 \n",
"2021-09-02 02:40:00+00:00 19.9 21.1 \n",
"2021-09-02 02:45:00+00:00 19.9 21.1 \n",
"2021-09-02 02:50:00+00:00 19.9 20.9 \n",
"2021-09-02 02:55:00+00:00 19.9 20.9 \n",
"\n",
" soil_temp_50cm [degC] \\\n",
"time \n",
"2021-09-02 02:00:00+00:00 21.1 \n",
"2021-09-02 02:05:00+00:00 21.1 \n",
"2021-09-02 02:10:00+00:00 21.1 \n",
"2021-09-02 02:15:00+00:00 21.1 \n",
"2021-09-02 02:20:00+00:00 21.1 \n",
"2021-09-02 02:25:00+00:00 21.1 \n",
"2021-09-02 02:30:00+00:00 21.1 \n",
"2021-09-02 02:35:00+00:00 21.1 \n",
"2021-09-02 02:40:00+00:00 21.1 \n",
"2021-09-02 02:45:00+00:00 21.1 \n",
"2021-09-02 02:50:00+00:00 21.1 \n",
"2021-09-02 02:55:00+00:00 21.1 \n",
"\n",
" soil_moisture_05cm [m^3/m^3] \\\n",
"time \n",
"2021-09-02 02:00:00+00:00 0.25 \n",
"2021-09-02 02:05:00+00:00 0.25 \n",
"2021-09-02 02:10:00+00:00 0.25 \n",
"2021-09-02 02:15:00+00:00 0.26 \n",
"2021-09-02 02:20:00+00:00 0.26 \n",
"2021-09-02 02:25:00+00:00 0.27 \n",
"2021-09-02 02:30:00+00:00 0.27 \n",
"2021-09-02 02:35:00+00:00 0.28 \n",
"2021-09-02 02:40:00+00:00 0.28 \n",
"2021-09-02 02:45:00+00:00 0.29 \n",
"2021-09-02 02:50:00+00:00 0.29 \n",
"2021-09-02 02:55:00+00:00 0.29 \n",
"\n",
" soil_moisture_25cm [m^3/m^3] \\\n",
"time \n",
"2021-09-02 02:00:00+00:00 0.33 \n",
"2021-09-02 02:05:00+00:00 0.33 \n",
"2021-09-02 02:10:00+00:00 0.33 \n",
"2021-09-02 02:15:00+00:00 0.33 \n",
"2021-09-02 02:20:00+00:00 0.33 \n",
"2021-09-02 02:25:00+00:00 0.33 \n",
"2021-09-02 02:30:00+00:00 0.34 \n",
"2021-09-02 02:35:00+00:00 0.34 \n",
"2021-09-02 02:40:00+00:00 0.34 \n",
"2021-09-02 02:45:00+00:00 0.34 \n",
"2021-09-02 02:50:00+00:00 0.35 \n",
"2021-09-02 02:55:00+00:00 0.35 \n",
"\n",
" soil_moisture_50cm [m^3/m^3] \n",
"time \n",
"2021-09-02 02:00:00+00:00 0.19 \n",
"2021-09-02 02:05:00+00:00 0.19 \n",
"2021-09-02 02:10:00+00:00 0.19 \n",
"2021-09-02 02:15:00+00:00 0.19 \n",
"2021-09-02 02:20:00+00:00 0.19 \n",
"2021-09-02 02:25:00+00:00 0.19 \n",
"2021-09-02 02:30:00+00:00 0.19 \n",
"2021-09-02 02:35:00+00:00 0.19 \n",
"2021-09-02 02:40:00+00:00 0.19 \n",
"2021-09-02 02:45:00+00:00 0.19 \n",
"2021-09-02 02:50:00+00:00 0.19 \n",
"2021-09-02 02:55:00+00:00 0.19 \n",
"\n",
"[12 rows x 28 columns]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.loc['VOOR']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Use `loc` to select not only the station id, but also a specific time: we pass in a `tuple` to the `loc` method"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Briefly define **tuples** as opposed to **lists** in Python"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A `tuple` is a core Python class, similar to but distinct from a `list`\n",
"1. Tuples are enclosed in parentheses (); lists are enclosed in brackets []\n",
"1. Elements in a `tuple` are *immutable* (i.e. can't be changed), but `list` elements can be changed."
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"(4, 7, 9)\n",
"9\n"
]
}
],
"source": [
"a = (4,7,9) # a tuple\n",
"print (a)\n",
"print(a[2])\n",
"# this next line won't work; comment it out to see:\n",
"# a[2] = 8"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[4, 7, 9]\n",
"9\n",
"[4, 7, 8]\n"
]
}
],
"source": [
"b = [4, 7, 9]\n",
"print (b)\n",
"print (b[2])\n",
"# this will work:\n",
"b[2] = 8\n",
"print(b)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
" Tip: Some Pandas methods accept tuples as their arguments, although lists will usually work too. When in doubt, append a ? to the method call to see the documentation.
"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"What can be confusing is that just because something is enclosed in brackets, that doesn't mean it's a Python list object. In terms of Pandas, the DataFrame's `loc` method typically expects a string that's enclosed in brackets."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n",
" Tip: Note: we can pass in the requested time as a string ... it will work!
"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"temp_2m [degC] 14.30\n",
"temp_9m [degC] 14.40\n",
"relative_humidity [percent] 98.90\n",
"precip_incremental [mm] 1.51\n",
"precip_local [mm] 18.17\n",
"precip_max_intensity [mm/min] 0.39\n",
"avg_wind_speed_prop [m/s] 1.70\n",
"max_wind_speed_prop [m/s] 3.50\n",
"wind_speed_stddev_prop [m/s] 0.70\n",
"wind_direction_prop [degrees] 290.00\n",
"wind_direction_stddev_prop [degrees] 21.00\n",
"avg_wind_speed_sonic [m/s] 1.90\n",
"max_wind_speed_sonic [m/s] 3.60\n",
"wind_speed_stddev_sonic [m/s] 0.70\n",
"wind_direction_sonic [degrees] 294.00\n",
"wind_direction_stddev_sonic [degrees] 20.00\n",
"solar_insolation [W/m^2] 0.00\n",
"station_pressure [mbar] 994.87\n",
"snow_depth [cm] NaN\n",
"frozen_soil_05cm [bit] 0.00\n",
"frozen_soil_25cm [bit] 0.00\n",
"frozen_soil_50cm [bit] 0.00\n",
"soil_temp_05cm [degC] 19.90\n",
"soil_temp_25cm [degC] 21.10\n",
"soil_temp_50cm [degC] 21.10\n",
"soil_moisture_05cm [m^3/m^3] 0.29\n",
"soil_moisture_25cm [m^3/m^3] 0.34\n",
"soil_moisture_50cm [m^3/m^3] 0.19\n",
"Name: (VOOR, 2021-09-02 02:45:00+00:00), dtype: float64"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.loc[('VOOR','2021-09-02 02:45:00 UTC')]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"As above, but also select a particular column:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"14.3"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.loc[('VOOR','2021-09-02 02:45:00 UTC'),'temp_2m [degC]' ]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Taking advantage of *multi-indexing* allows us to write selection criteria that is more intuitive than how we've previously defined and utilized conditonal statements in our selection criteria (as in our cell earlier in the notebook with nested data frame object names). It also executes quicker!"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Pass in multiple columns (either as a list or a tuple) ... get back a `DataFrame` whose columns are the ones we selected"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"temp_2m [degC] 14.30\n",
"precip_incremental [mm] 1.51\n",
"Name: (VOOR, 2021-09-02 02:45:00+00:00), dtype: float64"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.loc[('VOOR','2021-09-02 02:45:00 UTC'),('temp_2m [degC]','precip_incremental [mm]')]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Pass in multiple indexes and multiple columns also returns `DataFrame`s:\n",
"1. Two outer and one inner index:"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" | \n",
" temp_2m [degC] | \n",
" precip_incremental [mm] | \n",
"
\n",
" \n",
" station | \n",
" time | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" VOOR | \n",
" 2021-09-02 02:45:00+00:00 | \n",
" 14.3 | \n",
" 1.51 | \n",
"
\n",
" \n",
" KIND | \n",
" 2021-09-02 02:45:00+00:00 | \n",
" 16.4 | \n",
" 0.75 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" temp_2m [degC] precip_incremental [mm]\n",
"station time \n",
"VOOR 2021-09-02 02:45:00+00:00 14.3 1.51\n",
"KIND 2021-09-02 02:45:00+00:00 16.4 0.75"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.loc[(('VOOR','KIND'),'2021-09-02 02:45:00 UTC'),('temp_2m [degC]','precip_incremental [mm]')]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"2. Two outer and two inner:"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" | \n",
" temp_2m [degC] | \n",
" precip_incremental [mm] | \n",
"
\n",
" \n",
" station | \n",
" time | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" VOOR | \n",
" 2021-09-02 02:45:00+00:00 | \n",
" 14.3 | \n",
" 1.51 | \n",
"
\n",
" \n",
" 2021-09-02 02:55:00+00:00 | \n",
" 14.2 | \n",
" 0.77 | \n",
"
\n",
" \n",
" KIND | \n",
" 2021-09-02 02:45:00+00:00 | \n",
" 16.4 | \n",
" 0.75 | \n",
"
\n",
" \n",
" 2021-09-02 02:55:00+00:00 | \n",
" 16.4 | \n",
" 0.56 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" temp_2m [degC] precip_incremental [mm]\n",
"station time \n",
"VOOR 2021-09-02 02:45:00+00:00 14.3 1.51\n",
" 2021-09-02 02:55:00+00:00 14.2 0.77\n",
"KIND 2021-09-02 02:45:00+00:00 16.4 0.75\n",
" 2021-09-02 02:55:00+00:00 16.4 0.56"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.loc[(('VOOR','KIND'),('2021-09-02 02:45:00 UTC','2021-09-02 02:55:00 UTC')),('temp_2m [degC]','precip_incremental [mm]')]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"3. One outer and two inner:"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" | \n",
" temp_2m [degC] | \n",
" precip_incremental [mm] | \n",
"
\n",
" \n",
" station | \n",
" time | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" VOOR | \n",
" 2021-09-02 02:45:00+00:00 | \n",
" 14.3 | \n",
" 1.51 | \n",
"
\n",
" \n",
" 2021-09-02 02:55:00+00:00 | \n",
" 14.2 | \n",
" 0.77 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" temp_2m [degC] precip_incremental [mm]\n",
"station time \n",
"VOOR 2021-09-02 02:45:00+00:00 14.3 1.51\n",
" 2021-09-02 02:55:00+00:00 14.2 0.77"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.loc[('VOOR',('2021-09-02 02:45:00 UTC','2021-09-02 02:55:00 UTC')),('temp_2m [degC]','precip_incremental [mm]')]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"A more efficient way to get all stations at a particular time is via Pandas' `xs` DataFrame method."
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" temp_2m [degC] | \n",
" temp_9m [degC] | \n",
" relative_humidity [percent] | \n",
" precip_incremental [mm] | \n",
" precip_local [mm] | \n",
" precip_max_intensity [mm/min] | \n",
" avg_wind_speed_prop [m/s] | \n",
" max_wind_speed_prop [m/s] | \n",
" wind_speed_stddev_prop [m/s] | \n",
" wind_direction_prop [degrees] | \n",
" ... | \n",
" snow_depth [cm] | \n",
" frozen_soil_05cm [bit] | \n",
" frozen_soil_25cm [bit] | \n",
" frozen_soil_50cm [bit] | \n",
" soil_temp_05cm [degC] | \n",
" soil_temp_25cm [degC] | \n",
" soil_temp_50cm [degC] | \n",
" soil_moisture_05cm [m^3/m^3] | \n",
" soil_moisture_25cm [m^3/m^3] | \n",
" soil_moisture_50cm [m^3/m^3] | \n",
"
\n",
" \n",
" station | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" ADDI | \n",
" 12.8 | \n",
" 13.0 | \n",
" 92.3 | \n",
" 0.00 | \n",
" 0.00 | \n",
" 0.00 | \n",
" 2.3 | \n",
" 5.0 | \n",
" 0.9 | \n",
" 346.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 20.3 | \n",
" 19.9 | \n",
" 0.52 | \n",
" 0.44 | \n",
" 0.44 | \n",
"
\n",
" \n",
" ANDE | \n",
" 14.2 | \n",
" 14.0 | \n",
" 99.7 | \n",
" 0.16 | \n",
" 13.74 | \n",
" 0.00 | \n",
" 3.1 | \n",
" 6.0 | \n",
" 0.9 | \n",
" 344.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.1 | \n",
" 19.1 | \n",
" 19.2 | \n",
" 0.28 | \n",
" 0.22 | \n",
" 0.14 | \n",
"
\n",
" \n",
" BATA | \n",
" 14.1 | \n",
" 16.2 | \n",
" 77.0 | \n",
" 0.00 | \n",
" 0.00 | \n",
" 0.00 | \n",
" 2.4 | \n",
" 2.6 | \n",
" 0.1 | \n",
" 309.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.9 | \n",
" 21.5 | \n",
" 21.3 | \n",
" 0.25 | \n",
" 0.21 | \n",
" 0.22 | \n",
"
\n",
" \n",
" BEAC | \n",
" 15.8 | \n",
" 15.6 | \n",
" 98.9 | \n",
" 1.01 | \n",
" 56.83 | \n",
" 0.25 | \n",
" 3.4 | \n",
" 5.5 | \n",
" 0.9 | \n",
" 35.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 17.9 | \n",
" 18.9 | \n",
" 19.5 | \n",
" 0.51 | \n",
" 0.36 | \n",
" 0.37 | \n",
"
\n",
" \n",
" BELD | \n",
" 14.1 | \n",
" 14.2 | \n",
" 93.8 | \n",
" 0.00 | \n",
" 1.16 | \n",
" 0.00 | \n",
" 3.4 | \n",
" 6.5 | \n",
" 1.2 | \n",
" 5.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.7 | \n",
" 20.1 | \n",
" 20.2 | \n",
" 0.50 | \n",
" 0.43 | \n",
" 0.41 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" WFMB | \n",
" 13.0 | \n",
" 13.6 | \n",
" 74.2 | \n",
" 0.00 | \n",
" 0.00 | \n",
" 0.00 | \n",
" 0.4 | \n",
" 2.0 | \n",
" 0.4 | \n",
" 260.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 18.3 | \n",
" 19.9 | \n",
" 19.9 | \n",
" 0.25 | \n",
" 0.18 | \n",
" 0.20 | \n",
"
\n",
" \n",
" WGAT | \n",
" 13.5 | \n",
" 13.4 | \n",
" 80.4 | \n",
" 0.00 | \n",
" 0.00 | \n",
" 0.00 | \n",
" 0.9 | \n",
" 3.3 | \n",
" 0.7 | \n",
" 22.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 18.4 | \n",
" 20.1 | \n",
" 20.7 | \n",
" 0.15 | \n",
" 0.25 | \n",
" 0.08 | \n",
"
\n",
" \n",
" WHIT | \n",
" 16.3 | \n",
" 16.3 | \n",
" 93.2 | \n",
" 0.00 | \n",
" 0.00 | \n",
" 0.00 | \n",
" 1.6 | \n",
" 3.1 | \n",
" 0.5 | \n",
" 28.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 19.1 | \n",
" 20.3 | \n",
" 20.1 | \n",
" 0.28 | \n",
" 0.47 | \n",
" 0.46 | \n",
"
\n",
" \n",
" WOLC | \n",
" 13.6 | \n",
" 16.3 | \n",
" 86.1 | \n",
" 0.00 | \n",
" 0.00 | \n",
" 0.00 | \n",
" 0.6 | \n",
" 0.9 | \n",
" 0.1 | \n",
" 317.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 21.5 | \n",
" 23.5 | \n",
" 24.2 | \n",
" 0.18 | \n",
" 0.03 | \n",
" 0.07 | \n",
"
\n",
" \n",
" YORK | \n",
" 11.6 | \n",
" 12.5 | \n",
" 97.2 | \n",
" 0.00 | \n",
" 0.00 | \n",
" 0.00 | \n",
" 0.1 | \n",
" 0.7 | \n",
" 0.2 | \n",
" 165.0 | \n",
" ... | \n",
" NaN | \n",
" 0.0 | \n",
" 0.0 | \n",
" 0.0 | \n",
" 20.3 | \n",
" 21.7 | \n",
" 21.9 | \n",
" 0.13 | \n",
" 0.24 | \n",
" 0.24 | \n",
"
\n",
" \n",
"
\n",
"
126 rows × 28 columns
\n",
"
"
],
"text/plain": [
" temp_2m [degC] temp_9m [degC] relative_humidity [percent] \\\n",
"station \n",
"ADDI 12.8 13.0 92.3 \n",
"ANDE 14.2 14.0 99.7 \n",
"BATA 14.1 16.2 77.0 \n",
"BEAC 15.8 15.6 98.9 \n",
"BELD 14.1 14.2 93.8 \n",
"... ... ... ... \n",
"WFMB 13.0 13.6 74.2 \n",
"WGAT 13.5 13.4 80.4 \n",
"WHIT 16.3 16.3 93.2 \n",
"WOLC 13.6 16.3 86.1 \n",
"YORK 11.6 12.5 97.2 \n",
"\n",
" precip_incremental [mm] precip_local [mm] \\\n",
"station \n",
"ADDI 0.00 0.00 \n",
"ANDE 0.16 13.74 \n",
"BATA 0.00 0.00 \n",
"BEAC 1.01 56.83 \n",
"BELD 0.00 1.16 \n",
"... ... ... \n",
"WFMB 0.00 0.00 \n",
"WGAT 0.00 0.00 \n",
"WHIT 0.00 0.00 \n",
"WOLC 0.00 0.00 \n",
"YORK 0.00 0.00 \n",
"\n",
" precip_max_intensity [mm/min] avg_wind_speed_prop [m/s] \\\n",
"station \n",
"ADDI 0.00 2.3 \n",
"ANDE 0.00 3.1 \n",
"BATA 0.00 2.4 \n",
"BEAC 0.25 3.4 \n",
"BELD 0.00 3.4 \n",
"... ... ... \n",
"WFMB 0.00 0.4 \n",
"WGAT 0.00 0.9 \n",
"WHIT 0.00 1.6 \n",
"WOLC 0.00 0.6 \n",
"YORK 0.00 0.1 \n",
"\n",
" max_wind_speed_prop [m/s] wind_speed_stddev_prop [m/s] \\\n",
"station \n",
"ADDI 5.0 0.9 \n",
"ANDE 6.0 0.9 \n",
"BATA 2.6 0.1 \n",
"BEAC 5.5 0.9 \n",
"BELD 6.5 1.2 \n",
"... ... ... \n",
"WFMB 2.0 0.4 \n",
"WGAT 3.3 0.7 \n",
"WHIT 3.1 0.5 \n",
"WOLC 0.9 0.1 \n",
"YORK 0.7 0.2 \n",
"\n",
" wind_direction_prop [degrees] ... snow_depth [cm] \\\n",
"station ... \n",
"ADDI 346.0 ... NaN \n",
"ANDE 344.0 ... NaN \n",
"BATA 309.0 ... NaN \n",
"BEAC 35.0 ... NaN \n",
"BELD 5.0 ... NaN \n",
"... ... ... ... \n",
"WFMB 260.0 ... NaN \n",
"WGAT 22.0 ... NaN \n",
"WHIT 28.0 ... NaN \n",
"WOLC 317.0 ... NaN \n",
"YORK 165.0 ... NaN \n",
"\n",
" frozen_soil_05cm [bit] frozen_soil_25cm [bit] \\\n",
"station \n",
"ADDI 0.0 0.0 \n",
"ANDE 0.0 0.0 \n",
"BATA 0.0 0.0 \n",
"BEAC 0.0 0.0 \n",
"BELD 0.0 0.0 \n",
"... ... ... \n",
"WFMB 0.0 0.0 \n",
"WGAT 0.0 0.0 \n",
"WHIT 0.0 0.0 \n",
"WOLC 0.0 0.0 \n",
"YORK 0.0 0.0 \n",
"\n",
" frozen_soil_50cm [bit] soil_temp_05cm [degC] soil_temp_25cm [degC] \\\n",
"station \n",
"ADDI 0.0 19.9 20.3 \n",
"ANDE 0.0 19.1 19.1 \n",
"BATA 0.0 19.9 21.5 \n",
"BEAC 0.0 17.9 18.9 \n",
"BELD 0.0 19.7 20.1 \n",
"... ... ... ... \n",
"WFMB 0.0 18.3 19.9 \n",
"WGAT 0.0 18.4 20.1 \n",
"WHIT 0.0 19.1 20.3 \n",
"WOLC 0.0 21.5 23.5 \n",
"YORK 0.0 20.3 21.7 \n",
"\n",
" soil_temp_50cm [degC] soil_moisture_05cm [m^3/m^3] \\\n",
"station \n",
"ADDI 19.9 0.52 \n",
"ANDE 19.2 0.28 \n",
"BATA 21.3 0.25 \n",
"BEAC 19.5 0.51 \n",
"BELD 20.2 0.50 \n",
"... ... ... \n",
"WFMB 19.9 0.25 \n",
"WGAT 20.7 0.15 \n",
"WHIT 20.1 0.28 \n",
"WOLC 24.2 0.18 \n",
"YORK 21.9 0.13 \n",
"\n",
" soil_moisture_25cm [m^3/m^3] soil_moisture_50cm [m^3/m^3] \n",
"station \n",
"ADDI 0.44 0.44 \n",
"ANDE 0.22 0.14 \n",
"BATA 0.21 0.22 \n",
"BEAC 0.36 0.37 \n",
"BELD 0.43 0.41 \n",
"... ... ... \n",
"WFMB 0.18 0.20 \n",
"WGAT 0.25 0.08 \n",
"WHIT 0.47 0.46 \n",
"WOLC 0.03 0.07 \n",
"YORK 0.24 0.24 \n",
"\n",
"[126 rows x 28 columns]"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Pass in one index value and what index it belongs to\n",
"df.xs('2021-09-02 02:45:00 UTC', level='time')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Include just a *list* of columns (a *tuple* won't work here)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" temp_2m [degC] | \n",
" temp_9m [degC] | \n",
"
\n",
" \n",
" station | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" ADDI | \n",
" 12.8 | \n",
" 13.0 | \n",
"
\n",
" \n",
" ANDE | \n",
" 14.2 | \n",
" 14.0 | \n",
"
\n",
" \n",
" BATA | \n",
" 14.1 | \n",
" 16.2 | \n",
"
\n",
" \n",
" BEAC | \n",
" 15.8 | \n",
" 15.6 | \n",
"
\n",
" \n",
" BELD | \n",
" 14.1 | \n",
" 14.2 | \n",
"
\n",
" \n",
" ... | \n",
" ... | \n",
" ... | \n",
"
\n",
" \n",
" WFMB | \n",
" 13.0 | \n",
" 13.6 | \n",
"
\n",
" \n",
" WGAT | \n",
" 13.5 | \n",
" 13.4 | \n",
"
\n",
" \n",
" WHIT | \n",
" 16.3 | \n",
" 16.3 | \n",
"
\n",
" \n",
" WOLC | \n",
" 13.6 | \n",
" 16.3 | \n",
"
\n",
" \n",
" YORK | \n",
" 11.6 | \n",
" 12.5 | \n",
"
\n",
" \n",
"
\n",
"
126 rows × 2 columns
\n",
"
"
],
"text/plain": [
" temp_2m [degC] temp_9m [degC]\n",
"station \n",
"ADDI 12.8 13.0\n",
"ANDE 14.2 14.0\n",
"BATA 14.1 16.2\n",
"BEAC 15.8 15.6\n",
"BELD 14.1 14.2\n",
"... ... ...\n",
"WFMB 13.0 13.6\n",
"WGAT 13.5 13.4\n",
"WHIT 16.3 16.3\n",
"WOLC 13.6 16.3\n",
"YORK 11.6 12.5\n",
"\n",
"[126 rows x 2 columns]"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.xs('2021-09-02 02:45:00 UTC', level = 'time')[['temp_2m [degC]','temp_9m [degC]']]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Work with the time index as a Datetime object:"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Pandas' handling of `datetime` objects is incredibly powerful (see https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html) . Below, we select 3 times using the `date_range` method."
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"timeRange = pd.date_range('2021-09-02 02:25', periods=3, freq='5min')"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"/tmp/ipykernel_82816/1874800629.py:1: FutureWarning: Indexing a timezone-aware DatetimeIndex with a timezone-naive datetime is deprecated and will raise KeyError in a future version. Use a timezone-aware object instead.\n",
" df.loc[('MANH',timeRange.values),('max_wind_speed_prop [m/s]', 'max_wind_speed_sonic [m/s]')]\n"
]
},
{
"data": {
"text/html": [
"\n",
"\n",
"
\n",
" \n",
" \n",
" | \n",
" | \n",
" max_wind_speed_prop [m/s] | \n",
" max_wind_speed_sonic [m/s] | \n",
"
\n",
" \n",
" station | \n",
" time | \n",
" | \n",
" | \n",
"
\n",
" \n",
" \n",
" \n",
" MANH | \n",
" 2021-09-02 02:25:00+00:00 | \n",
" 8.2 | \n",
" 10.2 | \n",
"
\n",
" \n",
" 2021-09-02 02:30:00+00:00 | \n",
" 8.7 | \n",
" 10.3 | \n",
"
\n",
" \n",
" 2021-09-02 02:35:00+00:00 | \n",
" 12.2 | \n",
" 11.5 | \n",
"
\n",
" \n",
"
\n",
"
"
],
"text/plain": [
" max_wind_speed_prop [m/s] \\\n",
"station time \n",
"MANH 2021-09-02 02:25:00+00:00 8.2 \n",
" 2021-09-02 02:30:00+00:00 8.7 \n",
" 2021-09-02 02:35:00+00:00 12.2 \n",
"\n",
" max_wind_speed_sonic [m/s] \n",
"station time \n",
"MANH 2021-09-02 02:25:00+00:00 10.2 \n",
" 2021-09-02 02:30:00+00:00 10.3 \n",
" 2021-09-02 02:35:00+00:00 11.5 "
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"df.loc[('MANH',timeRange.values),('max_wind_speed_prop [m/s]', 'max_wind_speed_sonic [m/s]')]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"---\n",
"## Summary\n",
"* One can recast a Pandas `DataFrame` so it leverages multi-indexing\n",
"* Selecting and specifying conditions on a multi-index `DataFrame` work just as in a single-index `DataFrame`\n",
"* Pandas has a large set of utilities to work with time-series based data. `xs` is one such method.\n",
"* In Python, **tuples** are akin to **lists**, but a tuple's elements are *immutable*.\n",
"\n",
"### What's Next?\n",
"We will next explore dataframes with geographic attributes, via the [GeoPandas](https://geopandas.org) package.\n",
"## Resources and References\n",
"1. [MetPy Monday Episode 96](https://www.youtube.com/watch?v=yQ5IxnZouKo&list=PLQut5OXpV-0ir4IdllSt1iEZKTwFBa7kO&index=87&ab_channel=Unidata)\n",
"1. https://www.geeksforgeeks.org/python-difference-between-list-and-tuple/\n",
"1. https://stackoverflow.com/questions/8900166/whats-the-difference-between-lists-enclosed-by-square-brackets-and-parentheses\n",
"1. [Time series in Pandas](https://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html)"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 August 2022 Environment",
"language": "python",
"name": "aug22"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.5"
}
},
"nbformat": 4,
"nbformat_minor": 4
}