Pandas 1: Introduction to Pandas


Overview

Pandas, along with Matplotlib and NumPy, forms the Great Triumvirate of the scientific Python ecosystem. Its features, as cited in https://pandas.pydata.org/about/, include:

  1. A fast and efficient DataFrame object for data manipulation with integrated indexing;

  2. Tools for reading and writing data between in-memory data structures and different formats: CSV and text files, Microsoft Excel, SQL databases, and the fast HDF5 format;

  3. Intelligent data alignment and integrated handling of missing data: gain automatic label-based alignment in computations and easily manipulate messy data into an orderly form;

  4. Flexible reshaping and pivoting of data sets;

  5. Intelligent label-based slicing, fancy indexing, and subsetting of large data sets;

  6. Columns can be inserted and deleted from data structures for size mutability;

  7. Aggregating or transforming data with a powerful group by engine allowing split-apply-combine operations on data sets;

  8. High performance merging and joining of data sets;

  9. Hierarchical axis indexing provides an intuitive way of working with high-dimensional data in a lower-dimensional data structure;

  10. Time series functionality: date range generation and frequency conversion, moving window statistics, date shifting and lagging. Even create domain-specific time offsets and join time series without losing data;

  11. Highly optimized for performance, with critical code paths written in Cython or C.

  12. Python with pandas is in use in a wide variety of academic and commercial domains, including Finance, Neuroscience, Economics, Statistics, Advertising, Web Analytics, and more (such as atmospheric science!).

Prerequisites

Concepts        Importance   Notes
Python basics   Necessary
NumPy basics    Helpful

  • Time to learn: 30 minutes


Imports

To begin using Pandas, simply import it. You will often see the nickname pd used as an abbreviation for pandas in the import statement, just like numpy is often imported as np.

import pandas as pd

Typically, one uses Pandas to read from and/or write to files containing tabular data, e.g., text files consisting of rows and columns. For this notebook, let’s use a file containing NYS Mesonet (NYSM) data from 0200 UTC 2 September 2021.

First, let’s view the first and last five lines of this data file as if we were using the Linux command-line interface.

Tip: In a Jupyter notebook, you can invoke Linux commands by prepending each Linux command with a !
# Directly run the Linux `head` and `tail` commands to display the first five lines and last five lines from the data file.
dataFile = '/spare11/atm533/data/nysm_data_2021090202.csv'
!head -5 {dataFile}
!echo .
!echo .
!echo .
!tail -5 {dataFile}
station,time,temp_2m [degC],temp_9m [degC],relative_humidity [percent],precip_incremental [mm],precip_local [mm],precip_max_intensity [mm/min],avg_wind_speed_prop [m/s],max_wind_speed_prop [m/s],wind_speed_stddev_prop [m/s],wind_direction_prop [degrees],wind_direction_stddev_prop [degrees],avg_wind_speed_sonic [m/s],max_wind_speed_sonic [m/s],wind_speed_stddev_sonic [m/s],wind_direction_sonic [degrees],wind_direction_stddev_sonic [degrees],solar_insolation [W/m^2],station_pressure [mbar],snow_depth [cm],frozen_soil_05cm [bit],frozen_soil_25cm [bit],frozen_soil_50cm [bit],soil_temp_05cm [degC],soil_temp_25cm [degC],soil_temp_50cm [degC],soil_moisture_05cm [m^3/m^3],soil_moisture_25cm [m^3/m^3],soil_moisture_50cm [m^3/m^3]

ADDI,2021-09-02 02:00:00 UTC,13.3,13.4,92.5,0.00,0.00,0.00,2.6,5.8,0.9,344,16,3.0,6.2,1.0,346,13,0,951.47,,0,0,0,19.9,20.3,19.9,0.52,0.44,0.44

ANDE,2021-09-02 02:00:00 UTC,14.0,13.7,100.0,0.28,10.67,0.00,1.9,2.6,0.4,356,16,2.2,3.3,0.5,355,14,0,947.40,,0,0,0,19.1,19.1,19.3,0.25,0.21,0.14

BATA,2021-09-02 02:00:00 UTC,14.8,16.3,77.5,0.00,0.00,0.00,1.9,2.3,0.2,310,5,2.0,2.5,0.2,314,4,0,979.93,,0,0,0,20.5,21.5,21.3,0.25,0.21,0.22

BEAC,2021-09-02 02:00:00 UTC,16.0,15.9,98.6,1.93,37.53,0.48,3.2,6.5,1.4,25,17,3.7,7.5,1.4,24,17,0,994.69,,0,0,0,18.4,19.3,19.8,0.51,0.36,0.37
.
.
.
WFMB,2021-09-02 02:00:00 UTC,12.9,13.6,75.0,0.00,0.00,0.00,0.8,1.5,0.3,253,26,0.8,2.0,0.3,250,27,0,941.26,,0,0,0,18.8,19.9,19.9,0.24,0.18,0.20

WGAT,2021-09-02 02:00:00 UTC,13.8,13.8,79.5,0.00,0.00,0.00,1.4,4.0,1.0,8,33,1.8,4.6,1.0,11,34,0,959.46,,0,0,0,18.8,20.2,20.7,0.16,0.25,0.08

WHIT,2021-09-02 02:00:00 UTC,15.7,15.8,95.9,0.00,0.00,0.00,1.4,2.6,0.4,342,19,1.6,3.2,0.6,348,18,1,1006.29,,0,0,0,19.3,20.3,20.1,0.28,0.47,0.46

WOLC,2021-09-02 02:00:00 UTC,14.0,16.6,84.6,0.00,0.00,0.00,0.4,0.9,0.2,350,19,0.6,1.4,0.3,353,20,0,996.90,,0,0,0,21.9,23.5,24.2,0.18,0.03,0.07

YORK,2021-09-02 02:00:00 UTC,12.0,13.9,96.0,0.00,0.00,0.00,0.0,0.3,0.1,274,0,0.2,0.6,0.2,244,22,0,991.17,,0,0,0,20.5,21.8,21.9,0.13,0.24,0.24
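
Tip: The ! prefix works only inside Jupyter/IPython. The following is a minimal pure-Python sketch of the same first-five/last-five preview. The helper name head_tail and the small temporary demo file are hypothetical stand-ins for illustration and are not part of the original notebook.

```python
import os
import tempfile
from collections import deque
from itertools import islice


def head_tail(path, n=5):
    """Return (first n lines, last n of the remaining lines) of a text file."""
    with open(path) as f:
        first = [line.rstrip("\n") for line in islice(f, n)]
        # A deque with maxlen=n discards older lines, keeping only the final n.
        last = [line.rstrip("\n") for line in deque(f, maxlen=n)]
    return first, last


# Hypothetical 12-line demo file standing in for the real CSV.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as tmp:
    tmp.write("".join(f"row{i}\n" for i in range(12)))
    demo_path = tmp.name

first, last = head_tail(demo_path)
os.remove(demo_path)

# Mimic the head / echo . / tail layout used in the cell above.
print(*first, ".", ".", ".", *last, sep="\n")
```

To preview the real file, one would pass dataFile to head_tail instead of the demo path.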

We can see that this file has comma-separated values, hence the csv suffix in its name. A header row at the top identifies what each column contains. Then follow 126 rows, one for each of the 126 NYS Mesonet sites, in alphabetical order.

Note: Occasionally, some columns may have missing data. For an example of this, change the dataFile's file name so it references 0000 UTC Sep. 11, 2020, and then rerun the cell. Examine Wolcott's (WOLC) values. Change back to 0200 UTC 2 Sep. 2021 and re-run before you proceed!

Although there is a lot of interesting data in this file, it’s all currently in a text-based form, not terribly conducive to data analysis or visualization. Pandas to the rescue!

Let’s introduce ourselves to Pandas’ two core objects: the DataFrame and the Series.

The pandas DataFrame

… is a labeled, two-dimensional columnar structure similar to a table, Excel-like spreadsheet, or the R language’s data.frame.

The columns that make up our DataFrame can be lists, dictionaries, NumPy arrays, pandas Series, or more. Within these columns, our data can be text, numbers, dates and times, or many other data types you may have encountered in Python and NumPy. Shown here on the left in dark gray, our very first column is referred to as the Index, and it contains information characterizing each row of our DataFrame. Similar to any other column, the index can label our rows by text, numbers, datetimes (a popular one!), or more.
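
As a small illustration of the paragraph above, the following sketch builds a three-row DataFrame by hand from a list, a NumPy array, and a pandas Series, using station IDs as the Index. The object name toy_df is made up for this example; the values are copied from the first three data rows of the NYSM file shown earlier.

```python
import numpy as np
import pandas as pd

stations = ["ADDI", "ANDE", "BATA"]

toy_df = pd.DataFrame(
    {
        # A plain Python list
        "temp_2m [degC]": [13.3, 14.0, 14.8],
        # A NumPy array
        "relative_humidity [percent]": np.array([92.5, 100.0, 77.5]),
        # A pandas Series that carries its own labels
        "station_pressure [mbar]": pd.Series([951.47, 947.40, 979.93], index=stations),
    },
    index=stations,  # text labels for the rows instead of 0, 1, 2
)

print(toy_df)
print(toy_df.index)
```

Here the row labels are text (station IDs); the DataFrame read from the file later in this notebook uses the default integer labels instead.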

It turns out that a Pandas DataFrame consists of one or more Pandas Series. We’ll discuss the latter in a moment, but for now, let’s create a DataFrame from our text-based data file.

We can read the data into a Pandas DataFrame object by calling Pandas’ read_csv function, since the data file consists of comma-separated values.

df = pd.read_csv(dataFile)
Tip: We have used a generic object name, df to store the resulting DataFrame. We are free to choose any valid Python object name. For example, we could have named it nysmData21090200 (note that Python object names cannot start with a number).
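
If the data file’s path is not available on your system, the same call can be tried on an in-memory string. This is only a sketch: csv_text is an abbreviated, hand-copied subset (four of the 30 columns, three of the 126 rows) of the file shown above, and sample_df is an arbitrary name. read_csv accepts a file-like object, so io.StringIO stands in for the file on disk.

```python
import io

import pandas as pd

# Abbreviated stand-in for the NYSM file: header line plus three stations.
# The trailing comma on each data row leaves the snow_depth field empty.
csv_text = """station,time,temp_2m [degC],snow_depth [cm]
ADDI,2021-09-02 02:00:00 UTC,13.3,
ANDE,2021-09-02 02:00:00 UTC,14.0,
BATA,2021-09-02 02:00:00 UTC,14.8,
"""

sample_df = pd.read_csv(io.StringIO(csv_text))

print(sample_df)          # empty fields are read in as NaN
print(sample_df.dtypes)   # one data type per column
```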

By simply typing the name of the DataFrame object, we can see its contents displayed in a browser-friendly format. Since we passed no arguments besides the name of the csv file, the DataFrame has the following default properties:

  1. The first and last five rows and columns are displayed

  2. The column names arise from the first line in the file

  3. The row names (or more precisely, row index names) are numbered sequentially, beginning at 0.

df
station time temp_2m [degC] temp_9m [degC] relative_humidity [percent] precip_incremental [mm] precip_local [mm] precip_max_intensity [mm/min] avg_wind_speed_prop [m/s] max_wind_speed_prop [m/s] ... snow_depth [cm] frozen_soil_05cm [bit] frozen_soil_25cm [bit] frozen_soil_50cm [bit] soil_temp_05cm [degC] soil_temp_25cm [degC] soil_temp_50cm [degC] soil_moisture_05cm [m^3/m^3] soil_moisture_25cm [m^3/m^3] soil_moisture_50cm [m^3/m^3]
0 ADDI 2021-09-02 02:00:00 UTC 13.3 13.4 92.5 0.00 0.00 0.00 2.6 5.8 ... NaN 0.0 0.0 0.0 19.9 20.3 19.9 0.52 0.44 0.44
1 ANDE 2021-09-02 02:00:00 UTC 14.0 13.7 100.0 0.28 10.67 0.00 1.9 2.6 ... NaN 0.0 0.0 0.0 19.1 19.1 19.3 0.25 0.21 0.14
2 BATA 2021-09-02 02:00:00 UTC 14.8 16.3 77.5 0.00 0.00 0.00 1.9 2.3 ... NaN 0.0 0.0 0.0 20.5 21.5 21.3 0.25 0.21 0.22
3 BEAC 2021-09-02 02:00:00 UTC 16.0 15.9 98.6 1.93 37.53 0.48 3.2 6.5 ... NaN 0.0 0.0 0.0 18.4 19.3 19.8 0.51 0.36 0.37
4 BELD 2021-09-02 02:00:00 UTC 14.3 14.5 94.9 0.00 1.16 0.00 2.6 4.8 ... NaN 0.0 0.0 0.0 19.7 20.1 20.2 0.50 0.43 0.41
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
121 WFMB 2021-09-02 02:00:00 UTC 12.9 13.6 75.0 0.00 0.00 0.00 0.8 1.5 ... NaN 0.0 0.0 0.0 18.8 19.9 19.9 0.24 0.18 0.20
122 WGAT 2021-09-02 02:00:00 UTC 13.8 13.8 79.5 0.00 0.00 0.00 1.4 4.0 ... NaN 0.0 0.0 0.0 18.8 20.2 20.7 0.16 0.25 0.08
123 WHIT 2021-09-02 02:00:00 UTC 15.7 15.8 95.9 0.00 0.00 0.00 1.4 2.6 ... NaN 0.0 0.0 0.0 19.3 20.3 20.1 0.28 0.47 0.46
124 WOLC 2021-09-02 02:00:00 UTC 14.0 16.6 84.6 0.00 0.00 0.00 0.4 0.9 ... NaN 0.0 0.0 0.0 21.9 23.5 24.2 0.18 0.03 0.07
125 YORK 2021-09-02 02:00:00 UTC 12.0 13.9 96.0 0.00 0.00 0.00 0.0 0.3 ... NaN 0.0 0.0 0.0 20.5 21.8 21.9 0.13 0.24 0.24

126 rows × 30 columns

Pandas allows us to use its set_option function to override the default display settings. Let’s use it so we see all rows and columns.

pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)
df
station time temp_2m [degC] temp_9m [degC] relative_humidity [percent] precip_incremental [mm] precip_local [mm] precip_max_intensity [mm/min] avg_wind_speed_prop [m/s] max_wind_speed_prop [m/s] wind_speed_stddev_prop [m/s] wind_direction_prop [degrees] wind_direction_stddev_prop [degrees] avg_wind_speed_sonic [m/s] max_wind_speed_sonic [m/s] wind_speed_stddev_sonic [m/s] wind_direction_sonic [degrees] wind_direction_stddev_sonic [degrees] solar_insolation [W/m^2] station_pressure [mbar] snow_depth [cm] frozen_soil_05cm [bit] frozen_soil_25cm [bit] frozen_soil_50cm [bit] soil_temp_05cm [degC] soil_temp_25cm [degC] soil_temp_50cm [degC] soil_moisture_05cm [m^3/m^3] soil_moisture_25cm [m^3/m^3] soil_moisture_50cm [m^3/m^3]
0 ADDI 2021-09-02 02:00:00 UTC 13.3 13.4 92.5 0.00 0.00 0.00 2.6 5.8 0.9 344.0 16.0 3.0 6.2 1.0 346.0 13.0 0 951.47 NaN 0.0 0.0 0.0 19.9 20.3 19.9 0.52 0.44 0.44
1 ANDE 2021-09-02 02:00:00 UTC 14.0 13.7 100.0 0.28 10.67 0.00 1.9 2.6 0.4 356.0 16.0 2.2 3.3 0.5 355.0 14.0 0 947.40 NaN 0.0 0.0 0.0 19.1 19.1 19.3 0.25 0.21 0.14
2 BATA 2021-09-02 02:00:00 UTC 14.8 16.3 77.5 0.00 0.00 0.00 1.9 2.3 0.2 310.0 5.0 2.0 2.5 0.2 314.0 4.0 0 979.93 NaN 0.0 0.0 0.0 20.5 21.5 21.3 0.25 0.21 0.22
3 BEAC 2021-09-02 02:00:00 UTC 16.0 15.9 98.6 1.93 37.53 0.48 3.2 6.5 1.4 25.0 17.0 3.7 7.5 1.4 24.0 17.0 0 994.69 NaN 0.0 0.0 0.0 18.4 19.3 19.8 0.51 0.36 0.37
4 BELD 2021-09-02 02:00:00 UTC 14.3 14.5 94.9 0.00 1.16 0.00 2.6 4.8 0.7 38.0 24.0 2.8 5.0 0.8 39.0 24.0 0 954.02 NaN 0.0 0.0 0.0 19.7 20.1 20.2 0.50 0.43 0.41
5 BELL 2021-09-02 02:00:00 UTC 13.6 14.2 85.8 0.00 0.00 0.00 2.2 3.6 0.4 56.0 9.0 2.4 3.6 0.4 53.0 7.0 0 994.13 NaN 0.0 0.0 0.0 20.1 20.7 20.7 0.29 0.30 0.38
6 BELM 2021-09-02 02:00:00 UTC 11.3 11.7 97.8 0.00 0.00 0.00 NaN NaN NaN 0.0 0.0 0.1 0.4 0.1 114.0 5.0 1 963.45 NaN 0.0 0.0 0.0 19.7 20.5 20.3 0.38 0.29 0.30
7 BERK 2021-09-02 02:00:00 UTC 15.4 15.5 93.8 0.00 0.00 0.00 3.1 5.6 0.7 334.0 16.0 3.4 6.6 0.9 335.0 15.0 1 963.65 NaN 0.0 0.0 0.0 19.1 19.9 19.7 0.50 0.45 0.45
8 BING 2021-09-02 02:00:00 UTC 14.1 14.0 99.6 0.00 0.10 0.00 2.3 5.0 1.3 3.0 24.0 2.7 5.4 1.4 11.0 19.0 0 947.08 NaN 0.0 0.0 0.0 18.8 20.1 20.3 0.23 0.17 0.32
9 BKLN 2021-09-02 02:00:00 UTC 19.9 20.0 97.9 3.21 75.19 0.93 3.4 9.5 2.3 350.0 31.0 4.2 11.8 2.8 358.0 28.0 0 995.02 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
10 BRAN 2021-09-02 02:00:00 UTC 11.7 13.4 93.8 0.00 0.00 0.00 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0.1 148.0 5.0 0 986.88 NaN 0.0 0.0 0.0 20.5 21.1 21.1 0.18 0.19 0.29
11 BREW 2021-09-02 02:00:00 UTC 16.2 16.5 99.7 1.01 26.89 0.27 4.9 9.8 1.7 28.0 23.0 5.5 9.9 1.8 30.0 21.0 0 981.90 NaN 0.0 0.0 0.0 20.3 20.3 20.3 0.41 0.33 0.35
12 BROC 2021-09-02 02:00:00 UTC 13.8 16.7 87.2 0.00 0.00 0.00 0.9 1.2 0.1 237.0 15.0 0.9 1.4 0.2 243.0 17.0 0 991.82 NaN 0.0 0.0 0.0 20.7 21.9 21.7 0.17 0.16 0.18
13 BRON 2021-09-02 02:00:00 UTC 18.0 17.8 99.0 6.86 110.35 1.84 6.8 11.5 2.0 17.0 20.0 7.7 12.8 2.1 19.0 17.0 0 993.97 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
14 BROO 2021-09-02 02:00:00 UTC 13.3 13.2 97.6 0.00 0.00 0.00 1.5 3.6 1.0 3.0 27.0 2.0 5.0 1.1 6.0 31.0 0 951.42 NaN 0.0 0.0 0.0 19.5 19.9 19.5 0.53 0.47 0.34
15 BSPA 2021-09-02 02:00:00 UTC 14.9 14.9 96.5 0.00 2.48 0.00 0.5 1.4 0.4 108.0 32.0 0.8 1.8 0.4 102.0 35.0 0 995.70 NaN 0.0 0.0 0.0 19.7 21.7 21.3 0.49 0.39 0.39
16 BUFF 2021-09-02 02:00:00 UTC 11.8 14.4 97.8 0.00 0.00 0.00 0.2 0.6 0.2 332.0 16.0 0.6 1.0 0.2 344.0 24.0 0 991.53 NaN 0.0 0.0 0.0 20.1 21.1 21.1 0.21 0.20 0.27
17 BURD 2021-09-02 02:00:00 UTC 13.3 13.5 92.6 0.00 0.00 0.00 2.0 6.4 1.4 14.0 30.0 2.4 6.6 1.5 14.0 22.0 0 944.80 NaN 0.0 0.0 0.0 19.3 19.5 19.3 0.29 0.24 0.28
18 BURT 2021-09-02 02:00:00 UTC 11.5 14.7 96.8 0.00 0.00 0.00 0.5 0.8 0.1 184.0 14.0 0.6 0.8 0.1 194.0 20.0 0 1001.10 NaN 0.0 0.0 0.0 21.1 21.8 21.8 0.15 0.17 0.14
19 CAMD 2021-09-02 02:00:00 UTC NaN 15.5 79.5 0.00 0.00 0.00 0.6 1.1 0.2 318.0 21.0 0.7 1.3 0.2 322.0 21.0 0 992.91 NaN 0.0 0.0 0.0 20.3 20.9 21.3 0.26 0.18 0.27
20 CAPE 2021-09-02 02:00:00 UTC 10.8 14.3 92.7 0.00 0.00 0.00 1.1 1.3 0.1 40.0 3.0 1.1 1.4 0.2 41.0 2.0 0 1001.73 NaN 0.0 0.0 0.0 18.4 18.6 17.7 0.14 0.23 0.15
21 CHAZ 2021-09-02 02:00:00 UTC 13.4 16.0 92.3 0.00 0.00 0.00 1.6 2.2 0.2 305.0 10.0 1.7 2.5 0.2 308.0 9.0 0 1004.86 NaN 0.0 0.0 0.0 18.9 20.4 19.9 0.19 0.12 0.13
22 CHES 2021-09-02 02:00:00 UTC 14.5 14.6 100.0 0.00 0.00 0.00 0.8 1.4 0.5 311.0 10.0 0.8 1.7 0.5 316.0 15.0 0 972.13 NaN 0.0 0.0 0.0 18.0 18.8 18.8 0.13 0.20 0.15
23 CINC 2021-09-02 02:00:00 UTC 14.7 14.7 94.6 0.00 0.00 0.00 2.1 4.3 0.8 16.0 21.0 2.5 5.2 0.9 22.0 19.0 0 960.11 NaN 0.0 0.0 0.0 18.8 19.7 19.7 0.32 0.36 0.23
24 CLAR 2021-09-02 02:00:00 UTC 13.9 14.0 97.9 0.52 13.89 0.11 0.8 2.0 0.5 2.0 28.0 1.0 1.8 0.4 5.0 26.0 0 939.43 NaN 0.0 0.0 0.0 17.9 19.1 19.3 0.22 0.22 0.15
25 CLIF 2021-09-02 02:00:00 UTC 13.7 15.2 89.9 0.00 0.00 0.00 1.1 1.7 0.2 308.0 15.0 1.2 1.9 0.3 310.0 13.0 0 988.85 NaN 0.0 0.0 0.0 20.7 21.3 21.3 0.23 0.21 0.25
26 CLYM 2021-09-02 02:00:00 UTC 11.2 13.3 NaN 0.00 0.00 0.00 1.2 1.4 0.1 310.0 4.0 1.2 1.5 0.1 314.0 4.0 0 960.61 NaN 0.0 0.0 0.0 20.1 20.9 20.7 0.31 0.27 0.13
27 COBL 2021-09-02 02:00:00 UTC 14.7 14.5 96.3 0.00 4.02 0.00 2.2 4.4 0.7 46.0 14.0 2.4 4.5 0.7 51.0 13.0 0 971.09 NaN 0.0 0.0 0.0 19.9 20.7 20.5 0.23 0.24 0.29
28 COHO 2021-09-02 02:00:00 UTC 12.6 13.0 81.1 0.00 0.00 0.00 3.4 4.9 0.6 342.0 10.0 3.8 5.5 0.7 346.0 8.0 0 942.30 NaN 0.0 0.0 0.0 18.8 19.4 18.9 0.28 0.29 0.35
29 COLD 2021-09-02 02:00:00 UTC 14.0 14.2 87.9 0.00 0.00 0.00 2.1 2.7 0.3 17.0 13.0 2.3 3.3 0.4 15.0 8.0 0 959.80 NaN 0.0 0.0 0.0 19.7 20.5 20.7 0.12 0.13 0.12
30 COPA 2021-09-02 02:00:00 UTC 15.6 15.7 NaN NaN NaN NaN 4.0 8.1 1.3 50.0 20.0 4.6 6.9 1.3 48.0 15.0 0 NaN NaN 0.0 0.0 0.0 18.8 19.1 18.8 0.47 0.34 0.36
31 COPE 2021-09-02 02:00:00 UTC 12.2 12.3 94.4 0.00 0.00 0.00 0.8 1.7 0.4 252.0 22.0 1.0 2.0 0.4 257.0 21.0 0 982.49 NaN 0.0 0.0 NaN 19.2 19.5 NaN 0.45 0.26 NaN
32 CROG 2021-09-02 02:00:00 UTC 12.3 12.4 92.2 0.00 0.00 0.00 0.2 1.3 0.3 353.0 23.0 0.5 1.8 0.3 355.0 24.0 0 963.02 NaN 0.0 0.0 0.0 19.5 20.5 20.1 0.10 0.12 0.08
33 CSQR 2021-09-02 02:00:00 UTC 13.9 15.0 83.9 0.00 0.00 0.00 0.5 1.3 0.3 333.0 41.0 0.7 1.6 0.3 331.0 39.0 1 994.69 NaN 0.0 0.0 0.0 19.5 19.9 19.9 0.26 0.20 0.27
34 DELE 2021-09-02 02:00:00 UTC 11.3 12.0 93.0 0.00 0.00 0.00 2.5 2.8 0.1 339.0 3.0 2.8 3.2 0.2 349.0 3.0 1 940.90 NaN 0.0 0.0 0.0 20.1 20.1 20.1 0.44 0.44 0.46
35 DEPO 2021-09-02 02:00:00 UTC 16.0 16.1 90.1 0.05 3.17 0.00 2.5 5.8 1.5 1.0 28.0 3.0 6.6 1.7 356.0 25.0 0 971.43 NaN 0.0 0.0 NaN 19.9 20.7 NaN 0.17 0.27 NaN
36 DOVE 2021-09-02 02:00:00 UTC 16.0 NaN 95.2 1.35 24.69 0.36 2.1 4.1 0.8 1.0 34.0 2.5 5.2 1.0 5.0 31.0 0 990.72 NaN 0.0 0.0 0.0 19.6 21.5 21.5 0.39 0.22 0.15
37 DUAN 2021-09-02 02:00:00 UTC 13.2 13.1 100.0 0.36 6.11 0.00 0.9 2.5 0.5 48.0 28.0 1.3 3.8 0.7 47.0 24.0 0 961.08 NaN 0.0 0.0 0.0 19.5 20.2 20.1 0.55 0.38 0.42
38 EAUR 2021-09-02 02:00:00 UTC 10.6 12.9 97.0 0.00 0.00 0.00 0.1 0.4 0.1 115.0 0.0 0.2 0.6 0.2 124.0 19.0 0 968.56 NaN 0.0 0.0 0.0 20.2 20.3 20.1 0.33 0.41 0.31
39 EDIN 2021-09-02 02:00:00 UTC 14.8 14.8 97.9 0.00 0.00 0.00 0.6 1.7 0.4 3.0 25.0 0.7 1.8 0.4 359.0 25.0 0 972.19 NaN 0.0 0.0 0.0 19.9 20.9 21.1 0.13 0.14 0.10
40 EDWA 2021-09-02 02:00:00 UTC 10.3 12.1 98.2 0.00 0.00 0.00 0.5 0.6 0.1 29.0 3.0 0.6 0.8 0.1 26.0 4.0 0 987.28 NaN 0.0 0.0 0.0 17.5 17.9 17.9 0.24 0.36 0.29
41 ELDR 2021-09-02 02:00:00 UTC 14.9 NaN 98.6 0.64 16.78 0.16 0.9 2.3 0.4 96.0 39.0 1.1 2.5 0.5 105.0 39.0 0 964.96 NaN 0.0 0.0 0.0 19.1 20.1 20.7 0.38 0.52 0.51
42 ELLE 2021-09-02 02:00:00 UTC 11.8 13.2 90.6 0.00 0.00 0.00 1.8 2.3 0.3 262.0 7.0 2.0 2.8 0.3 261.0 6.0 0 977.08 NaN 0.0 0.0 0.0 17.6 17.6 17.5 0.27 0.20 0.20
43 ELMI 2021-09-02 02:00:00 UTC 15.2 15.5 89.7 0.00 0.00 0.00 2.0 3.3 0.4 342.0 18.0 2.4 3.9 0.5 349.0 14.0 0 971.06 NaN 0.0 0.0 0.0 21.9 21.5 21.3 0.28 0.34 0.43
44 ESSX 2021-09-02 02:00:00 UTC 15.1 15.6 95.1 0.00 0.00 0.00 0.3 0.9 0.3 256.0 12.0 0.6 1.2 0.3 255.0 15.0 0 1004.93 NaN 0.0 0.0 0.0 20.9 21.1 20.7 0.30 0.40 0.35
45 FAYE 2021-09-02 02:00:00 UTC 15.7 16.1 85.9 0.00 0.00 0.00 1.4 1.9 0.2 35.0 10.0 NaN NaN NaN NaN NaN 0 989.48 NaN 0.0 0.0 NaN 20.7 21.3 NaN 0.22 0.23 NaN
46 FRED 2021-09-02 02:00:00 UTC 14.9 16.3 73.7 0.00 0.00 0.00 1.8 2.3 0.2 106.0 7.0 2.0 2.7 0.3 107.0 7.0 0 984.52 NaN 0.0 0.0 0.0 20.9 21.1 20.9 0.22 0.30 0.31
47 GABR 2021-09-02 02:00:00 UTC 11.3 11.9 92.5 0.00 0.00 0.00 0.4 0.7 0.2 10.0 0.0 0.5 1.2 0.2 23.0 12.0 0 950.19 NaN 0.0 0.0 0.0 17.9 17.5 17.6 0.17 0.31 0.33
48 GFAL 2021-09-02 02:00:00 UTC 16.0 16.2 93.9 0.00 0.00 0.00 1.3 1.8 0.2 43.0 14.0 1.5 2.2 0.3 42.0 13.0 0 998.84 NaN 0.0 0.0 0.0 20.5 21.3 21.1 0.42 0.44 0.45
49 GFLD 2021-09-02 02:00:00 UTC 13.3 14.5 88.9 0.00 0.00 0.00 1.1 1.5 0.2 317.0 19.0 1.1 1.6 0.2 315.0 19.0 0 981.42 NaN 0.0 0.0 0.0 18.9 19.3 18.9 0.22 0.32 0.30
50 GROT 2021-09-02 02:00:00 UTC 15.1 15.5 89.4 0.00 0.00 0.00 3.5 6.5 0.9 356.0 15.0 3.9 7.1 1.0 2.0 14.0 0 964.52 NaN 0.0 0.0 0.0 18.8 18.9 18.8 0.28 0.23 0.17
51 GROV 2021-09-02 02:00:00 UTC 12.2 12.6 86.1 0.00 0.00 0.00 NaN NaN NaN NaN NaN 2.2 4.3 0.7 357.0 12.0 0 941.71 NaN 0.0 0.0 NaN 19.3 19.5 NaN 0.33 0.32 NaN
52 HAMM 2021-09-02 02:00:00 UTC 12.3 16.8 95.6 0.00 0.00 0.00 0.0 0.2 0.1 320.0 0.0 0.2 0.7 0.3 9.0 7.0 0 998.49 NaN 0.0 0.0 0.0 19.1 19.4 19.1 0.15 0.25 0.27
53 HARP 2021-09-02 02:00:00 UTC 13.4 13.6 96.6 0.18 10.54 0.00 3.1 5.7 1.2 80.0 29.0 3.6 7.0 1.4 81.0 24.0 0 948.19 NaN 0.0 0.0 0.0 18.6 19.3 19.0 0.31 0.28 0.52
54 HARR 2021-09-02 02:00:00 UTC 11.8 12.9 77.1 0.00 0.00 0.00 1.3 2.3 0.4 42.0 15.0 1.3 2.8 0.4 43.0 13.0 0 954.45 NaN 0.0 0.0 0.0 18.9 18.9 18.8 0.38 0.35 0.41
55 HART 2021-09-02 02:00:00 UTC 11.3 11.6 91.8 0.00 0.00 0.00 4.6 6.7 0.7 359.0 12.0 5.2 7.2 0.9 5.0 7.0 0 927.71 NaN 0.0 0.0 0.0 18.8 18.9 18.9 0.32 0.35 0.44
56 HERK 2021-09-02 02:00:00 UTC 14.4 14.6 98.9 0.00 0.00 0.00 1.5 2.0 0.3 270.0 8.0 1.7 2.4 0.3 270.0 8.0 0 984.55 NaN 0.0 0.0 0.0 19.5 20.3 20.5 0.17 0.28 0.41
57 HFAL 2021-09-02 02:00:00 UTC 15.9 15.9 99.5 0.86 21.89 0.20 4.3 7.5 1.3 19.0 22.0 4.8 8.5 1.3 20.0 18.0 0 983.12 NaN 0.0 0.0 0.0 19.7 21.1 22.1 0.47 0.35 0.35
58 ILAK 2021-09-02 02:00:00 UTC 12.0 12.7 98.2 0.00 0.00 0.00 0.4 0.9 0.2 14.0 14.0 0.6 1.2 0.2 11.0 18.0 0 953.97 NaN 0.0 0.0 0.0 17.3 17.7 17.9 0.24 0.13 0.38
59 JOHN 2021-09-02 02:00:00 UTC 14.3 14.4 96.0 0.00 0.44 0.00 1.9 2.9 0.5 19.0 29.0 2.1 3.5 0.7 13.0 25.0 0 983.14 NaN 0.0 0.0 0.0 20.3 20.3 19.9 0.42 0.38 0.33
60 JORD 2021-09-02 02:00:00 UTC 15.7 16.2 73.6 0.00 0.00 0.00 1.3 2.5 0.5 329.0 17.0 1.4 2.8 0.6 332.0 16.0 0 996.35 NaN 0.0 0.0 0.0 20.5 21.7 21.7 0.17 0.22 0.25
61 KIND 2021-09-02 02:00:00 UTC 16.7 16.7 98.6 0.73 9.75 0.19 2.3 5.1 1.0 358.0 16.0 2.6 6.0 1.1 356.0 14.0 0 999.45 NaN 0.0 0.0 0.0 21.3 21.9 21.7 0.46 0.41 0.44
62 LAUR 2021-09-02 02:00:00 UTC 13.9 13.8 96.5 0.00 0.28 0.00 1.3 3.7 0.8 359.0 58.0 1.8 4.6 0.9 350.0 58.0 0 951.51 NaN 0.0 0.0 0.0 18.6 18.9 18.6 0.39 0.43 0.31
63 LOUI 2021-09-02 02:00:00 UTC 13.2 14.2 96.0 0.00 0.00 0.00 0.0 0.3 0.1 166.0 0.0 0.4 0.7 0.2 120.0 18.0 0 1001.70 NaN 0.0 0.0 0.0 18.2 17.2 18.0 0.16 0.12 0.12
64 MALO 2021-09-02 02:00:00 UTC 13.0 13.7 89.5 0.00 0.00 0.00 2.3 2.7 0.1 172.0 3.0 2.6 3.1 0.2 177.0 3.0 0 984.16 NaN 0.0 0.0 0.0 17.3 17.5 17.3 0.25 0.25 0.29
65 MANH 2021-09-02 02:00:00 UTC 18.7 NaN 100.0 4.64 111.03 1.22 5.4 9.5 2.0 352.0 24.0 6.6 12.5 2.2 352.0 18.0 0 989.30 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
66 MEDI 2021-09-02 02:00:00 UTC 12.2 13.4 94.3 0.00 0.00 0.00 0.3 0.5 0.1 191.0 0.0 0.5 0.6 0.1 198.0 16.0 0 993.19 NaN 0.0 0.0 0.0 20.1 21.5 21.5 0.22 0.24 0.30
67 MEDU 2021-09-02 02:00:00 UTC 14.2 14.2 94.4 0.56 10.39 0.15 3.1 5.4 1.1 21.0 20.0 3.5 7.5 1.3 25.0 20.0 0 964.00 NaN 0.0 0.0 0.0 17.7 19.1 18.8 0.39 0.37 0.38
68 MORR 2021-09-02 02:00:00 UTC 14.5 14.5 95.7 0.00 0.00 0.00 1.9 3.1 0.5 335.0 15.0 2.2 3.8 0.6 335.0 14.0 0 965.57 NaN 0.0 0.0 0.0 17.9 18.4 18.6 0.21 0.27 0.27
69 NBRA 2021-09-02 02:00:00 UTC 14.8 14.9 94.0 0.27 8.95 0.00 1.4 3.2 0.8 298.0 29.0 1.7 3.6 0.8 296.0 30.0 0 953.34 NaN 0.0 0.0 0.0 18.9 20.7 20.5 0.44 0.43 0.50
70 NEWC 2021-09-02 02:00:00 UTC 11.5 11.8 98.2 0.00 0.00 0.00 0.5 1.0 0.3 216.0 7.0 0.6 1.2 0.3 216.0 9.0 0 953.41 NaN 0.0 0.0 0.0 19.1 18.8 18.0 0.30 0.28 0.17
71 NHUD 2021-09-02 02:00:00 UTC 14.9 14.8 83.5 0.00 0.00 0.00 1.6 3.0 0.6 45.0 11.0 1.8 3.3 0.6 43.0 10.0 0 976.34 NaN 0.0 0.0 0.0 21.7 23.0 22.9 0.06 0.05 0.06
72 OLDF 2021-09-02 02:00:00 UTC 11.9 12.5 89.5 0.00 0.00 0.00 0.6 1.5 0.4 357.0 24.0 1.0 2.2 0.5 353.0 24.0 0 949.52 NaN 0.0 0.0 0.0 17.9 18.6 18.2 0.46 0.51 0.46
73 OLEA 2021-09-02 02:00:00 UTC 13.0 14.0 83.8 0.00 0.00 0.00 0.8 1.4 0.3 285.0 20.0 0.9 1.7 0.3 289.0 18.0 1 959.02 NaN 0.0 0.0 0.0 19.5 20.3 19.9 0.35 0.41 0.35
74 ONTA 2021-09-02 02:00:00 UTC 18.2 19.5 63.8 0.00 0.00 0.00 2.1 4.4 0.8 26.0 14.0 2.2 4.4 0.8 23.0 15.0 0 999.85 NaN 0.0 0.0 0.0 19.8 21.5 21.7 0.19 0.21 0.20
75 OPPE 2021-09-02 02:00:00 UTC 13.9 14.1 100.0 0.00 0.00 0.00 0.5 1.1 0.2 302.0 8.0 0.7 1.1 0.2 304.0 8.0 0 968.07 NaN 0.0 0.0 0.0 19.7 20.3 20.3 0.33 0.39 0.23
76 OSCE 2021-09-02 02:00:00 UTC 14.4 14.4 72.1 0.00 0.00 0.00 0.7 1.7 0.5 81.0 47.0 1.0 2.6 0.5 98.0 48.0 0 971.95 NaN 0.0 0.0 0.0 19.7 20.1 20.3 0.23 0.20 0.25
77 OSWE 2021-09-02 02:00:00 UTC 14.5 17.7 83.8 0.00 0.00 0.00 0.9 2.1 0.3 75.0 18.0 1.0 2.3 0.4 74.0 19.0 0 1000.06 NaN 0.0 0.0 0.0 20.1 21.1 21.3 0.27 0.11 0.21
78 OTIS 2021-09-02 02:00:00 UTC 15.7 15.6 98.8 0.88 28.49 0.19 1.1 2.6 0.6 11.0 43.0 1.4 3.2 0.7 7.0 41.0 0 978.12 NaN 0.0 0.0 NaN 18.4 20.4 NaN 0.34 0.28 NaN
79 OWEG 2021-09-02 02:00:00 UTC 14.7 14.6 97.8 0.00 0.00 0.00 2.8 5.2 1.2 346.0 22.0 3.1 6.7 1.3 351.0 19.0 0 956.93 NaN 0.0 0.0 0.0 21.7 20.1 19.8 0.44 0.41 0.35
80 PENN 2021-09-02 02:00:00 UTC 15.2 15.8 79.5 0.00 0.00 0.00 3.9 5.6 0.8 352.0 13.0 4.3 6.4 0.9 355.0 9.0 0 981.48 NaN 0.0 0.0 0.0 19.7 21.5 21.5 0.19 0.22 0.30
81 PHIL 2021-09-02 02:00:00 UTC NaN 12.6 98.7 0.00 0.00 0.00 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 996.28 NaN 0.0 0.0 0.0 18.6 19.3 18.9 0.24 0.35 0.36
82 PISE 2021-09-02 02:00:00 UTC 12.5 12.7 94.6 0.00 0.00 0.00 0.8 1.3 0.2 70.0 26.0 1.0 1.8 0.3 67.0 23.0 0 948.76 NaN 0.0 0.0 0.0 17.2 17.3 17.2 0.40 0.39 0.54
83 POTS 2021-09-02 02:00:00 UTC 13.5 14.2 92.6 0.00 0.00 0.00 0.4 0.8 0.2 77.0 17.0 0.6 1.2 0.3 77.0 20.0 0 996.59 NaN 0.0 0.0 0.0 19.9 20.3 20.3 0.28 0.30 0.28
84 QUEE 2021-09-02 02:00:00 UTC 19.0 NaN 98.0 11.95 63.43 2.68 10.0 16.5 2.5 34.0 21.0 9.5 17.0 2.4 37.0 20.0 0 992.56 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
85 RAND 2021-09-02 02:00:00 UTC 12.2 12.8 93.6 0.00 0.00 0.00 0.0 0.2 0.1 2.0 0.0 0.3 0.8 0.2 313.0 36.0 1 960.59 NaN 0.0 0.0 0.0 20.9 20.9 20.7 0.34 0.21 0.22
86 RAQU 2021-09-02 02:00:00 UTC 12.4 12.5 91.5 0.00 0.00 0.00 0.0 0.4 0.0 133.0 3.0 0.0 0.5 0.1 123.0 15.0 0 949.24 NaN 0.0 0.0 0.0 19.1 19.5 18.9 0.31 0.53 0.61
87 REDF 2021-09-02 02:00:00 UTC 12.4 12.5 82.1 0.00 0.00 0.00 1.8 3.2 0.6 12.0 22.0 1.9 3.8 0.8 12.0 15.0 0 966.79 NaN 0.0 0.0 0.0 18.9 19.2 18.9 0.41 0.41 0.29
88 REDH 2021-09-02 02:00:00 UTC 16.6 16.6 97.8 0.82 19.73 0.24 3.7 6.6 1.2 20.0 20.0 4.2 7.6 1.3 15.0 18.0 0 999.41 NaN 0.0 0.0 0.0 21.9 23.9 24.3 0.28 0.25 0.30
89 ROXB 2021-09-02 02:00:00 UTC 14.0 14.0 98.2 0.60 11.85 0.14 3.0 5.3 0.9 8.0 18.0 3.4 5.7 1.1 9.0 16.0 0 942.32 NaN 0.0 0.0 NaN 17.2 17.3 NaN 0.55 0.52 NaN
90 RUSH 2021-09-02 02:00:00 UTC 11.3 12.9 98.3 0.00 0.00 0.00 0.0 0.0 0.0 0.0 0.0 0.1 0.4 0.2 56.0 4.0 0 990.56 NaN 0.0 0.0 0.0 20.5 21.3 21.3 0.26 0.25 0.25
91 SARA 2021-09-02 02:00:00 UTC 11.2 12.9 96.8 0.00 0.00 0.00 0.0 0.0 0.0 0.0 0.0 0.1 0.5 0.1 127.0 5.0 0 975.05 NaN 0.0 0.0 0.0 20.1 20.9 19.9 0.06 0.06 0.04
92 SBRI 2021-09-02 02:00:00 UTC 14.9 15.7 66.0 0.00 0.00 0.00 2.9 4.0 0.5 331.0 10.0 3.0 4.5 0.6 333.0 10.0 0 970.88 NaN 0.0 0.0 0.0 18.6 19.7 19.7 0.22 0.23 0.24
93 SCHA 2021-09-02 02:00:00 UTC 14.9 15.1 98.1 0.53 10.01 0.12 0.5 1.4 0.4 342.0 22.0 0.7 1.8 0.4 343.0 27.0 0 997.81 NaN 0.0 0.0 0.0 21.5 22.1 22.4 0.22 0.14 0.05
94 SCHO 2021-09-02 02:00:00 UTC 16.4 16.2 99.9 0.55 9.07 0.18 1.6 4.5 1.0 320.0 45.0 2.2 4.8 1.0 328.0 44.0 0 994.70 NaN 0.0 0.0 0.0 21.1 21.9 21.9 0.33 0.19 0.14
95 SCHU 2021-09-02 02:00:00 UTC 14.8 14.7 96.3 0.07 2.75 0.00 1.3 2.4 0.4 357.0 20.0 1.5 2.8 0.5 1.0 18.0 0 1006.36 NaN 0.0 0.0 0.0 21.2 22.4 22.1 0.07 0.12 0.08
96 SCIP 2021-09-02 02:00:00 UTC 14.4 14.6 82.0 0.00 0.00 0.00 4.6 6.8 0.9 19.0 11.0 4.9 8.5 1.2 17.0 8.0 0 964.43 NaN 0.0 0.0 0.0 18.8 20.1 20.1 0.24 0.27 0.28
97 SHER 2021-09-02 02:00:00 UTC 14.2 14.4 96.1 0.00 0.00 0.00 1.8 3.1 0.6 348.0 14.0 2.1 3.4 0.6 351.0 12.0 0 968.29 NaN 0.0 0.0 0.0 19.4 20.1 20.3 0.32 0.29 0.24
98 SOME 2021-09-02 02:00:00 UTC 16.5 16.4 98.3 1.75 46.43 0.44 2.6 6.0 1.4 20.0 35.0 3.0 7.0 1.6 20.0 31.0 0 981.69 NaN 0.0 0.0 0.0 20.3 21.7 21.3 0.69 0.42 0.62
99 SOUT 2021-09-02 02:00:00 UTC 19.3 19.4 96.6 0.00 23.78 0.00 5.8 11.3 2.1 64.0 14.0 6.3 11.6 2.1 71.0 12.0 0 1003.55 NaN 0.0 0.0 0.0 22.1 23.4 23.3 0.15 0.09 0.08
100 SPRA 2021-09-02 02:00:00 UTC 15.1 15.1 96.0 0.00 0.48 0.00 2.6 3.9 0.6 60.0 8.0 2.8 4.5 0.7 64.0 9.0 0 986.31 NaN 0.0 0.0 0.0 18.8 19.5 19.9 0.36 0.28 0.34
101 SPRI 2021-09-02 02:00:00 UTC 13.8 13.7 97.3 0.00 0.03 0.00 3.5 5.0 0.5 30.0 12.0 3.7 5.7 0.6 27.0 12.0 0 957.60 NaN 0.0 0.0 0.0 18.4 19.8 19.5 0.19 0.35 0.39
102 STAT 2021-09-02 02:00:00 UTC 18.5 18.5 97.6 0.39 93.40 0.00 6.1 9.4 1.6 341.0 22.0 6.6 11.6 1.8 341.0 21.0 0 995.71 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
103 STEP 2021-09-02 02:00:00 UTC 15.7 15.6 93.0 0.73 8.54 0.21 3.1 5.9 1.1 41.0 21.0 3.4 7.0 1.2 39.0 19.0 0 973.49 NaN 0.0 0.0 0.0 19.4 19.9 19.7 0.32 0.33 0.31
104 STON 2021-09-02 02:00:00 UTC 20.3 20.3 98.0 0.00 11.68 0.00 1.1 2.6 0.7 300.0 68.0 1.5 4.2 0.8 306.0 71.0 0 996.97 NaN 0.0 0.0 0.0 23.9 25.3 25.4 0.30 0.16 0.06
105 SUFF 2021-09-02 02:00:00 UTC 16.5 16.2 98.6 0.65 66.08 0.16 4.2 12.1 2.0 18.0 21.0 5.0 13.1 2.3 16.0 15.0 0 982.10 NaN 0.0 0.0 0.0 20.3 20.1 21.7 0.47 0.34 0.28
106 TANN 2021-09-02 02:00:00 UTC 13.1 13.4 98.9 0.80 17.90 0.17 0.8 3.5 0.7 25.0 50.0 1.2 3.6 0.7 16.0 53.0 0 926.92 NaN 0.0 0.0 0.0 18.0 18.4 18.4 0.59 0.49 0.42
107 TICO 2021-09-02 02:00:00 UTC 15.5 16.1 96.5 0.00 0.00 0.00 1.8 2.9 0.3 358.0 16.0 2.0 3.2 0.4 358.0 6.0 1 1001.18 NaN 0.0 0.0 0.0 19.8 20.3 19.7 0.25 0.44 0.47
108 TULL 2021-09-02 02:00:00 UTC 14.8 15.0 94.2 0.00 0.00 0.00 2.6 3.6 0.3 340.0 11.0 3.0 4.5 0.5 345.0 9.0 0 966.06 NaN 0.0 0.0 0.0 19.9 21.1 21.3 0.21 0.25 0.17
109 TUPP 2021-09-02 02:00:00 UTC 10.9 11.3 93.6 0.00 0.00 0.00 0.0 0.0 0.0 0.0 0.0 0.0 0.4 0.1 41.0 3.0 0 953.65 NaN 0.0 0.0 0.0 19.9 20.5 20.6 0.19 0.28 0.20
110 TYRO 2021-09-02 02:00:00 UTC 14.0 14.8 85.7 0.00 0.00 0.00 0.3 0.9 0.3 47.0 28.0 0.5 1.4 0.4 54.0 36.0 0 968.00 NaN 0.0 0.0 0.0 19.1 19.3 19.3 0.25 0.31 0.28
111 VOOR 2021-09-02 02:00:00 UTC 14.9 14.9 98.9 0.26 11.67 0.00 2.2 3.6 0.6 301.0 14.0 2.3 3.9 0.6 304.0 13.0 0 995.81 NaN 0.0 0.0 0.0 20.2 21.1 21.1 0.25 0.33 0.19
112 WALL 2021-09-02 02:00:00 UTC 16.2 16.2 98.4 1.77 24.48 0.45 3.3 5.9 1.3 19.0 25.0 3.8 7.5 1.5 12.0 19.0 0 992.84 NaN 0.0 0.0 0.0 19.9 21.9 21.5 0.57 0.42 0.44
113 WALT 2021-09-02 02:00:00 UTC 13.8 13.9 96.3 0.00 2.72 0.00 3.8 6.2 1.0 351.0 16.0 4.2 7.2 1.2 359.0 13.0 0 943.38 NaN 0.0 0.0 0.0 17.3 17.2 17.5 0.53 0.51 0.46
114 WANT 2021-09-02 02:00:00 UTC 24.1 24.0 98.7 0.06 3.47 0.00 8.0 13.3 1.9 119.0 11.0 8.6 13.7 2.0 123.0 9.0 0 998.15 NaN 0.0 0.0 0.0 23.7 24.4 24.3 0.22 0.05 0.05
115 WARS 2021-09-02 02:00:00 UTC 12.8 13.6 85.4 0.00 0.00 0.00 0.9 1.6 0.3 341.0 18.0 1.0 1.6 0.3 339.0 18.0 0 949.88 NaN 0.0 0.0 0.0 18.2 18.6 18.0 0.29 0.36 0.32
116 WARW 2021-09-02 02:00:00 UTC 16.2 16.2 100.0 1.35 46.92 0.29 2.7 4.4 0.6 296.0 20.0 2.9 4.7 0.7 306.0 20.0 0 986.11 NaN 0.0 0.0 0.0 18.2 21.1 21.8 0.43 0.32 0.31
117 WATE 2021-09-02 02:00:00 UTC 14.8 15.8 83.8 0.00 0.00 0.00 2.0 2.4 0.2 353.0 8.0 2.2 3.0 0.3 355.0 6.0 0 993.81 NaN 0.0 0.0 0.0 21.5 21.5 21.5 0.40 0.47 0.47
118 WBOU 2021-09-02 02:00:00 UTC 14.4 14.2 99.1 0.52 19.27 0.11 3.6 5.8 0.9 57.0 16.0 4.0 5.9 1.0 58.0 14.0 0 955.73 NaN 0.0 0.0 0.0 17.5 20.5 21.3 0.51 0.36 0.34
119 WELL 2021-09-02 02:00:00 UTC 10.8 13.4 98.8 0.00 0.00 0.00 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0.0 0 1001.47 NaN 0.0 0.0 0.0 21.5 20.7 20.5 0.30 0.32 0.24
120 WEST 2021-09-02 02:00:00 UTC 14.4 15.2 97.0 0.00 0.00 0.00 0.0 0.2 0.0 300.0 0.0 0.3 0.6 0.2 282.0 9.0 0 986.79 NaN 0.0 0.0 0.0 18.8 19.5 19.5 0.26 0.25 0.32
121 WFMB 2021-09-02 02:00:00 UTC 12.9 13.6 75.0 0.00 0.00 0.00 0.8 1.5 0.3 253.0 26.0 0.8 2.0 0.3 250.0 27.0 0 941.26 NaN 0.0 0.0 0.0 18.8 19.9 19.9 0.24 0.18 0.20
122 WGAT 2021-09-02 02:00:00 UTC 13.8 13.8 79.5 0.00 0.00 0.00 1.4 4.0 1.0 8.0 33.0 1.8 4.6 1.0 11.0 34.0 0 959.46 NaN 0.0 0.0 0.0 18.8 20.2 20.7 0.16 0.25 0.08
123 WHIT 2021-09-02 02:00:00 UTC 15.7 15.8 95.9 0.00 0.00 0.00 1.4 2.6 0.4 342.0 19.0 1.6 3.2 0.6 348.0 18.0 1 1006.29 NaN 0.0 0.0 0.0 19.3 20.3 20.1 0.28 0.47 0.46
124 WOLC 2021-09-02 02:00:00 UTC 14.0 16.6 84.6 0.00 0.00 0.00 0.4 0.9 0.2 350.0 19.0 0.6 1.4 0.3 353.0 20.0 0 996.90 NaN 0.0 0.0 0.0 21.9 23.5 24.2 0.18 0.03 0.07
125 YORK 2021-09-02 02:00:00 UTC 12.0 13.9 96.0 0.00 0.00 0.00 0.0 0.3 0.1 274.0 0.0 0.2 0.6 0.2 244.0 22.0 0 991.17 NaN 0.0 0.0 0.0 20.5 21.8 21.9 0.13 0.24 0.24

For a relatively small DataFrame such as ours, this is OK, but you would definitely want to return to a stricter limit for larger DataFrames (Pandas can support millions of rows and/or columns!). Let’s restrict the display back down to 10 rows and columns (five at the start, five at the end) now.

pd.set_option('display.max_rows', 10)
pd.set_option('display.max_columns', 10)
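
An alternative to setting and then resetting these options is pd.option_context, which applies display options only inside a with block and restores the previous values on exit. The sketch below uses a made-up 100-row DataFrame (demo_df) rather than the NYSM data.

```python
import pandas as pd

demo_df = pd.DataFrame({"value": range(100)})  # hypothetical stand-in data

before = pd.get_option("display.max_rows")

# Truncated view: only the first and last few rows appear.
with pd.option_context("display.max_rows", 10):
    truncated = repr(demo_df)

# Full view: None removes the row limit, but only inside this block.
with pd.option_context("display.max_rows", None):
    full = repr(demo_df)

after = pd.get_option("display.max_rows")

print(truncated)
print(len(full.splitlines()), "lines in the full view")
print("Setting unchanged afterwards:", before == after)
```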
Note: Recall that occasionally, there may be some missing data. In Pandas, these are denoted as NaN ... literally, "Not a Number".
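
To make this note concrete, here is a short sketch of how NaN values can be counted and how they behave in a calculation. The three-row DataFrame obs is hand-built for illustration, with values and gaps copied from the ADDI, BKLN, and CAMD rows of the full display above.

```python
import numpy as np
import pandas as pd

obs = pd.DataFrame(
    {
        "station": ["ADDI", "BKLN", "CAMD"],
        "temp_2m [degC]": [13.3, 19.9, np.nan],
        "soil_temp_05cm [degC]": [19.9, np.nan, 20.3],
    }
)

# isna() marks each missing cell as True; summing counts them per column.
missing_per_column = obs.isna().sum()
print(missing_per_column)

# NaN never compares equal to anything, itself included,
# so use isna() rather than == to test for missing values.
nan_equals_itself = np.nan == np.nan
print(nan_equals_itself)

# By default, reductions such as mean() skip NaN rather than returning NaN.
mean_temp = obs["temp_2m [degC]"].mean()
print(mean_temp)
```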

A Pandas DataFrame is a 2-dimensional array of rows and columns. To get the array size, print out the shape attribute. The first element is the number of rows, while the second is the number of columns. The following cell prints out the number of rows and columns in this particular DataFrame:

print(df.shape)
nRows = df.shape[0]
nColumns = df.shape[1]
print(f"There are {nRows} rows and {nColumns} columns in this DataFrame.")
(126, 30)
There are 126 rows and 30 columns in this DataFrame.
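
Because shape is an ordinary Python tuple, it can also be unpacked in one step. The sketch below uses a small made-up DataFrame (toy) and also shows len(), size, and ndim, which are standard Python/pandas features that the cell above did not use.

```python
import pandas as pd

toy = pd.DataFrame({"a": [1, 2, 3], "b": [4.0, 5.0, 6.0]})  # hypothetical data

# Unpack (rows, columns) directly from the shape tuple.
n_rows, n_cols = toy.shape

print(n_rows, n_cols)
print(len(toy))   # number of rows
print(toy.size)   # total number of cells: rows * columns
print(toy.ndim)   # number of dimensions of a DataFrame
```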

Pandas refers to the column and row names as Indexes, which are one-dimensional arrays. Display the names of the columns:

colNames = df.columns
colNames
Index(['station', 'time', 'temp_2m [degC]', 'temp_9m [degC]',
       'relative_humidity [percent]', 'precip_incremental [mm]',
       'precip_local [mm]', 'precip_max_intensity [mm/min]',
       'avg_wind_speed_prop [m/s]', 'max_wind_speed_prop [m/s]',
       'wind_speed_stddev_prop [m/s]', 'wind_direction_prop [degrees]',
       'wind_direction_stddev_prop [degrees]', 'avg_wind_speed_sonic [m/s]',
       'max_wind_speed_sonic [m/s]', 'wind_speed_stddev_sonic [m/s]',
       'wind_direction_sonic [degrees]',
       'wind_direction_stddev_sonic [degrees]', 'solar_insolation [W/m^2]',
       'station_pressure [mbar]', 'snow_depth [cm]', 'frozen_soil_05cm [bit]',
       'frozen_soil_25cm [bit]', 'frozen_soil_50cm [bit]',
       'soil_temp_05cm [degC]', 'soil_temp_25cm [degC]',
       'soil_temp_50cm [degC]', 'soil_moisture_05cm [m^3/m^3]',
       'soil_moisture_25cm [m^3/m^3]', 'soil_moisture_50cm [m^3/m^3]'],
      dtype='object')

You might think that the row index would have a similar attribute, but it doesn’t:

rowNames = df.rows
rowNames
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
Input In [10], in <cell line: 1>()
----> 1 rowNames = df.rows
      2 rowNames

File /knight/anaconda_aug22/envs/aug22_env/lib/python3.10/site-packages/pandas/core/generic.py:5575, in NDFrame.__getattr__(self, name)
   5568 if (
   5569     name not in self._internal_names_set
   5570     and name not in self._metadata
   5571     and name not in self._accessors
   5572     and self._info_axis._can_hold_identifiers_and_holds_name(name)
   5573 ):
   5574     return self[name]
-> 5575 return object.__getattribute__(self, name)

AttributeError: 'DataFrame' object has no attribute 'rows'

Instead, we use the index attribute to get the row names. Since we did not tell Pandas which column to use for the row labels, the result is a special type of object known as a RangeIndex.

rowNames = df.index
rowNames
RangeIndex(start=0, stop=126, step=1)

We can view this RangeIndex as a Python list as follows:

list(rowNames)
[0,
 1,
 2,
 3,
 4,
 5,
 6,
 7,
 8,
 9,
 10,
 11,
 12,
 13,
 14,
 15,
 16,
 17,
 18,
 19,
 20,
 21,
 22,
 23,
 24,
 25,
 26,
 27,
 28,
 29,
 30,
 31,
 32,
 33,
 34,
 35,
 36,
 37,
 38,
 39,
 40,
 41,
 42,
 43,
 44,
 45,
 46,
 47,
 48,
 49,
 50,
 51,
 52,
 53,
 54,
 55,
 56,
 57,
 58,
 59,
 60,
 61,
 62,
 63,
 64,
 65,
 66,
 67,
 68,
 69,
 70,
 71,
 72,
 73,
 74,
 75,
 76,
 77,
 78,
 79,
 80,
 81,
 82,
 83,
 84,
 85,
 86,
 87,
 88,
 89,
 90,
 91,
 92,
 93,
 94,
 95,
 96,
 97,
 98,
 99,
 100,
 101,
 102,
 103,
 104,
 105,
 106,
 107,
 108,
 109,
 110,
 111,
 112,
 113,
 114,
 115,
 116,
 117,
 118,
 119,
 120,
 121,
 122,
 123,
 124,
 125]
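
A RangeIndex can also be inspected without expanding it into a full list, since it is described entirely by its start, stop, and step. A small sketch, assuming a made-up four-row DataFrame:

```python
import pandas as pd

# Hypothetical toy DataFrame with the default row index.
toy = pd.DataFrame({"temp": [13.3, 14.0, 14.8, 16.0]})
idx = toy.index

print(type(idx).__name__)             # the class of the default index
print(idx.start, idx.stop, idx.step)  # the three numbers that define it
print(len(idx))                       # number of row labels
print(idx.tolist())                   # expand to a Python list only when needed
```

For large DataFrames, printing start, stop, and step is usually more convenient than listing every label.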

Why are the row indices a sequence of integers beginning at 0, and not the first column (in this case, station) of the DataFrame? As we noted above, that is just the default behavior. We can specify which column to use for the row index by passing an additional argument, index_col, to pd.read_csv:

df2 = pd.read_csv(dataFile, index_col=0)
Tip: We assign the resulting DataFrame to a different object name, to distinguish it from the first one. Once again, we could use any valid object name we want.
df2
time temp_2m [degC] temp_9m [degC] relative_humidity [percent] precip_incremental [mm] ... soil_temp_25cm [degC] soil_temp_50cm [degC] soil_moisture_05cm [m^3/m^3] soil_moisture_25cm [m^3/m^3] soil_moisture_50cm [m^3/m^3]
station
ADDI 2021-09-02 02:00:00 UTC 13.3 13.4 92.5 0.00 ... 20.3 19.9 0.52 0.44 0.44
ANDE 2021-09-02 02:00:00 UTC 14.0 13.7 100.0 0.28 ... 19.1 19.3 0.25 0.21 0.14
BATA 2021-09-02 02:00:00 UTC 14.8 16.3 77.5 0.00 ... 21.5 21.3 0.25 0.21 0.22
BEAC 2021-09-02 02:00:00 UTC 16.0 15.9 98.6 1.93 ... 19.3 19.8 0.51 0.36 0.37
BELD 2021-09-02 02:00:00 UTC 14.3 14.5 94.9 0.00 ... 20.1 20.2 0.50 0.43 0.41
... ... ... ... ... ... ... ... ... ... ... ...
WFMB 2021-09-02 02:00:00 UTC 12.9 13.6 75.0 0.00 ... 19.9 19.9 0.24 0.18 0.20
WGAT 2021-09-02 02:00:00 UTC 13.8 13.8 79.5 0.00 ... 20.2 20.7 0.16 0.25 0.08
WHIT 2021-09-02 02:00:00 UTC 15.7 15.8 95.9 0.00 ... 20.3 20.1 0.28 0.47 0.46
WOLC 2021-09-02 02:00:00 UTC 14.0 16.6 84.6 0.00 ... 23.5 24.2 0.18 0.03 0.07
YORK 2021-09-02 02:00:00 UTC 12.0 13.9 96.0 0.00 ... 21.8 21.9 0.13 0.24 0.24

126 rows × 29 columns

df2.index
Index(['ADDI', 'ANDE', 'BATA', 'BEAC', 'BELD', 'BELL', 'BELM', 'BERK', 'BING',
       'BKLN',
       ...
       'WARW', 'WATE', 'WBOU', 'WELL', 'WEST', 'WFMB', 'WGAT', 'WHIT', 'WOLC',
       'YORK'],
      dtype='object', name='station', length=126)
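
The index column can be chosen by position or by name, or set after the file has been read. The sketch below uses a tiny in-memory CSV in place of the real data file (the station IDs and values are made up), and then looks up a row by its station label with loc:

```python
import io
import pandas as pd

# Hypothetical three-station CSV, standing in for the real data file.
csv_text = """station,time,temp
AAAA,2021-09-02 02:00:00 UTC,13.3
BBBB,2021-09-02 02:00:00 UTC,14.0
CCCC,2021-09-02 02:00:00 UTC,14.8
"""

# Three routes to the same result:
by_position = pd.read_csv(io.StringIO(csv_text), index_col=0)
by_name = pd.read_csv(io.StringIO(csv_text), index_col="station")
after_the_fact = pd.read_csv(io.StringIO(csv_text)).set_index("station")

print(by_position.equals(by_name))
print(by_name.equals(after_the_fact))

# With station IDs as row labels, a value can be looked up by label.
print(by_name.loc["BBBB", "temp"])
```

Naming the column is a little more robust than giving its position, since it keeps working if the column order in the file changes.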

Now, let’s examine the 2-meter temperature column, and thus, begin our exploration of Pandas’ second core object, the Series.

The pandas Series

… is essentially any one of the columns of our DataFrame, with its accompanying Index to provide a label for each value in our column.

The pandas Series is a fast and capable 1-dimensional array of nearly any data type we could want, and it can behave very similarly to a NumPy ndarray or a Python dict. You can take a look at any of the Series that make up your DataFrame with its label and the Python dict notation, or (if permitted), with dot-shorthand:

  1. Python dict notation, using brackets:

t2m = df['temp_2m [degC]'] # Note: the column name must be typed exactly as it is named, so watch out for spaces!
  2. As a shorthand, we might treat the column as an attribute and use dot notation to access it, but only in certain circumstances. These do not include the following case, due to the presence of spaces and other special characters in this particular column’s name:

#t2m = df.'temp_2m [degC]' # commented out since this will fail!
Tip: It's never wrong to use the dictionary-based technique, so we'll use it in most of the examples in this and subsequent notebooks that use Pandas!
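
To make the "certain circumstances" concrete, here is a small sketch with made-up columns. Dot notation works when the column name is a valid Python identifier, but it cannot express a name with spaces or brackets, and it silently returns something else when the name collides with an existing DataFrame method such as count:

```python
import pandas as pd

# Hypothetical toy DataFrame with three kinds of column names.
toy = pd.DataFrame({
    "temp": [13.3, 14.0],            # valid identifier
    "temp_2m [degC]": [13.3, 14.0],  # spaces and brackets
    "count": [1, 2],                 # same name as a DataFrame method
})

# Valid identifier: brackets and dot notation give the same Series.
same = toy["temp"].equals(toy.temp)
print(same)

# Spaces and special characters: only brackets can be used.
t2m_toy = toy["temp_2m [degC]"]

# Name collision: brackets return the column,
# while the dot returns the DataFrame.count method.
count_column = toy["count"]
count_attribute = toy.count
print(callable(count_attribute))
```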

Let’s view this Series object:

t2m
0      13.3
1      14.0
2      14.8
3      16.0
4      14.3
       ... 
121    12.9
122    13.8
123    15.7
124    14.0
125    12.0
Name: temp_2m [degC], Length: 126, dtype: float64

A Series is a 1-dimensional array, but with the DataFrame’s Index attached. To represent it as a Numpy array, we use its values attribute.

t2m.values
array([13.3, 14. , 14.8, 16. , 14.3, 13.6, 11.3, 15.4, 14.1, 19.9, 11.7,
       16.2, 13.8, 18. , 13.3, 14.9, 11.8, 13.3, 11.5,  nan, 10.8, 13.4,
       14.5, 14.7, 13.9, 13.7, 11.2, 14.7, 12.6, 14. , 15.6, 12.2, 12.3,
       13.9, 11.3, 16. , 16. , 13.2, 10.6, 14.8, 10.3, 14.9, 11.8, 15.2,
       15.1, 15.7, 14.9, 11.3, 16. , 13.3, 15.1, 12.2, 12.3, 13.4, 11.8,
       11.3, 14.4, 15.9, 12. , 14.3, 15.7, 16.7, 13.9, 13.2, 13. , 18.7,
       12.2, 14.2, 14.5, 14.8, 11.5, 14.9, 11.9, 13. , 18.2, 13.9, 14.4,
       14.5, 15.7, 14.7, 15.2,  nan, 12.5, 13.5, 19. , 12.2, 12.4, 12.4,
       16.6, 14. , 11.3, 11.2, 14.9, 14.9, 16.4, 14.8, 14.4, 14.2, 16.5,
       19.3, 15.1, 13.8, 18.5, 15.7, 20.3, 16.5, 13.1, 15.5, 14.8, 10.9,
       14. , 14.9, 16.2, 13.8, 24.1, 12.8, 16.2, 14.8, 14.4, 10.8, 14.4,
       12.9, 13.8, 15.7, 14. , 12. ])
Tip: In this case, we must use dot notation, but this is because values is not a column name, but a particular attribute of this Series object.
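
A Series also offers a to_numpy() method, which the pandas documentation suggests as an alternative to values. A minimal sketch on a made-up three-element Series with one missing value:

```python
import numpy as np
import pandas as pd

# Hypothetical toy Series with one missing value.
s = pd.Series([13.3, np.nan, 14.8], name="temp_2m [degC]")

arr_values = s.values     # attribute access, as used in this notebook
arr_numpy = s.to_numpy()  # method call, so note the parentheses

print(type(arr_numpy))
print(np.isnan(arr_numpy))    # locate the missing value
print(np.nanmean(arr_numpy))  # NumPy mean that skips NaN
```

Once the data are in a plain NumPy array, the Index and the column name are no longer attached, so NaN-aware functions such as np.nanmean are needed to skip missing values.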

Notice that there is metadata (i.e., data about the data) attached to this Series, in the form of the column name. Without it, we’d have no idea what the data represent or what units they are in.

Tip: Once we start working with data in NetCDF format, as part of the Xarray library, we will see that NetCDF has even more advanced support for including metadata.

There are several interesting methods available for Series. One is describe, which prints summary statistics on numerical Series objects:

t2m.describe()
count    124.000000
mean      14.242742
std        2.200561
min       10.300000
25%       12.750000
50%       14.200000
75%       15.200000
max       24.100000
Name: temp_2m [degC], dtype: float64
Tip: Yet another Pythonic nuance here ... note that we follow describe with a set of parentheses (). In this case, describe is a method, that is, a function that is available for a Pandas Series object.
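
Each statistic that describe reports is also available as its own method. The sketch below, on a made-up four-element Series with one missing value, shows that count tallies only the non-missing entries, and that the skipna argument controls whether NaN is ignored:

```python
import numpy as np
import pandas as pd

# Hypothetical toy Series: four entries, one of them missing.
s = pd.Series([10.0, 12.0, np.nan, 14.0], name="temp")

stats = s.describe()
print(stats["count"])         # non-missing entries only

print(s.count(), len(s))      # count() skips NaN; len() does not
print(s.mean(), s.min(), s.max())

# By default NaN is skipped; with skipna=False the result is NaN.
print(s.mean(skipna=False))
```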
Exercise: Now define a Series object called RH and populate it with the column from the DataFrame corresponding to Relative Humidity. Print out its values and get its summary statistics.
# Write your code below. 
# After you have done so, you can compare your code to the solution by uncommenting the line in the cell below.
# %load /spare11/atm533/common/pandas/01a.py
'''
01.py
'''
RH = df['relative_humidity [percent]']
print(RH.values)
print(RH.describe())
[ 92.5 100.   77.5  98.6  94.9  85.8  97.8  93.8  99.6  97.9  93.8  99.7
  87.2  99.   97.6  96.5  97.8  92.6  96.8  79.5  92.7  92.3 100.   94.6
  97.9  89.9   nan  96.3  81.1  87.9   nan  94.4  92.2  83.9  93.   90.1
  95.2 100.   97.   97.9  98.2  98.6  90.6  89.7  95.1  85.9  73.7  92.5
  93.9  88.9  89.4  86.1  95.6  96.6  77.1  91.8  98.9  99.5  98.2  96.
  73.6  98.6  96.5  96.   89.5 100.   94.3  94.4  95.7  94.   98.2  83.5
  89.5  83.8  63.8 100.   72.1  83.8  98.8  97.8  79.5  98.7  94.6  92.6
  98.   93.6  91.5  82.1  97.8  98.2  98.3  96.8  66.   98.1  99.9  96.3
  82.   96.1  98.3  96.6  96.   97.3  97.6  93.   98.   98.6  98.9  96.5
  94.2  93.6  85.7  98.9  98.4  96.3  98.7  85.4 100.   83.8  99.1  98.8
  97.   75.   79.5  95.9  84.6  96. ]
count    124.000000
mean      92.639516
std        7.578573
min       63.800000
25%       89.650000
50%       95.650000
75%       98.025000
max      100.000000
Name: relative_humidity [percent], dtype: float64
Question: Was the count, obtained when you ran the summary statistics method, the same as for 2-meter temperature? If not, why?
# Uncomment the line below after you have considered the question.
# %load /spare11/atm533/common/week4/01b.py
'''
01b.py
'''
# For this time, the value of "count" is 124 for both 2-m temperature and RH.
# This reflects the fact that two of the 126 NYSM sites had missing 2-m temperature,
# and two had missing RH data, which Pandas sets to 'NaN'.
# However, the sites with the missing data differ between these two variables.
#
# It's important to note that sites with missing values are not included in
# the summary stats that describe() prints out.
'\n01b.py\n'
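
The answer above notes that the sites with missing data differ between the two variables. One way to check this kind of thing is with isna, sketched here on a made-up four-station DataFrame (the station IDs, column names, and values are all hypothetical):

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: each variable is missing at a different station.
toy = pd.DataFrame(
    {
        "temp": [13.3, np.nan, 14.8, 16.0],
        "rh": [92.5, 100.0, np.nan, 98.6],
    },
    index=pd.Index(["AAAA", "BBBB", "CCCC", "DDDD"], name="station"),
)

# Number of missing values in each column.
n_missing = toy.isna().sum()
print(n_missing)

# Which stations are missing each variable?
missing_temp = toy[toy["temp"].isna()].index.tolist()
missing_rh = toy[toy["rh"].isna()].index.tolist()
print(missing_temp, missing_rh)

# The counts match even though the missing stations differ.
print(toy.count())
```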

Summary

  • Pandas is a very powerful tool for working with tabular (i.e. spreadsheet-style) data

  • Pandas’ core objects are the DataFrame and the Series

  • A Pandas DataFrame consists of one or more Series

  • Pandas can be helpful for exploratory data analysis, such as basic statistics

What’s Next?

In the next notebook, we will use Pandas to further examine meteorological data from the New York State Mesonet and display it on a map.