Interactive visualization of worldwide METAR data: Geoviews
Prerequisites
Concepts | Importance | Notes |
---|---|---|
Pandas | Necessary | |
Contextily | Helpful | |
Time to learn: 30 minutes
Imports
from datetime import datetime
import numpy as np
import pandas as pd
import geoviews as gv
import geoviews.feature as gf
from geoviews import opts
import geoviews.tile_sources as gts
from cartopy import crs as ccrs
Create an interactive visualization of worldwide surface meteorological (METAR) data using the HoloViz ecosystem
HoloViz is a suite of open-source Python libraries designed for interactive data analysis and visualization in the browser (including the Jupyter notebook). In this notebook, we will use the GeoViews package, which is part of HoloViz.
Another part of the HoloViz ecosystem is Bokeh, which uses JavaScript to provide interactivity in the browser. GeoViews makes Bokeh, as well as Matplotlib, available as plotting backends via an extension.
gv.extension('bokeh', 'matplotlib')
Use Pandas to read in the file containing the most recent METAR data
Since this file has latitude and longitude, we can pass the dataframe directly to GeoViews (i.e. no need to use Geopandas).
# Define the timestamp format used in the file: two-digit year, month, day, then hour and minute
timeFormat = "%y%m%d/%H%M"
# Pass the format to read_csv via date_format (pandas 2.0 and later); this replaces
# the deprecated date_parser argument. The raw string r'\s+' splits columns on whitespace.
df = pd.read_csv('/spare11/atm533/data/world_metar_latest.csv', parse_dates=['YYMMDD/HHMM'], date_format=timeFormat, sep=r'\s+')
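An alternative that works across pandas versions is to read the timestamp column as text and convert it afterwards with `pd.to_datetime`. The sketch below uses a small in-memory sample rather than the real file; the station rows are taken from the table shown next, the raw timestamp text `231024/1800` is an assumption inferred from the format string, and the names `sample` and `sample_df` are illustrative only.

```python
import io

import pandas as pd

# A tiny whitespace-delimited stand-in for the METAR file (assumed layout).
sample = io.StringIO(
    "STN YYMMDD/HHMM SLAT SLON TMPC\n"
    "DYS 231024/1800 32.43 -99.85 23.7\n"
    "NUW 231024/1800 48.35 -122.65 9.4\n"
)

# Read the timestamp column as plain strings first.
sample_df = pd.read_csv(sample, sep=r"\s+", dtype={"YYMMDD/HHMM": str})

# Then convert the whole column at once using the explicit format.
sample_df["YYMMDD/HHMM"] = pd.to_datetime(
    sample_df["YYMMDD/HHMM"], format="%y%m%d/%H%M"
)

print(sample_df.dtypes)
```

Converting after the read keeps the parsing step explicit and makes it easy to inspect the raw strings if a timestamp fails to parse.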
df
 | STN | YYMMDD/HHMM | SLAT | SLON | SELV | TMPC | DWPC | RELH | PMSL | SPED | GUMS | DRCT | P01M |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | DYS | 2023-10-24 18:00:00 | 32.43 | -99.85 | 545.0 | 23.7 | 18.8 | 74.02 | 1007.9 | 6.18 | -9999.00 | 180.0 | -9999.0 |
1 | NUW | 2023-10-24 18:00:00 | 48.35 | -122.65 | 14.0 | 9.4 | 5.6 | 77.14 | 1012.4 | 3.60 | -9999.00 | 110.0 | -9999.0 |
2 | NYL | 2023-10-24 18:00:00 | 32.65 | -114.62 | 65.0 | 25.0 | 11.7 | 43.38 | 1008.8 | 1.54 | -9999.00 | 70.0 | -9999.0 |
3 | PALU | 2023-10-24 18:00:00 | 68.88 | -166.13 | 3.0 | 4.6 | 2.3 | 85.02 | 1020.3 | 8.24 | 19.56 | 190.0 | -9999.0 |
4 | PAEI | 2023-10-24 18:00:00 | 64.67 | -147.10 | 167.0 | -12.9 | -14.8 | 85.64 | 1034.4 | 0.00 | -9999.00 | 0.0 | -9999.0 |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
4219 | MZJ | 2023-10-24 18:00:00 | -9999.00 | -9999.00 | -9999.0 | 25.0 | 10.0 | 38.74 | -9999.0 | 2.57 | -9999.00 | 160.0 | -9999.0 |
4220 | OEPS | 2023-10-24 18:00:00 | -9999.00 | -9999.00 | -9999.0 | 28.5 | 5.9 | 23.85 | 1015.8 | 2.57 | -9999.00 | 140.0 | -9999.0 |
4221 | PAAD | 2023-10-24 18:00:00 | -9999.00 | -9999.00 | -9999.0 | 1.2 | -1.7 | 80.99 | -9999.0 | 9.78 | 13.90 | 240.0 | -9999.0 |
4222 | LTFO | 2023-10-24 18:00:00 | -9999.00 | -9999.00 | -9999.0 | 20.0 | 13.0 | 64.04 | -9999.0 | 3.60 | -9999.00 | 180.0 | -9999.0 |
4223 | ORBI | 2023-10-24 18:00:00 | -9999.00 | -9999.00 | -9999.0 | 22.0 | 13.0 | 56.63 | -9999.0 | 3.09 | -9999.00 | 150.0 | -9999.0 |
4224 rows × 13 columns
In this dataset, missing values are set to -9999.0. Let’s replace every instance of this value with np.nan throughout the DataFrame.
df.replace(-9999.0,np.nan,inplace=True)
NaN = np.nan
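To see why the sentinel has to go before any statistics are computed, here is a minimal sketch on a three-row frame. The values are taken from the table above; the names `demo`, `cleaned`, and the derived variables are illustrative, not part of the notebook.

```python
import numpy as np
import pandas as pd

demo = pd.DataFrame({
    "STN": ["DYS", "PALU", "MZJ"],
    "SLAT": [32.43, 68.88, -9999.0],
    "GUMS": [-9999.0, 19.56, -9999.0],
})

# With the sentinel left in place, the mean is dominated by -9999.0.
mean_with_sentinel = demo["GUMS"].mean()

# Without inplace=True, replace() returns a new DataFrame and leaves demo untouched.
cleaned = demo.replace(-9999.0, np.nan)

# pandas skips NaN by default, so only the one real gust value contributes.
mean_cleaned = cleaned["GUMS"].mean()

# Count how many values are now flagged as missing in each column.
missing_per_column = cleaned.isna().sum()
print(mean_with_sentinel, mean_cleaned)
print(missing_per_column)
```

Once the sentinel is converted to NaN, summary methods such as `describe()` ignore the missing entries instead of treating them as real observations.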
Let’s also remove any rows whose latitudes or longitudes are missing.
df = df.query('SLAT.notnull() & SLON.notnull()')
df
 | STN | YYMMDD/HHMM | SLAT | SLON | SELV | TMPC | DWPC | RELH | PMSL | SPED | GUMS | DRCT | P01M |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | DYS | 2023-10-24 18:00:00 | 32.43 | -99.85 | 545.0 | 23.7 | 18.8 | 74.02 | 1007.9 | 6.18 | NaN | 180.0 | NaN |
1 | NUW | 2023-10-24 18:00:00 | 48.35 | -122.65 | 14.0 | 9.4 | 5.6 | 77.14 | 1012.4 | 3.60 | NaN | 110.0 | NaN |
2 | NYL | 2023-10-24 18:00:00 | 32.65 | -114.62 | 65.0 | 25.0 | 11.7 | 43.38 | 1008.8 | 1.54 | NaN | 70.0 | NaN |
3 | PALU | 2023-10-24 18:00:00 | 68.88 | -166.13 | 3.0 | 4.6 | 2.3 | 85.02 | 1020.3 | 8.24 | 19.56 | 190.0 | NaN |
4 | PAEI | 2023-10-24 18:00:00 | 64.67 | -147.10 | 167.0 | -12.9 | -14.8 | 85.64 | 1034.4 | 0.00 | NaN | 0.0 | NaN |
... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
4139 | GVSV | 2023-10-24 18:00:00 | 16.83 | -25.07 | 20.0 | 29.0 | 20.0 | 58.32 | NaN | 9.27 | NaN | 40.0 | NaN |
4140 | OPST | 2023-10-24 18:00:00 | 32.53 | 74.37 | 247.0 | 21.0 | 16.0 | 73.09 | NaN | 0.00 | NaN | 0.0 | NaN |
4141 | MHGS | 2023-10-24 18:00:00 | 14.57 | -88.60 | 913.0 | 28.0 | 22.0 | 69.90 | NaN | 3.09 | NaN | 360.0 | NaN |
4142 | EDAH | 2023-10-24 18:00:00 | 53.87 | 14.15 | 28.0 | 11.0 | 11.0 | 100.00 | NaN | 3.60 | NaN | 110.0 | NaN |
4143 | QAZ | 2023-10-24 18:00:00 | 16.98 | 7.98 | 492.0 | 32.0 | -4.0 | 9.56 | 1012.1 | 3.09 | NaN | 100.0 | NaN |
4144 rows × 13 columns
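Removing every row in which either coordinate is missing means keeping only rows where both are present. The difference between an "or" and an "and" test is easy to check on a toy frame; the rows below are made up for illustration, and `DataFrame.dropna` with `subset` is shown as a compact equivalent of the "and" case.

```python
import numpy as np
import pandas as pd

toy = pd.DataFrame({
    "STN": ["A", "B", "C", "D"],
    "SLAT": [32.43, np.nan, 48.35, np.nan],
    "SLON": [-99.85, -122.65, np.nan, np.nan],
})

# OR keeps a row if at least one coordinate is present: A, B and C survive.
either = toy[toy["SLAT"].notna() | toy["SLON"].notna()]

# AND keeps a row only if both coordinates are present: only A survives.
both = toy[toy["SLAT"].notna() & toy["SLON"].notna()]

# dropna(subset=...) drops a row when any listed column is NaN,
# which matches the AND case.
dropped = toy.dropna(subset=["SLAT", "SLON"])

print(list(either["STN"]), list(both["STN"]), list(dropped["STN"]))
```

In the METAR file shown here the two coordinates appear to be missing together, so both tests would give the same rows, but the "and" form is the one that stays correct if only one coordinate is absent.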
df.TMPC.describe()
count 4058.000000
mean 15.996057
std 10.373377
min -23.300000
25% 10.000000
50% 18.400000
75% 23.900000
max 40.000000
Name: TMPC, dtype: float64
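Create a set of GeoViews Points objects
We pass gv.Points three arguments:
- The Pandas DataFrame
- A list of the key dimensions: the columns holding the longitudes and latitudes, in that order (x, then y)
- A list of the value dimensions: the additional columns we want to include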
df_points = gv.Points(df, ['SLON','SLAT'],['STN','TMPC','PMSL','DWPC','SPED','P01M'])
Visualize the GeoViews Points object
df_points
Notice that the x- and y-axes correspond to the range of lons and lats in the dataframe. Also notice the Toolbar to the right of the plot. You can mouse over each tool to see its function. By default, the Pan tool is active: click and drag to move around.
Next, activate the Box Zoom tool (the single magnifying glass). Click and drag from upper-left to lower-right to create a box. The plot will automatically adjust to the new zoomed-in view.
If you have a mouse, you can also try the Wheel Zoom tool, which is just below the Box Zoom tool.
The Reset tool (i.e., the bottom-most icon) resets the plot to the full domain.
Geo-reference the plot with a background raster image
The HoloViz suite uses the * operator to overlay layers on the same plot. Here, we first specify the OpenStreetMap tile source. Then, we overlay our Points object on it. We also specify options that accomplish the following:
- Specify the width and height of the plot
- Set the size of the points
- Color each point by a variable: in this case, 2 m temperature in Celsius
- Add a colorbar
- Add the hover tool to the Toolbar
(gv.tile_sources.OSM * df_points).opts(
opts.Points(frame_width=800, frame_height=600, size=8, color='TMPC', colorbar=True, tools=['hover']))
Note that the hover tool is active and appears as the tool icon just below the Reset tool. Mouse over any of the points and you will see a readout of the data from all the columns that we included in the creation of the df_points GeoViews object.
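Web map tiles such as the OpenStreetMap source are served in the Web Mercator projection, so longitude/latitude points have to be projected before they line up with the tiles; GeoViews handles this for us. Purely as an illustration of what that step involves (this is not GeoViews' own code), the standard spherical Web Mercator formulas can be written out by hand. The function name `lonlat_to_web_mercator` is hypothetical.

```python
import math

# Sphere radius used by Web Mercator (EPSG:3857), in metres.
EARTH_RADIUS_M = 6378137.0


def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Project longitude/latitude in degrees to Web Mercator x/y in metres."""
    x = EARTH_RADIUS_M * math.radians(lon_deg)
    y = EARTH_RADIUS_M * math.log(
        math.tan(math.pi / 4 + math.radians(lat_deg) / 2)
    )
    return x, y


# Station DYS from the table above: west of Greenwich and north of the equator.
x_dys, y_dys = lonlat_to_web_mercator(-99.85, 32.43)

# The antimeridian on the equator marks the eastern edge of the tile grid.
x_edge, y_equator = lonlat_to_web_mercator(180.0, 0.0)
print(x_dys, y_dys, x_edge)
```

The y formula grows without bound toward the poles, which is why Web Mercator maps stretch high-latitude regions and why tile services cut off near 85 degrees latitude.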
Make a Labels plot
Next, let’s plot a map that shows the values of a particular column at each station instead of point markers. To do this, we create a GeoViews Labels object. It takes arguments similar to those of Points.
df_labels = gv.Labels(df, ['SLON','SLAT'],['TMPC'])
Plot just the labels.
df_labels
Now layer the labels on a background map tile, as we did for Points. Specify the color and size of the text labels.
figure = (gv.tile_sources.CartoLight * df_labels).opts(
opts.Labels(frame_height=800,frame_width=800,text_color='purple',text_font_size='10pt'))
figure
- Combine the worldwide METAR and NYSM Dataframes and visualize the combined Dataframe.
- Plot a variable other than temperature.
Summary
- The HoloViz set of Python packages provides interactive visualization of datasets in the Jupyter notebook.
- Data from Pandas DataFrames can be easily displayed and geo-referenced with GeoViews Points and Labels objects.
- Similar to Contextily, GeoViews allows one to add background tile-served maps as an additional layer.
What’s Next?
In the next notebook, we will use GeoViews to interactively browse gridded datasets.