# Interactive data visualization in Python: Geopandas
---

## Overview
   
Within this notebook, we will cover:

1. Browser-based interactive maps of point-based data 
1. [Geopandas](https://geopandas.org/en/stable/)

## Prerequisites
| Concepts | Importance | Notes |
| --- | --- | --- |
| [Cartopy Intro](https://foundations.projectpythia.org/core/cartopy/cartopy.html) | Required | Projections and Features |
| [Pandas](https://foundations.projectpythia.org/core/pandas.html) | Required | Tabular Datasets |

- **Time to learn**: 20 minutes
---

All of the graphics we have generated so far in the class have been *static*: they exist as-is, with no way to interact with them. While this is fine, and even preferable, for traditional publication figures and websites, it would be nice to be able to produce *dynamic* figures that one can zoom and pan, similar to, say, Google Maps.<br>
<br><hr>
We have previously displayed tropical cyclone locations on a static map, using Cartopy, Matplotlib, and (in your HW1 assignment) Pandas. Now, let's make an *interactive* map. For that, we will use the Geopandas Python package.

## Imports

In [None]:
import os
# Set this before geopandas is imported; it has no effect afterwards
os.environ['USE_PYGEOS'] = '0'

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from cartopy import crs as ccrs
from cartopy import feature as cfeature
import geopandas as gpd

<div class="alert alert-info"><b>Note: </b>The <code>os.environ</code> line in the previous cell deals with a warning message that some Geopandas versions print on import.</div>

### Read in the IBTrACS database

IBTrACS, the *International Best Track Archive for Climate Stewardship*, is a continually updated database of worldwide tropical cyclones.

The **IBTrACS** datasets have over 100 columns. Let's focus on just the subset of columns most relevant to the North Atlantic/Gulf of Mexico/Caribbean Sea basins.

In [None]:
keepCols = ['SEASON', 'NAME', 'ISO_TIME', 'LAT', 'LON', 'USA_STATUS', 'USA_WIND', 'USA_PRES']

Open a **pandas** `DataFrame` from a URL pointing to the North Atlantic IBTrACS dataset.

In [None]:
df = pd.read_csv(
    "https://www.ncei.noaa.gov/data/international-best-track-archive-for-climate-stewardship-ibtracs/v04r00/access/csv/ibtracs.NA.list.v04r00.csv",
    low_memory=False,
    skiprows=[1],
    usecols=keepCols,
)

<div class="alert alert-info"><b>Note: </b>We specified three additional options to Pandas' <code>read_csv</code> function: <code>low_memory</code>, <code>skiprows</code>, and <code>usecols</code>. You can explore the full suite of options via the <a href="https://pandas.pydata.org/docs/reference/api/pandas.read_csv.html">Pandas API documentation.</a></div>
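To see what these three options do, here is a minimal sketch on a tiny CSV held in memory. The rows are invented for illustration and are not real IBTrACS records; the sketch assumes, as `skiprows=[1]` suggests, that the second line of the file holds units rather than data.

```python
import io
import pandas as pd

# Hypothetical miniature file: header line, units line, then two data lines.
csv_text = (
    "SEASON,NAME,LAT,LON,EXTRA\n"
    "Year, ,degrees_north,degrees_east, \n"
    "2023,ALPHA,20.5,-60.0,x\n"
    "2023,ALPHA,21.0,-61.5,y\n"
)

demo = pd.read_csv(
    io.StringIO(csv_text),
    skiprows=[1],                               # skip the units line (0-based line index 1)
    usecols=["SEASON", "NAME", "LAT", "LON"],   # read only these columns; EXTRA is dropped
    low_memory=False,                           # parse the file as a whole, not in chunks
)

print(demo)
print(demo.dtypes)
```

Without `skiprows=[1]`, the units line would be read as data, and columns such as `LAT` would end up holding text instead of numbers.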

In [None]:
df

Create a new `DataFrame` consisting only of tropical cyclone Franklin (2023).

In [None]:
dfFranklin = df.query('NAME == "FRANKLIN" & SEASON == 2023')

<div class="alert alert-info"><b>Note: </b>Pandas' <code>query</code> method allows for database-like queries on a <code>DataFrame</code>. For more info, check out the documentation for the <a href="https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.query.html"><b>df.query</b></a> method.
</div>
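As a small illustration of how `query` relates to ordinary boolean indexing, the sketch below runs the same selection both ways on a made-up table. The storm names, wind values, and the `min_wind` variable are hypothetical, chosen only for the example.

```python
import pandas as pd

# Invented rows, for illustration only
storms = pd.DataFrame({
    "SEASON": [2022, 2023, 2023],
    "NAME": ["ALPHA", "BETA", "GAMMA"],
    "USA_WIND": [115, 130, 90],
})

min_wind = 100  # a local Python variable, referenced in the query with '@'

# Query-string form
by_query = storms.query("SEASON == 2023 and USA_WIND >= @min_wind")

# Equivalent boolean-mask form
by_mask = storms[(storms["SEASON"] == 2023) & (storms["USA_WIND"] >= min_wind)]

print(by_query)
```

Both forms select the same rows; the query string is often easier to read when several conditions are combined.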

In [None]:
dfFranklin

Make our Pandas `DataFrame` *geo-aware*. To do this, we create a Geopandas `GeoDataFrame`. It adds a `geometry` column, which may consist of shapes or points. The TC locations are points, so that's what we'll use to instantiate the `geometry` column. Note that `points_from_xy` takes x (longitude) first, then y (latitude).

In [None]:
gdf = gpd.GeoDataFrame(
    dfFranklin,
    geometry=gpd.points_from_xy(dfFranklin.LON, dfFranklin.LAT),
)

In [None]:
gdf

Note that `geometry` appears as a new column (a `GeoSeries`).
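The same construction can be sketched on two made-up points to make the pieces easier to inspect. The coordinates and the names `track` and `demo_gdf` are hypothetical.

```python
import pandas as pd
import geopandas as gpd

# Two invented positions, for illustration only
track = pd.DataFrame({"LAT": [20.5, 21.0], "LON": [-60.0, -61.5]})

# points_from_xy expects x (longitude) first, then y (latitude)
points = gpd.points_from_xy(track["LON"], track["LAT"])
demo_gdf = gpd.GeoDataFrame(track, geometry=points)

first = demo_gdf.geometry.iloc[0]
print(type(demo_gdf.geometry).__name__)  # GeoSeries
print(first.x, first.y)                  # -60.0 20.5
print(demo_gdf.crs)                      # None: no CRS has been assigned yet
```

The last line is worth noting: a `GeoDataFrame` built this way knows the point coordinates but not which coordinate reference system they belong to.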

We can interactively `explore` this `GeoDataFrame` as a map in the browser!

In [None]:
gdf.explore()

Well ... we have an interactive frame ... and it looks like there's a track in there ... but where is the interactive map?? 

We still have a little more work to do:

While the points certainly look like latitudes and longitudes, we need to explicitly assign a coordinate reference system (CRS) to the `GeoDataFrame` before we can view it on a map. One way is to assign a CRS code, via [EPSG](https://epsg.io). In this case we use [EPSG 4326](https://epsg.io/4326), which describes plain latitude/longitude coordinates on the WGS 84 datum.

Note the arguments to `set_crs`:

1. `epsg = 4326`: Assign the specified CRS
1. `inplace = True`: The `gdf` object is updated without the need to assign a new `GeoDataFrame` object
1. `allow_override = True`: If a CRS had previously been applied, override it with the EPSG value specified. The coordinates themselves are not transformed.

In [None]:
gdf.set_crs(epsg=4326, inplace=True, allow_override=True)
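To make the roles of these arguments concrete, here is a minimal sketch on a single made-up point. It also contrasts `set_crs`, which only labels the coordinates, with `to_crs`, which reprojects them; `to_crs` is not used elsewhere in this notebook and appears here only for comparison. EPSG 3857 is picked arbitrarily as a second CRS.

```python
import geopandas as gpd

# One invented point (lon, lat), with no CRS attached
pts = gpd.GeoDataFrame(geometry=gpd.points_from_xy([-60.0], [20.5]))

# Without inplace=True, set_crs returns a new object and leaves pts untouched
labeled = pts.set_crs(epsg=4326)
print(pts.crs, labeled.crs.to_epsg())

# Assigning a different CRS is refused unless allow_override=True
try:
    labeled.set_crs(epsg=3857)
    refused = False
except ValueError:
    refused = True

# allow_override swaps the label only; the coordinates stay as they were
relabeled = labeled.set_crs(epsg=3857, allow_override=True)

# to_crs actually transforms the coordinates into the target CRS
projected = labeled.to_crs(epsg=3857)

print(relabeled.geometry.iloc[0].x)
print(projected.geometry.iloc[0].x)
```

Because `allow_override=True` changes the label without moving the points, it is appropriate only when the existing CRS label is known to be wrong or missing.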

Now, let's try the `explore` function again!

In [None]:
gdf.explore()

We can pan, zoom, and hover over each point. Hovering shows the values of all the columns in the `GeoDataFrame`.

Now, let's select just one column from the `GeoDataFrame` and explore once again.

In [None]:
gdf.explore(column='USA_WIND')

Notice that the values aren't in numerical order ... they're being treated as strings. Let's explicitly set the `USA_WIND` column to be an integer (a 16-bit size is fine).

In [None]:
gdf['USA_WIND'] = gdf['USA_WIND'].astype('int16')
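A direct `astype('int16')` raises an error if any entry is not a valid integer string, for example a blank. The sketch below, on made-up values rather than real data, shows why text values sort out of numerical order and one more defensive conversion using `pd.to_numeric` together with pandas' nullable `Int16` type. Whether the iBTrACS wind column contains blanks for a given storm is an assumption to check against the data.

```python
import pandas as pd

# Invented wind values stored as text, including one blank entry
wind = pd.Series(["35", "100", " ", "65"])

# Text sorts character by character, so "100" comes before "35"
sorted_as_text = sorted(wind[wind.str.strip() != ""])
print(sorted_as_text)

# Unparseable entries become missing values instead of raising an error;
# the nullable Int16 dtype keeps the rest as integers
wind_num = pd.to_numeric(wind, errors="coerce").astype("Int16")
print(wind_num)
```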

In [None]:
gdf.explore('USA_WIND')

By default, passing in one column of numerical values will color-code each value!

## Things to try
### On your own, create Jupyter notebooks and do the following:
1. Select another storm from the historical record and plot its track.
1. Instead of just one specific named storm, create a `DataFrame` including all storms from a particular year, or from multiple years. Then plot the tracks of all storms from that period. Can you find a way to clearly show which track goes with each storm?
1. Examine the [Geopandas](https://geopandas.org/en/stable/) website and experiment with different visualizations, either from the IBTrACS database or another dataset of interest to you.

## References
1. [IBTrACS DOI](https://doi.org/10.25921/82ty-9e16)
1. [IBTrACS variable information](https://www.ncei.noaa.gov/sites/default/files/2021-07/IBTrACS_v04_column_documentation.pdf)