Rasters 1: Intro to raster data

Overview

  1. Access real-time NWP data from Unidata’s THREDDS server

  2. Use Xarray’s plot.imshow to display gridded data as an image

  3. Use Xarray’s open_rasterio method to open and view high-resolution elevation data

  4. Close data objects when you are no longer using them.

Prerequisites

Concepts

Importance

Notes

Xarray

Necessary

  • Time to learn: 15 minutes


Imports

import xarray as xr
import rioxarray as rxr
from datetime import datetime, timedelta

Access real-time NWP data from Unidata’s THREDDS server

Parse the current time and choose a recent hour when we might expect the model run to be complete. There is normally a ~90 minute lag, but we’ll give it a bit more than that.

now = datetime.now()
year = now.year
month = now.month
day = now.day
hour = now.hour
minute = now.minute
print (year, month, day, hour,minute)
2023 10 17 17 58
# The previous hour's data is typically available shortly after the half-hour, but give it a little extra time 
if (minute < 45): 
    hrDelta = 3
else:
    hrDelta = 2
runTime = now - timedelta(hours=hrDelta)
runTimeStr = runTime.strftime('%Y%m%d %H00 UTC')
modelDate = runTime.strftime('%Y%m%d')
modelHour = runTime.strftime('%H')
modelDay = runTime.strftime('%D')
print (modelDay)
print (modelHour)
print (runTimeStr)
10/17/23
15
20231017 1500 UTC

Create an object pointing to the dataset.

data_url = 'https://thredds.ucar.edu/thredds/dodsC/grib/NCEP/HRRR/CONUS_2p5km/HRRR_CONUS_2p5km_'+ modelDate + '_' + modelHour + '00.grib2'
data_url
'https://thredds.ucar.edu/thredds/dodsC/grib/NCEP/HRRR/CONUS_2p5km/HRRR_CONUS_2p5km_20231017_1500.grib2'
ds = xr.open_dataset(data_url)
xrdSize = ds.nbytes / 1e9
print ("Size of dataset = %.3f GB" % xrdSize)
Size of dataset = 18.159 GB
ds
<xarray.Dataset>
Dimensions:                                                                                                              (
                                                                                                                          x: 2145,
                                                                                                                          y: 1377,
                                                                                                                          time: 19,
                                                                                                                          time1: 18,
                                                                                                                          time2: 19,
                                                                                                                          ...
                                                                                                                          height_above_ground_layer2_bounds_1: 2,
                                                                                                                          pressure_difference_layer2_bounds_1: 2,
                                                                                                                          height_above_ground_layer3_bounds_1: 2,
                                                                                                                          isobaric_layer_bounds_1: 2,
                                                                                                                          sigma_layer_bounds_1: 2,
                                                                                                                          pressure_difference_layer3_bounds_1: 2)
Coordinates: (12/22)
  * x                                                                                                                    (x) float32 ...
  * y                                                                                                                    (y) float32 ...
    reftime                                                                                                              datetime64[ns] ...
  * time                                                                                                                 (time) datetime64[ns] ...
  * time1                                                                                                                (time1) datetime64[ns] ...
  * time2                                                                                                                (time2) datetime64[ns] ...
    ...                                                                                                                   ...
  * height_above_ground3                                                                                                 (height_above_ground3) float32 ...
  * isobaric_layer                                                                                                       (isobaric_layer) float32 ...
  * sigma_layer                                                                                                          (sigma_layer) float32 ...
  * pressure_difference_layer3                                                                                           (pressure_difference_layer3) float32 ...
  * isobaric                                                                                                             (isobaric) float32 ...
  * isobaric1                                                                                                            (isobaric1) float32 ...
Dimensions without coordinates: time1_bounds_1, time2_bounds_1,
                                pressure_difference_layer_bounds_1,
                                pressure_difference_layer1_bounds_1,
                                height_above_ground_layer_bounds_1,
                                height_above_ground_layer1_bounds_1,
                                height_above_ground_layer2_bounds_1,
                                pressure_difference_layer2_bounds_1,
                                height_above_ground_layer3_bounds_1,
                                isobaric_layer_bounds_1, sigma_layer_bounds_1,
                                pressure_difference_layer3_bounds_1
Data variables: (12/68)
    LambertConformal_Projection                                                                                          int32 ...
    time1_bounds                                                                                                         (time1, time1_bounds_1) datetime64[ns] ...
    time2_bounds                                                                                                         (time2, time2_bounds_1) datetime64[ns] ...
    pressure_difference_layer_bounds                                                                                     (pressure_difference_layer, pressure_difference_layer_bounds_1) float32 ...
    pressure_difference_layer1_bounds                                                                                    (pressure_difference_layer1, pressure_difference_layer1_bounds_1) float32 ...
    height_above_ground_layer_bounds                                                                                     (height_above_ground_layer, height_above_ground_layer_bounds_1) float32 ...
    ...                                                                                                                   ...
    u-component_of_wind_isobaric                                                                                         (time, isobaric1, y, x) float32 ...
    u-component_of_wind_height_above_ground                                                                              (time, height_above_ground2, y, x) float32 ...
    u-component_storm_motion_height_above_ground_layer                                                                   (time, height_above_ground_layer1, y, x) float32 ...
    v-component_of_wind_isobaric                                                                                         (time, isobaric1, y, x) float32 ...
    v-component_of_wind_height_above_ground                                                                              (time, height_above_ground2, y, x) float32 ...
    v-component_storm_motion_height_above_ground_layer                                                                   (time, height_above_ground_layer1, y, x) float32 ...
Attributes:
    Originating_or_generating_Center:                                        ...
    Originating_or_generating_Subcenter:                                     ...
    GRIB_table_version:                                                      ...
    Type_of_generating_process:                                              ...
    Analysis_or_forecast_generating_process_identifier_defined_by_originating...
    file_format:                                                             ...
    Conventions:                                                             ...
    history:                                                                 ...
    featureType:                                                             ...
    _CoordSysBuilder:                                                        ...
    EXTRA_DIMENSION.reftime:                                                 ...
Tip: This Dataset, while stored in its native (GRIB2) format, presents itself to Xarray as NetCDF. The THREDDS server automatically translates GRIB to NetCDF.
Notice as well that its horizontal dimensions, 2145 x 1377, differs from the HRRR datasets served via cloud providers such as AWS or Google Cloud (1799 x 1099)! The former is a bit higher in horizontal resolution (2.5 km vs 3 km).

Note that the horizontal dimensions of this Dataset are not longitude/latitude. Instead, they represent distances in km from a central origin.

ds.x
<xarray.DataArray 'x' (x: 2145)>
array([-2763.217 , -2760.6772, -2758.1377, ...,  2676.8267,  2679.3662,
        2681.906 ], dtype=float32)
Coordinates:
  * x        (x) float32 -2.763e+03 -2.761e+03 ... 2.679e+03 2.682e+03
    reftime  datetime64[ns] ...
Attributes:
    standard_name:        projection_x_coordinate
    units:                km
    _CoordinateAxisType:  GeoX
ds.y
<xarray.DataArray 'y' (y: 1377)>
array([-263.79062, -261.25092, -258.7112 , ..., 3225.7612 , 3228.3008 ,
       3230.8406 ], dtype=float32)
Coordinates:
  * y        (y) float32 -263.8 -261.3 -258.7 ... 3.226e+03 3.228e+03 3.231e+03
    reftime  datetime64[ns] ...
Attributes:
    standard_name:        projection_y_coordinate
    units:                km
    _CoordinateAxisType:  GeoY

Use Xarray’s plot.imshow to display gridded data as an image

One of the output grids is surface pressure. We can visualize it as a proxy for surface elevation.

ds.Pressure_surface
<xarray.DataArray 'Pressure_surface' (time: 19, y: 1377, x: 2145)>
[56119635 values with dtype=float32]
Coordinates:
  * x        (x) float32 -2.763e+03 -2.761e+03 ... 2.679e+03 2.682e+03
  * y        (y) float32 -263.8 -261.3 -258.7 ... 3.226e+03 3.228e+03 3.231e+03
    reftime  datetime64[ns] ...
  * time     (time) datetime64[ns] 2023-10-17T15:00:00 ... 2023-10-18T09:00:00
Attributes: (12/13)
    long_name:                       Pressure @ Ground or water surface
    units:                           Pa
    abbreviation:                    PRES
    grid_mapping:                    LambertConformal_Projection
    Grib_Variable_Id:                VAR_0-3-0_L1
    Grib2_Parameter:                 [0 3 0]
    ...                              ...
    Grib2_Parameter_Category:        Mass
    Grib2_Parameter_Name:            Pressure
    Grib2_Level_Type:                1
    Grib2_Level_Desc:                Ground or water surface
    Grib2_Generating_Process_Type:   Forecast
    Grib2_Statistical_Process_Type:  UnknownStatType--1

Use Xarray to generate an image, as opposed to a filled contour plot, for the first time in the Dataset.

ds.Pressure_surface.isel(time=0).plot.imshow(figsize=(15,10))
<matplotlib.image.AxesImage at 0x14cdde479a50>
../../_images/8764aac7dc64488d31708bf33674c81139945432737fde06c1451848c46e8710.png
WARNING: A quirk in how a THREDDS data server interprets a GRIB dataset means that the coordinate variable names may change unpredictabily from run to run. In this case, the time dimension was named time. When you run this notebook, the dimension name might also be time, but it could be something else ... such as time1 or time2 Modify the cell above if this occurs.
Tip: Rather than creating filled contours, we have used Xarray's Matplotlib interface to create an image map of surface pressure. Its resolution is the same as the dataset ... in this case, approximately 2.5 km. It fits the definition of what is called raster data.
Note: But there exist other rasterized representations of elevation ... at even higher resolutions ... that are in a particular image format (tiff). Since we never expect these elevations to change (except on very long time scales), we can use these static datasets to complement our visualizations.

Use rioxarray’s open_rasterio method to open high-resolution elevation data

Natural Earth, the same source that Cartopy utilizes for many of the background features we have used, also provides raster datasets for download. The medium and high-resolution manually-shaded relief files can be accessed from their location in the course data directory.

ds50m = rxr.open_rasterio('/spare11/atm533/data/raster/MSR_50M/MSR_50M.tif')
ds10m = rxr.open_rasterio('/spare11/atm533/data/raster/US_MSR_10M/US_MSR.tif')

Examine these Datasets

We now have two Xarray DataArray objects. They are three-dimensional (band, y, and x), with the new dimension called band. There is only one band in each of these data arrays, so we will just pass its one index (0) directly to imshow. Note the horizontal dimemsions … in both cases, quite a bit higher than the already high-resolution HRRR grid we viewed above. What parts of the world do they represent?

ds50m.sel(band=1).plot.imshow(figsize=(15,10))
<matplotlib.image.AxesImage at 0x14cdac1c6510>
../../_images/9da290f2afd7aee28923b881da10eba9cf93df73331f98cc4fd6b378aa6293a2.png
ds10m.sel(band=1).plot.imshow(figsize=(15,10))
<matplotlib.image.AxesImage at 0x14cd3c50b250>
../../_images/99176e9bd9ab6e77d5f2f234573745091849babc21bfdd912596af46e22d29b0.png
Tip: Notice the x- and y-axis labels. What do you think they represent? Look at the attributes of the spatial_ref coordinate variable for more info.
We'll explore these georeferenced TIFF (tagged image file format) images in more detail in subsequent notebooks!
ds10m.spatial_ref.crs_wkt
'PROJCS["WGS 84 / Pseudo-Mercator",GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AUTHORITY["EPSG","4326"]],PROJECTION["Mercator_1SP"],PARAMETER["central_meridian",0],PARAMETER["scale_factor",1],PARAMETER["false_easting",0],PARAMETER["false_northing",0],UNIT["metre",1,AUTHORITY["EPSG","9001"]],AXIS["Easting",EAST],AXIS["Northing",NORTH],EXTENSION["PROJ4","+proj=merc +a=6378137 +b=6378137 +lat_ts=0 +lon_0=0 +x_0=0 +y_0=0 +k=1 +units=m +nadgrids=@null +wktext +no_defs"],AUTHORITY["EPSG","3857"]]'

Close the file handles so they don’t stay resident in memory.

ds.close()
ds10m.close()
ds50m.close()

Summary

  • We can access real-time NWP data via the Unidata THREDDS server

  • THREDDS automatically translates data in supported formats (such as GRIB) into NetCDF.

  • Gridded data arrays are a type of raster data.

  • Matplotlib’s imshow method displays raster data as an image.

  • A frequently-used image format is tiff.

  • TIFFS can be geo-referenced, making them very useful in map-based visualizations.

What’s Next?

In the next notebook, we’ll use our geo-referenced TIFF as a map layer and overlay NWP data on top of it.