Skip to article frontmatterSkip to article content
Site not loading correctly?

This may be due to an incorrect BASE_URL configuration. See the MyST Documentation for reference.

GOSTrocks Primer

GOSTrocks is an internal sandbox repository that contains commonly used geospatial functions and operations. This notebooks demonstrates the usage of selected functions.

The repository also contains customized code to process specific datasets (GHSL, OpenStreetMap, Nighttime Lights, Fathom). This code is in constant development as project requirements evolve over time.

Open in Colab

Installation

From PyPI

pip install GOSTrocks

From Source

git clone https://github.com/worldbank/GOSTrocks.git
cd GOSTrocks
pip install .

Import Functions

GOSTrocks is broken down into modules of functions that share common elements.

import GOSTrocks.dataMisc as dMisc  # search and download specific datasets
import GOSTrocks.rasterMisc as rMisc  # Raster tools!

Raster Example

One of the most recurring workflows is to clip and combine raster data from different sources.

Here, we work with population data from WorldPop, and a friction surface from the Malaria Atlas Project, using Ghana as our area of interest.

They are both global datasets with ~1km resolution, so they both need to be clipped and standardized for analysis.

We will use administrative boundaries from the GeoBoundaries API.

import os
from os.path import join, exists
import urllib.request

import rasterio as rio
from rasterio.plot import show

import matplotlib.pyplot as plt
import matplotlib.colors as colors
country = "Ghana"
iso = "GHA"

Get Boundaries

dMisc.get_geoboundaries?
Signature:
dMisc.get_geoboundaries(
    iso3,
    level,
    geo_api='https://www.geoboundaries.org/api/current/gbOpen/{iso3}/{adm}/',
)
Docstring:
Download boundaries dataset from geobounadries

Parameters
----------
iso3 : bool
    ISO3 code of country to download
level : str
    Admin code to download in format of "ADM1" or "ADM2"
geo_api : str, optional
    by default "https://www.geoboundaries.org/api/current/gbOpen/{iso3}/{adm}/"

Returns
-------
gpd.GeoDataFrame
    spatial data representing the administrative boundaries
File:      ~/Documents/WorldBank/GOSTrocks/src/GOSTrocks/dataMisc.py
Type:      function
gdf = dMisc.get_geoboundaries(iso, level="ADM0")
gdf.plot()
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
File ~/Documents/WorldBank/GOSTrocks/src/GOSTrocks/dataMisc.py:130, in get_geoboundaries(iso3, level, geo_api)
    129 data = json.load(url)
--> 130 geo_data = gpd.read_file(data["gjDownloadURL"])
    131 return geo_data

File /opt/homebrew/Caskroom/miniconda/base/envs/gostrocks/lib/python3.8/site-packages/geopandas/io/file.py:281, in _read_file(filename, bbox, mask, rows, engine, **kwargs)
    279         path_or_bytes = filename
--> 281     return _read_file_fiona(
    282         path_or_bytes, from_bytes, bbox=bbox, mask=mask, rows=rows, **kwargs
    283     )
    285 else:

File /opt/homebrew/Caskroom/miniconda/base/envs/gostrocks/lib/python3.8/site-packages/geopandas/io/file.py:299, in _read_file_fiona(path_or_bytes, from_bytes, bbox, mask, rows, where, **kwargs)
    295 if not from_bytes:
    296     # Opening a file via URL or file-like-object above automatically detects a
    297     # zipped file. In order to match that behavior, attempt to add a zip scheme
    298     # if missing.
--> 299     if _is_zip(str(path_or_bytes)):
    300         parsed = fiona.parse_path(str(path_or_bytes))

File /opt/homebrew/Caskroom/miniconda/base/envs/gostrocks/lib/python3.8/site-packages/geopandas/io/file.py:166, in _is_zip(path)
    165 """Check if a given path is a zipfile"""
--> 166 parsed = fiona.path.ParsedPath.from_uri(path)
    167 return (
    168     parsed.archive.endswith(".zip")
    169     if parsed.archive
    170     else parsed.path.endswith(".zip")
    171 )

AttributeError: module 'fiona' has no attribute 'path'

During handling of the above exception, another exception occurred:

ValueError                                Traceback (most recent call last)
Cell In[5], line 1
----> 1 gdf = dMisc.get_geoboundaries(iso, level="ADM0")
      2 gdf.plot()

File ~/Documents/WorldBank/GOSTrocks/src/GOSTrocks/dataMisc.py:134, in get_geoboundaries(iso3, level, geo_api)
    132 except Exception:
    133     all_url = geo_api.format(iso3=iso3, adm="ALL")
--> 134     raise (
    135         ValueError(
    136             f"Cannot find admin dataset {cur_url}. Check out {all_url} for details on what is available"
    137         )
    138     )

ValueError: Cannot find admin dataset https://www.geoboundaries.org/api/current/gbOpen/GHA/ADM0/. Check out https://www.geoboundaries.org/api/current/gbOpen/GHA/ALL/ for details on what is available

Download Population Data

wp_url = f"https://data.worldpop.org/GIS/Population/Global_2000_2020_1km/2020/{iso.upper()}/{iso.lower()}_ppp_2020_1km_Aggregated.tif"
wp_url
'https://data.worldpop.org/GIS/Population/Global_2000_2020_1km/2020/GHA/gha_ppp_2020_1km_Aggregated.tif'
wp_path = join(
    "..", "data", f"{iso}_ppp_2020_1km_Aggregated_UNadj.tif"
)  # Download from link above
if not exists(wp_path):
    urllib.request.urlretrieve(wp_url, wp_path)
---------------------------------------------------------------------------
FileNotFoundError                         Traceback (most recent call last)
Cell In[10], line 5
      1 wp_path = join(
      2     "..", "data", f"{iso}_ppp_2020_1km_Aggregated_UNadj.tif"
      3 )  # Download from link above
      4 if not exists(wp_path):
----> 5     urllib.request.urlretrieve(wp_url, wp_path)

File /opt/homebrew/Caskroom/miniconda/base/envs/gostrocks/lib/python3.8/urllib/request.py:257, in urlretrieve(url, filename, reporthook, data)
    255 # Handle temporary file setup.
    256 if filename:
--> 257     tfp = open(filename, 'wb')
    258 else:
    259     tfp = tempfile.NamedTemporaryFile(delete=False)

FileNotFoundError: [Errno 2] No such file or directory: '../data/GHA_ppp_2020_1km_Aggregated_UNadj.tif'
src_pop = rio.open(wp_path)
fig, ax = plt.subplots(1, 1, figsize=(8, 8))
show(src_pop, norm=colors.PowerNorm(gamma=0.5), cmap="viridis", ax=ax)
plt.axis("off")
(np.float64(-3.251249987361092), np.float64(1.190416661538908), np.float64(4.732916852582885), np.float64(11.174583493482885))
<Figure size 800x800 with 1 Axes>

Clip Global Friction Surface

gfs_path = "/Users/glevin/Documents/WorldBank/GOSTrocks/2020_motorized_friction_surface.geotiff"
src_travel = rio.open(gfs_path)
# fig, ax = plt.subplots(1, 1, figsize=(8, 8))
# show(src_travel, norm=colors.PowerNorm(gamma=0.2), cmap="magma", ax=ax)
# plt.axis("off")
rMisc.clipRaster?
Signature: rMisc.clipRaster(inR, inD, outFile=None, crop=True)
Docstring:
Clip input raster to provided geodataframe

Parameters
----------
inR : rasterio.DatasetReader
    Input raster to clip
inD : geopandas.GeoDataFrame
    GeoDataFrame containing the clipping geometry
outFile : str, optional
    File to write the clipped raster to, by default None
crop : bool, optional
    Whether to crop the raster to the unary_union of the GeoDataFrame (True) or to the bounding box (False), by default True

Returns
-------
tuple
    A tuple containing the clipped raster and its metadata.
File:      ~/Documents/WorldBank/GOSTrocks/src/GOSTrocks/rasterMisc.py
Type:      function
out_travel = join("..", "data", f"{iso.upper()}_2020_motorized_friction_surface.tif")
surface, surface_meta = rMisc.clipRaster(
    src_travel, gdf, outFile=out_travel, crop=False
)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[14], line 3
      1 out_travel = join("..", "data", f"{iso.upper()}_2020_motorized_friction_surface.tif")
      2 surface, surface_meta = rMisc.clipRaster(
----> 3     src_travel, gdf, outFile=out_travel, crop=False
      4 )

NameError: name 'gdf' is not defined
fig, ax = plt.subplots(1, 1, figsize=(8, 8))
show(surface, norm=colors.PowerNorm(gamma=0.2), cmap="magma", ax=ax)
plt.axis("off")
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
Cell In[15], line 2
      1 fig, ax = plt.subplots(1, 1, figsize=(8, 8))
----> 2 show(surface, norm=colors.PowerNorm(gamma=0.2), cmap="magma", ax=ax)
      3 plt.axis("off")

NameError: name 'surface' is not defined
<Figure size 800x800 with 1 Axes>

Standardize Rasters

Align the population raster to the friction surface, ensuring that they have the same extent and resolution.

rMisc.standardizeInputRasters?
Signature:
rMisc.standardizeInputRasters(
    inR1,
    inR2,
    inR1_outFile='',
    resampling_type='nearest',
)
Docstring:
Standardize inR1 to inR2: changes crs, extent, and resolution.

:param inR1: rasterio object for raster to be modified
:type inR1: rasterio.DatasetReader
:param inR2: rasterio object to be standardized to
:type inR12 rasterio.DatasetReader
:param inR1_outfile: path to create output raster file of standardized inR1, default is '', which means nothing is written
:type inR1: string
:param resampling_type: how to perform spatial resampling; options are nearest (default), cubic, or sum
:type resampling_type: string
:return: array of numpy array, and rasterio metadata
:rtype: array
File:      ~/Documents/WorldBank/GOSTrocks/src/GOSTrocks/rasterMisc.py
Type:      function
src_travel_clipped = rio.open(out_travel)
pop_std, pop_meta = rMisc.standardizeInputRasters(
    src_pop, src_travel_clipped, resampling_type="nearest"
)

Original population raster metadata:

src_pop.meta
{'driver': 'GTiff', 'dtype': 'float32', 'nodata': -99999.0, 'width': 533, 'height': 773, 'count': 1, 'crs': CRS.from_wkt('GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]'), 'transform': Affine(0.0083333333, 0.0, -3.251249987361092, 0.0, -0.0083333333, 11.174583493482885)}

New population metadata aligned with travel surface:

pop_meta, surface_meta
({'driver': 'GTiff', 'dtype': 'float32', 'nodata': -99999.0, 'width': 535, 'height': 773, 'count': 1, 'crs': CRS.from_wkt('GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]'), 'transform': Affine(0.008333333333333333, 0.0, -3.2583333333333258, 0.0, -0.008333333333333333, 11.174999999999997)}, {'driver': 'GTiff', 'dtype': 'float32', 'nodata': -9999.0, 'width': 535, 'height': 773, 'count': 1, 'crs': CRS.from_wkt('GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]'), 'transform': Affine(0.008333333333333333, 0.0, -3.2583333333333258, 0.0, -0.008333333333333333, 11.174999999999997)})
src_pop.close()
src_travel.close()
src_travel_clipped.close()
os.remove(out_travel)
os.remove(wp_path)