GOSTrocks is an internal sandbox repository that contains commonly used geospatial functions and operations. This notebooks demonstrates the usage of selected functions.
The repository also contains customized code to process specific datasets (GHSL, OpenStreetMap, Nighttime Lights, Fathom). This code is in constant development as project requirements evolve over time.
Installation¶
From PyPI¶
pip install GOSTrocksFrom Source¶
git clone https://github.com/worldbank/GOSTrocks.git
cd GOSTrocks
pip install .Import Functions¶
GOSTrocks is broken down into modules of functions that share common elements.
import GOSTrocks.dataMisc as dMisc # search and download specific datasets
import GOSTrocks.rasterMisc as rMisc # Raster tools!Raster Example¶
One of the most recurring workflows is to clip and combine raster data from different sources.
Here, we work with population data from WorldPop, and a friction surface from the Malaria Atlas Project, using Ghana as our area of interest.
They are both global datasets with ~1km resolution, so they both need to be clipped and standardized for analysis.
We will use administrative boundaries from the GeoBoundaries API.
import os
from os.path import join, exists
import urllib.request
import rasterio as rio
from rasterio.plot import show
import matplotlib.pyplot as plt
import matplotlib.colors as colorscountry = "Ghana"
iso = "GHA"Get Boundaries¶
dMisc.get_geoboundaries?Signature:
dMisc.get_geoboundaries(
iso3,
level,
geo_api='https://www.geoboundaries.org/api/current/gbOpen/{iso3}/{adm}/',
)
Docstring:
Download boundaries dataset from geobounadries
Parameters
----------
iso3 : bool
ISO3 code of country to download
level : str
Admin code to download in format of "ADM1" or "ADM2"
geo_api : str, optional
by default "https://www.geoboundaries.org/api/current/gbOpen/{iso3}/{adm}/"
Returns
-------
gpd.GeoDataFrame
spatial data representing the administrative boundaries
File: ~/Documents/WorldBank/GOSTrocks/src/GOSTrocks/dataMisc.py
Type: functiongdf = dMisc.get_geoboundaries(iso, level="ADM0")
gdf.plot()---------------------------------------------------------------------------
AttributeError Traceback (most recent call last)
File ~/Documents/WorldBank/GOSTrocks/src/GOSTrocks/dataMisc.py:130, in get_geoboundaries(iso3, level, geo_api)
129 data = json.load(url)
--> 130 geo_data = gpd.read_file(data["gjDownloadURL"])
131 return geo_data
File /opt/homebrew/Caskroom/miniconda/base/envs/gostrocks/lib/python3.8/site-packages/geopandas/io/file.py:281, in _read_file(filename, bbox, mask, rows, engine, **kwargs)
279 path_or_bytes = filename
--> 281 return _read_file_fiona(
282 path_or_bytes, from_bytes, bbox=bbox, mask=mask, rows=rows, **kwargs
283 )
285 else:
File /opt/homebrew/Caskroom/miniconda/base/envs/gostrocks/lib/python3.8/site-packages/geopandas/io/file.py:299, in _read_file_fiona(path_or_bytes, from_bytes, bbox, mask, rows, where, **kwargs)
295 if not from_bytes:
296 # Opening a file via URL or file-like-object above automatically detects a
297 # zipped file. In order to match that behavior, attempt to add a zip scheme
298 # if missing.
--> 299 if _is_zip(str(path_or_bytes)):
300 parsed = fiona.parse_path(str(path_or_bytes))
File /opt/homebrew/Caskroom/miniconda/base/envs/gostrocks/lib/python3.8/site-packages/geopandas/io/file.py:166, in _is_zip(path)
165 """Check if a given path is a zipfile"""
--> 166 parsed = fiona.path.ParsedPath.from_uri(path)
167 return (
168 parsed.archive.endswith(".zip")
169 if parsed.archive
170 else parsed.path.endswith(".zip")
171 )
AttributeError: module 'fiona' has no attribute 'path'
During handling of the above exception, another exception occurred:
ValueError Traceback (most recent call last)
Cell In[5], line 1
----> 1 gdf = dMisc.get_geoboundaries(iso, level="ADM0")
2 gdf.plot()
File ~/Documents/WorldBank/GOSTrocks/src/GOSTrocks/dataMisc.py:134, in get_geoboundaries(iso3, level, geo_api)
132 except Exception:
133 all_url = geo_api.format(iso3=iso3, adm="ALL")
--> 134 raise (
135 ValueError(
136 f"Cannot find admin dataset {cur_url}. Check out {all_url} for details on what is available"
137 )
138 )
ValueError: Cannot find admin dataset https://www.geoboundaries.org/api/current/gbOpen/GHA/ADM0/. Check out https://www.geoboundaries.org/api/current/gbOpen/GHA/ALL/ for details on what is availableDownload Population Data¶
wp_url = f"https://data.worldpop.org/GIS/Population/Global_2000_2020_1km/2020/{iso.upper()}/{iso.lower()}_ppp_2020_1km_Aggregated.tif"
wp_url'https://data.worldpop.org/GIS/Population/Global_2000_2020_1km/2020/GHA/gha_ppp_2020_1km_Aggregated.tif'wp_path = join(
"..", "data", f"{iso}_ppp_2020_1km_Aggregated_UNadj.tif"
) # Download from link above
if not exists(wp_path):
urllib.request.urlretrieve(wp_url, wp_path)---------------------------------------------------------------------------
FileNotFoundError Traceback (most recent call last)
Cell In[10], line 5
1 wp_path = join(
2 "..", "data", f"{iso}_ppp_2020_1km_Aggregated_UNadj.tif"
3 ) # Download from link above
4 if not exists(wp_path):
----> 5 urllib.request.urlretrieve(wp_url, wp_path)
File /opt/homebrew/Caskroom/miniconda/base/envs/gostrocks/lib/python3.8/urllib/request.py:257, in urlretrieve(url, filename, reporthook, data)
255 # Handle temporary file setup.
256 if filename:
--> 257 tfp = open(filename, 'wb')
258 else:
259 tfp = tempfile.NamedTemporaryFile(delete=False)
FileNotFoundError: [Errno 2] No such file or directory: '../data/GHA_ppp_2020_1km_Aggregated_UNadj.tif'src_pop = rio.open(wp_path)fig, ax = plt.subplots(1, 1, figsize=(8, 8))
show(src_pop, norm=colors.PowerNorm(gamma=0.5), cmap="viridis", ax=ax)
plt.axis("off")(np.float64(-3.251249987361092),
np.float64(1.190416661538908),
np.float64(4.732916852582885),
np.float64(11.174583493482885))
Clip Global Friction Surface¶
gfs_path = "/Users/glevin/Documents/WorldBank/GOSTrocks/2020_motorized_friction_surface.geotiff"
src_travel = rio.open(gfs_path)# fig, ax = plt.subplots(1, 1, figsize=(8, 8))
# show(src_travel, norm=colors.PowerNorm(gamma=0.2), cmap="magma", ax=ax)
# plt.axis("off")rMisc.clipRaster?Signature: rMisc.clipRaster(inR, inD, outFile=None, crop=True)
Docstring:
Clip input raster to provided geodataframe
Parameters
----------
inR : rasterio.DatasetReader
Input raster to clip
inD : geopandas.GeoDataFrame
GeoDataFrame containing the clipping geometry
outFile : str, optional
File to write the clipped raster to, by default None
crop : bool, optional
Whether to crop the raster to the unary_union of the GeoDataFrame (True) or to the bounding box (False), by default True
Returns
-------
tuple
A tuple containing the clipped raster and its metadata.
File: ~/Documents/WorldBank/GOSTrocks/src/GOSTrocks/rasterMisc.py
Type: functionout_travel = join("..", "data", f"{iso.upper()}_2020_motorized_friction_surface.tif")
surface, surface_meta = rMisc.clipRaster(
src_travel, gdf, outFile=out_travel, crop=False
)---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[14], line 3
1 out_travel = join("..", "data", f"{iso.upper()}_2020_motorized_friction_surface.tif")
2 surface, surface_meta = rMisc.clipRaster(
----> 3 src_travel, gdf, outFile=out_travel, crop=False
4 )
NameError: name 'gdf' is not definedfig, ax = plt.subplots(1, 1, figsize=(8, 8))
show(surface, norm=colors.PowerNorm(gamma=0.2), cmap="magma", ax=ax)
plt.axis("off")---------------------------------------------------------------------------
NameError Traceback (most recent call last)
Cell In[15], line 2
1 fig, ax = plt.subplots(1, 1, figsize=(8, 8))
----> 2 show(surface, norm=colors.PowerNorm(gamma=0.2), cmap="magma", ax=ax)
3 plt.axis("off")
NameError: name 'surface' is not defined
Standardize Rasters¶
Align the population raster to the friction surface, ensuring that they have the same extent and resolution.
rMisc.standardizeInputRasters?Signature:
rMisc.standardizeInputRasters(
inR1,
inR2,
inR1_outFile='',
resampling_type='nearest',
)
Docstring:
Standardize inR1 to inR2: changes crs, extent, and resolution.
:param inR1: rasterio object for raster to be modified
:type inR1: rasterio.DatasetReader
:param inR2: rasterio object to be standardized to
:type inR12 rasterio.DatasetReader
:param inR1_outfile: path to create output raster file of standardized inR1, default is '', which means nothing is written
:type inR1: string
:param resampling_type: how to perform spatial resampling; options are nearest (default), cubic, or sum
:type resampling_type: string
:return: array of numpy array, and rasterio metadata
:rtype: array
File: ~/Documents/WorldBank/GOSTrocks/src/GOSTrocks/rasterMisc.py
Type: functionsrc_travel_clipped = rio.open(out_travel)pop_std, pop_meta = rMisc.standardizeInputRasters(
src_pop, src_travel_clipped, resampling_type="nearest"
)Original population raster metadata:
src_pop.meta{'driver': 'GTiff',
'dtype': 'float32',
'nodata': -99999.0,
'width': 533,
'height': 773,
'count': 1,
'crs': CRS.from_wkt('GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]'),
'transform': Affine(0.0083333333, 0.0, -3.251249987361092,
0.0, -0.0083333333, 11.174583493482885)}New population metadata aligned with travel surface:
pop_meta, surface_meta({'driver': 'GTiff',
'dtype': 'float32',
'nodata': -99999.0,
'width': 535,
'height': 773,
'count': 1,
'crs': CRS.from_wkt('GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]'),
'transform': Affine(0.008333333333333333, 0.0, -3.2583333333333258,
0.0, -0.008333333333333333, 11.174999999999997)},
{'driver': 'GTiff',
'dtype': 'float32',
'nodata': -9999.0,
'width': 535,
'height': 773,
'count': 1,
'crs': CRS.from_wkt('GEOGCS["WGS 84",DATUM["WGS_1984",SPHEROID["WGS 84",6378137,298.257223563,AUTHORITY["EPSG","7030"]],AUTHORITY["EPSG","6326"]],PRIMEM["Greenwich",0,AUTHORITY["EPSG","8901"]],UNIT["degree",0.0174532925199433,AUTHORITY["EPSG","9122"]],AXIS["Latitude",NORTH],AXIS["Longitude",EAST],AUTHORITY["EPSG","4326"]]'),
'transform': Affine(0.008333333333333333, 0.0, -3.2583333333333258,
0.0, -0.008333333333333333, 11.174999999999997)})src_pop.close()
src_travel.close()
src_travel_clipped.close()os.remove(out_travel)
os.remove(wp_path)