Python Client

A Python client for accessing the Space2Stats API, providing easy access to consistent, comparable, and authoritative sub-national variation data from the World Bank.

class space2stats_client.Space2StatsClient(base_url: str = 'https://space2stats.ds.io', verify_ssl: bool = True)

Bases: object

Client for interacting with the Space2Stats API.

This client is provided by the World Bank to access spatial statistics data. The World Bank makes no warranties regarding the accuracy, reliability or completeness of the results and content.

get_topics() → DataFrame

Get a table of items (dataset themes/topics) from the STAC catalog.

get_properties(item_id: str) → Dict

Get a table with a description of variables for a given dataset (item).

get_fields() → List[str]

Get a list of all available fields from the Space2Stats API.

Returns:

A list of field names that can be used with the API.

Return type:

List[str]

Raises:

Exception – If the API request fails.
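
Because get_fields() returns every field name the API accepts, it can be used to check a request before sending it. The helper below is a minimal sketch of that idea; check_fields is a hypothetical name and is not part of the package, and "some_field" in the usage comment is a placeholder.

```python
from typing import Iterable, List


def check_fields(client, requested: Iterable[str]) -> List[str]:
    """Return the requested fields, or raise if any are unknown to the API.

    `client` is expected to be a Space2StatsClient; any object exposing
    get_fields() works. Illustrative helper, not part of the package.
    """
    available = set(client.get_fields())
    requested = list(requested)
    missing = [name for name in requested if name not in available]
    if missing:
        raise ValueError(f"Unknown field(s): {', '.join(missing)}")
    return requested


# Usage (needs network access and the space2stats-client package):
#   from space2stats_client import Space2StatsClient
#   fields = check_fields(Space2StatsClient(), ["some_field"])
```

Failing early on the client side gives a clearer error message than a rejected API request.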

fetch_admin_boundaries(iso3: str, adm: str) → GeoDataFrame

Fetch administrative boundaries from the GeoBoundaries API.

get_summary(gdf: GeoDataFrame, spatial_join_method: Literal['touches', 'centroid', 'within'], fields: List[str], geometry: Literal['polygon', 'point'] | None = None, verbose: bool = True) → DataFrame

Extract H3-level data from Space2Stats for a GeoDataFrame.

Parameters:
  • gdf (GeoDataFrame) – The Areas of Interest

  • spatial_join_method (["touches", "centroid", "within"]) –

    The method to use for performing the spatial join between the AOI and H3 cells
    • "touches": Includes H3 cells that touch the AOI

    • "centroid": Includes H3 cells where the centroid falls within the AOI

    • "within": Includes H3 cells entirely within the AOI

  • fields (List[str]) – A list of field names to retrieve from the statistics table.

  • geometry (Optional["polygon", "point"]) – Specifies if the H3 geometries should be included in the response.

  • verbose (bool) – Whether to display progress messages (default: True)

Returns:

A DataFrame with the requested fields for each H3 cell.

Return type:

DataFrame
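
The three spatial_join_method options select progressively fewer cells. The sketch below illustrates the difference with axis-aligned boxes standing in for H3 hexagons; it is a simplified model of the three predicates, not the API's implementation, and all names in it are hypothetical.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def _touches(cell: Box, aoi: Box) -> bool:
    # The cell and the AOI share at least one point.
    return (cell[0] <= aoi[2] and cell[2] >= aoi[0]
            and cell[1] <= aoi[3] and cell[3] >= aoi[1])


def _centroid(cell: Box, aoi: Box) -> bool:
    # The cell's centre point lies inside the AOI.
    cx, cy = (cell[0] + cell[2]) / 2, (cell[1] + cell[3]) / 2
    return aoi[0] <= cx <= aoi[2] and aoi[1] <= cy <= aoi[3]


def _within(cell: Box, aoi: Box) -> bool:
    # The whole cell lies inside the AOI.
    return (aoi[0] <= cell[0] and cell[2] <= aoi[2]
            and aoi[1] <= cell[1] and cell[3] <= aoi[3])


PREDICATES = {"touches": _touches, "centroid": _centroid, "within": _within}


def select_cells(cells: Dict[str, Box], aoi: Box, method: str) -> List[str]:
    """Return the IDs of the cells selected by the given join method."""
    predicate = PREDICATES[method]
    return [cell_id for cell_id, box in cells.items() if predicate(box, aoi)]


aoi = (0, 0, 10, 10)
cells = {
    "inside": (1, 1, 3, 3),
    "straddling": (7, 4, 11, 8),  # centroid (9, 6) is inside the AOI
    "edge": (9, 0, 13, 4),        # centroid (11, 2) is outside the AOI
    "far": (20, 20, 22, 22),
}
for method in ("touches", "centroid", "within"):
    print(method, select_cells(cells, aoi, method))
# touches ['inside', 'straddling', 'edge']
# centroid ['inside', 'straddling']
# within ['inside']
```

In this model "centroid" assigns each cell to at most one of several non-overlapping AOIs, which is why it is a common choice when totals should not double count cells along shared borders.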

get_aggregate(gdf: GeoDataFrame, spatial_join_method: Literal['touches', 'centroid', 'within'], fields: list, aggregation_type: Literal['sum', 'avg', 'count', 'max', 'min'], verbose: bool = True) → DataFrame

Extract summary statistics from the underlying H3 Space2Stats data.

Parameters:
  • gdf (GeoDataFrame) – The Areas of Interest

  • spatial_join_method (["touches", "centroid", "within"]) – The method to use for performing the spatial join

  • fields (List[str]) – A list of field names to retrieve

  • aggregation_type (["sum", "avg", "count", "max", "min"]) – Statistical function to apply to each field per AOI.

  • verbose (bool) – Whether to display progress messages (default: True)

Returns:

A DataFrame with the aggregated statistics.

Return type:

DataFrame
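
A common pattern is to fetch boundaries with fetch_admin_boundaries() and pass them straight to get_aggregate(). The wrapper below is a sketch of that pattern; aggregate_for_country is a hypothetical helper, and the "ADM2" level and "some_field" name in the usage comment are assumed placeholder values.

```python
from typing import Iterable


def aggregate_for_country(
    client,
    iso3: str,
    adm: str,
    fields: Iterable[str],
    aggregation_type: str = "sum",
):
    """Fetch boundaries for one country and aggregate fields per boundary.

    `client` is expected to be a Space2StatsClient. Illustrative helper,
    not part of the package.
    """
    gdf = client.fetch_admin_boundaries(iso3, adm)
    return client.get_aggregate(
        gdf=gdf,
        spatial_join_method="centroid",
        fields=list(fields),
        aggregation_type=aggregation_type,
        verbose=False,
    )


# Usage (needs network access and the space2stats-client package):
#   from space2stats_client import Space2StatsClient
#   totals = aggregate_for_country(Space2StatsClient(), "KEN", "ADM2", ["some_field"])
```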

get_summary_by_hexids(hex_ids: List[str], fields: List[str], geometry: Literal['polygon', 'point'] | None = None, verbose: bool = True) → DataFrame

Retrieve statistics for specific hex IDs.

Parameters:
  • hex_ids (List[str]) – List of H3 hexagon IDs to query

  • fields (List[str]) – List of field names to retrieve from the statistics table

  • geometry (Optional[Literal["polygon", "point"]]) – Specifies if the H3 geometries should be included in the response.

  • verbose (bool) – Whether to display progress messages (default: True)

Returns:

A DataFrame with the requested fields for each H3 cell.

Return type:

DataFrame

get_aggregate_by_hexids(hex_ids: List[str], fields: List[str], aggregation_type: Literal['sum', 'avg', 'count', 'max', 'min'], verbose: bool = True) → DataFrame

Aggregate statistics for specific hex IDs.

Parameters:
  • hex_ids (List[str]) – List of H3 hexagon IDs to aggregate

  • fields (List[str]) – List of field names to aggregate

  • aggregation_type (Literal["sum", "avg", "count", "max", "min"]) – Type of aggregation to perform

  • verbose (bool) – Whether to display progress messages (default: True)

Returns:

A DataFrame with the aggregated statistics.

Return type:

DataFrame
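
To make the five aggregation_type values concrete, the sketch below applies each one to a small list of per-cell values for a single field. It assumes the conventional meaning of each name; how the API treats missing values or empty selections is not covered here, and the helper names are hypothetical.

```python
from typing import Callable, Dict, Sequence

# One plain-Python function per aggregation_type value.
AGGREGATORS: Dict[str, Callable[[Sequence[float]], float]] = {
    "sum": sum,
    "avg": lambda values: sum(values) / len(values),
    "count": len,
    "max": max,
    "min": min,
}


def aggregate(values: Sequence[float], aggregation_type: str) -> float:
    """Collapse the per-cell values of one field into a single number."""
    if aggregation_type not in AGGREGATORS:
        raise ValueError(f"Unsupported aggregation_type: {aggregation_type!r}")
    return AGGREGATORS[aggregation_type](values)


cell_values = [120.0, 80.0, 100.0]  # one value per H3 cell
for name in AGGREGATORS:
    print(name, aggregate(cell_values, name))
# sum 300.0
# avg 100.0
# count 3
# max 120.0
# min 80.0
```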

get_timeseries_fields() → List[str]

Get available fields from the timeseries table.

Returns:

List of field names available in the timeseries table

Return type:

List[str]

Raises:

Exception – If the API request fails

get_timeseries(gdf: GeoDataFrame, spatial_join_method: Literal['touches', 'centroid', 'within'], start_date: str | None = None, end_date: str | None = None, fields: List[str] | None = None, geometry: Literal['polygon', 'point'] | None = None, verbose: bool = True) → DataFrame

Get timeseries data for areas of interest.

Parameters:
  • gdf (GeoDataFrame) – The Areas of Interest

  • spatial_join_method (["touches", "centroid", "within"]) –

    The method to use for performing the spatial join between the AOI and H3 cells
    • "touches": Includes H3 cells that touch the AOI

    • "centroid": Includes H3 cells where the centroid falls within the AOI

    • "within": Includes H3 cells entirely within the AOI

  • start_date (Optional[str]) – Start date for filtering data (format: ‘YYYY-MM-DD’)

  • end_date (Optional[str]) – End date for filtering data (format: ‘YYYY-MM-DD’)

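  • fields (Optional[List[str]]) – List of fields to retrieve. If None, all available fields will be returned.

  • geometry (Optional[Literal["polygon", "point"]]) – Specifies if the H3 geometries should be included in the response.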

  • verbose (bool) – Whether to display progress messages (default: True)

Returns:

A DataFrame containing timeseries data for each hex ID and date

Return type:

DataFrame

get_timeseries_by_hexids(hex_ids: List[str], fields: List[str], start_date: str | None = None, end_date: str | None = None, geometry: Literal['polygon', 'point'] | None = None, verbose: bool = True) → DataFrame

Get timeseries data for specific hex IDs.

Parameters:
  • hex_ids (List[str]) – List of H3 hexagon IDs to query

  • fields (List[str]) – List of fields to retrieve from the timeseries table

  • start_date (Optional[str]) – Start date for filtering data (format: ‘YYYY-MM-DD’)

  • end_date (Optional[str]) – End date for filtering data (format: ‘YYYY-MM-DD’)

  • geometry (Optional[Literal["polygon", "point"]]) – Specifies if the H3 geometries should be included in the response.

  • verbose (bool) – Whether to display progress messages (default: True)

Returns:

A DataFrame containing timeseries data for each hex ID and date

Return type:

DataFrame
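
Both timeseries methods take start_date and end_date as 'YYYY-MM-DD' strings. A small client-side check, sketched below, can catch malformed or reversed dates before a request is made. parse_date_range is a hypothetical helper, and whether the API itself rejects a reversed range is not stated here.

```python
from datetime import date, datetime
from typing import Optional, Tuple

DATE_FORMAT = "%Y-%m-%d"  # the 'YYYY-MM-DD' format from the parameter docs


def parse_date_range(
    start_date: Optional[str], end_date: Optional[str]
) -> Tuple[Optional[date], Optional[date]]:
    """Parse optional date strings and check that start is not after end.

    datetime.strptime raises ValueError for strings that do not match
    DATE_FORMAT. None is passed through, mirroring the optional parameters.
    """
    start = datetime.strptime(start_date, DATE_FORMAT).date() if start_date else None
    end = datetime.strptime(end_date, DATE_FORMAT).date() if end_date else None
    if start and end and start > end:
        raise ValueError(f"start_date {start_date} is after end_date {end_date}")
    return start, end


print(parse_date_range("2024-01-01", "2024-03-31"))
# (datetime.date(2024, 1, 1), datetime.date(2024, 3, 31))
```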

WORLD_BANK_DDH_DATASETS = {'flood_exposure': 'DR0095355', 'nighttimelights': 'DR0095356', 'population': 'DR0095354', 'urbanization': 'DR0095357'}

WORLD_BANK_DDH_BASE_URL = 'https://datacatalogapi.worldbank.org/ddhxext/v3/resources'

get_adm2_summaries(dataset: Literal['urbanization', 'nighttimelights', 'population', 'flood_exposure'], iso3_filter: str | None = None, verbose: bool = True) → DataFrame

Retrieve ADM2 summaries from the World Bank DDH API.

Parameters:
  • dataset (Literal["urbanization", "nighttimelights", "population", "flood_exposure"]) –

    The dataset to retrieve
    • "urbanization": Urban and rural settlement data

    • "nighttimelights": Nighttime lights intensity data

    • "population": Population statistics

    • "flood_exposure": Flood exposure risk data

  • iso3_filter (Optional[str]) – ISO3 country code to filter by (e.g., ‘AND’ for Andorra, ‘USA’ for United States)

  • verbose (bool) – Whether to display progress messages (default: True)

Returns:

A DataFrame containing ADM2-level statistics records

Return type:

DataFrame

Raises:
  • ValueError – If an invalid dataset is specified

  • Exception – If the API request fails
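
The ValueError case can be pictured as a lookup in the WORLD_BANK_DDH_DATASETS mapping listed above. The sketch below copies that mapping and shows one way such a check could work; resolve_dataset_id is a hypothetical function, and the client's actual validation may differ.

```python
# Copied from the class attribute documented above.
WORLD_BANK_DDH_DATASETS = {
    "flood_exposure": "DR0095355",
    "nighttimelights": "DR0095356",
    "population": "DR0095354",
    "urbanization": "DR0095357",
}


def resolve_dataset_id(dataset: str) -> str:
    """Map a dataset name to its DDH resource ID, or raise ValueError."""
    try:
        return WORLD_BANK_DDH_DATASETS[dataset]
    except KeyError:
        valid = ", ".join(sorted(WORLD_BANK_DDH_DATASETS))
        raise ValueError(
            f"Invalid dataset {dataset!r}; expected one of: {valid}"
        ) from None


print(resolve_dataset_id("population"))  # DR0095354
```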

get_adm2_dataset_info() → DataFrame

Get information about available ADM2 datasets.

Returns:

A DataFrame with information about each available ADM2 dataset

Return type:

DataFrame

Quick Start

Install the package using pip:

pip install space2stats-client

Use the client to access the Space2Stats API:

from space2stats_client import Space2StatsClient
import geopandas as gpd

# Initialize the client
client = Space2StatsClient()

# Get available topics/datasets
topics = client.get_topics()
print(topics)

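# Get the variable descriptions for a specific dataset (item)
properties = client.get_properties("dataset_id")
print(properties)

# Get all available fields (get_fields takes no arguments)
fields = client.get_fields()
print(fields)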

# Get data for an area of interest
gdf = gpd.read_file("path/to/your/area.geojson")
summary = client.get_summary(
    gdf=gdf,
    spatial_join_method="centroid",
    fields=["population", "gdp"]
)

# Get aggregated statistics
aggregates = client.get_aggregate(
    gdf=gdf,
    spatial_join_method="centroid",
    fields=["population", "gdp"],
    aggregation_type="sum"
)

Notebook Example

The following example demonstrates how to use the Space2Stats client in a Jupyter notebook: Flood Exposure Notebook