Python Client
A Python client for accessing the Space2Stats API, providing easy access to consistent, comparable, and authoritative sub-national variation data from the World Bank.
- class space2stats_client.Space2StatsClient(base_url: str = 'https://space2stats.ds.io', verify_ssl: bool = True)
Bases:
object
Client for interacting with the Space2Stats API.
This client is provided by the World Bank to access spatial statistics data. The World Bank makes no warranties regarding the accuracy, reliability or completeness of the results and content.
- get_topics() -> DataFrame
Get a table of items (dataset themes/topics) from the STAC catalog.
- get_properties(item_id: str) -> Dict
Get a table with a description of variables for a given dataset (item).
- get_fields() -> List[str]
Get a list of all available fields from the Space2Stats API.
- Returns:
A list of field names that can be used with the API.
- Return type:
List[str]
- Raises:
Exception – If the API request fails.
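Because `get_fields()` returns the field names the API accepts, it can be used to check a request before sending it. The following is a minimal sketch and not part of the package: `missing_fields` and `require_fields` are hypothetical helpers, and `client` is assumed to be any object that exposes `get_fields()` as documented above.

```python
from typing import Iterable, List


def missing_fields(requested: Iterable[str], available: Iterable[str]) -> List[str]:
    """Return the requested names that are absent from `available`, in request order."""
    known = set(available)
    return [name for name in requested if name not in known]


def require_fields(client, requested: List[str]) -> List[str]:
    """Raise ValueError if any requested field is not reported by the API.

    `client` is assumed to expose get_fields() -> List[str].
    """
    missing = missing_fields(requested, client.get_fields())
    if missing:
        raise ValueError(f"Unknown fields: {missing}")
    return requested
```

Calling `require_fields(client, my_fields)` before `get_summary` or `get_aggregate` turns a misspelled field name into a local error instead of a failed request.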
- fetch_admin_boundaries(iso3: str, adm: str) -> GeoDataFrame
Fetch administrative boundaries from the GeoBoundaries API.
- get_summary(gdf: GeoDataFrame, spatial_join_method: Literal['touches', 'centroid', 'within'], fields: List[str], geometry: Literal['polygon', 'point'] | None = None, verbose: bool = True) -> DataFrame
Extract H3-level data from Space2Stats for a GeoDataFrame.
- Parameters:
gdf (GeoDataFrame) – The areas of interest (AOIs)
spatial_join_method (Literal["touches", "centroid", "within"]) – The method to use for performing the spatial join between the AOI and H3 cells:
- "touches": Includes H3 cells that touch the AOI
- "centroid": Includes H3 cells where the centroid falls within the AOI
- "within": Includes H3 cells entirely within the AOI
fields (List[str]) – A list of field names to retrieve from the statistics table.
geometry (Optional[Literal["polygon", "point"]]) – Specifies if the H3 geometries should be included in the response.
verbose (bool) – Whether to display progress messages (default: True)
- Returns:
A DataFrame with the requested fields for each H3 cell.
- Return type:
DataFrame
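The three `spatial_join_method` values select different sets of cells along the AOI boundary. The sketch below illustrates the difference locally, under simplifying assumptions: axis-aligned rectangles stand in for both the AOI and the H3 hexagons, "touches" is interpreted as any overlap or shared boundary, and the `Box` type and `select_cells` function are hypothetical. The actual join is performed by the API.

```python
from typing import Dict, List, NamedTuple


class Box(NamedTuple):
    """Axis-aligned rectangle used as a stand-in for an AOI or an H3 cell."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def select_cells(aoi: Box, cells: Dict[str, Box], method: str) -> List[str]:
    """Return the IDs of the cells selected by the given join method."""
    selected = []
    for cell_id, c in cells.items():
        if method == "touches":
            # Any overlap or shared edge between the cell and the AOI.
            hit = (c.xmin <= aoi.xmax and c.xmax >= aoi.xmin
                   and c.ymin <= aoi.ymax and c.ymax >= aoi.ymin)
        elif method == "centroid":
            # The cell's centre point lies inside the AOI.
            cx, cy = (c.xmin + c.xmax) / 2, (c.ymin + c.ymax) / 2
            hit = aoi.xmin <= cx <= aoi.xmax and aoi.ymin <= cy <= aoi.ymax
        elif method == "within":
            # The whole cell lies inside the AOI.
            hit = (c.xmin >= aoi.xmin and c.xmax <= aoi.xmax
                   and c.ymin >= aoi.ymin and c.ymax <= aoi.ymax)
        else:
            raise ValueError(f"Unknown spatial_join_method: {method!r}")
        if hit:
            selected.append(cell_id)
    return selected


AOI = Box(0, 0, 10, 10)
CELLS = {
    "inside": Box(2, 2, 4, 4),
    "mostly_inside": Box(7, 7, 11, 11),    # centroid (9, 9) is inside the AOI
    "mostly_outside": Box(9, 9, 13, 13),   # centroid (11, 11) is outside the AOI
    "outside": Box(20, 20, 22, 22),
}

for method in ("touches", "centroid", "within"):
    print(method, select_cells(AOI, CELLS, method))
```

In this toy setup "touches" returns the largest set, "within" the smallest, and "centroid" falls in between, which is why "centroid" is a common choice when adjacent AOIs should not share cells.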
- get_aggregate(gdf: GeoDataFrame, spatial_join_method: Literal['touches', 'centroid', 'within'], fields: list, aggregation_type: Literal['sum', 'avg', 'count', 'max', 'min'], verbose: bool = True) -> DataFrame
Extract summary statistics from the underlying H3 Space2Stats data.
- Parameters:
gdf (GeoDataFrame) – The Areas of Interest
spatial_join_method (["touches", "centroid", "within"]) – The method to use for performing the spatial join
fields (List[str]) – A list of field names to retrieve
aggregation_type (["sum", "avg", "count", "max", "min"]) – Statistical function to apply to each field per AOI.
verbose (bool) – Whether to display progress messages (default: True)
- Returns:
A DataFrame with the aggregated statistics.
- Return type:
DataFrame
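To make the `aggregation_type` options concrete, the sketch below applies each of them to a handful of per-cell records for a single AOI. It is a local illustration only: the aggregation itself is performed by the API, `aggregate_cells` is a hypothetical function, the field name `pop` is made up, and how the API treats missing values is not covered by this reference.

```python
from typing import Callable, Dict, List, Sequence

# One reducer per documented aggregation_type value.
AGGREGATORS: Dict[str, Callable[[Sequence[float]], float]] = {
    "sum": sum,
    "avg": lambda values: sum(values) / len(values),
    "count": len,
    "max": max,
    "min": min,
}


def aggregate_cells(
    rows: List[Dict[str, float]], fields: List[str], aggregation_type: str
) -> Dict[str, float]:
    """Reduce per-cell rows to one value per field.

    Assumes `rows` is non-empty and every row contains every requested field.
    """
    if aggregation_type not in AGGREGATORS:
        raise ValueError(f"Unknown aggregation_type: {aggregation_type!r}")
    reducer = AGGREGATORS[aggregation_type]
    return {field: reducer([row[field] for row in rows]) for field in fields}


# Three hypothetical H3 cells belonging to one AOI.
cells = [{"pop": 10.0}, {"pop": 30.0}, {"pop": 20.0}]
print(aggregate_cells(cells, ["pop"], "sum"))
print(aggregate_cells(cells, ["pop"], "avg"))
```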
- get_summary_by_hexids(hex_ids: List[str], fields: List[str], geometry: Literal['polygon', 'point'] | None = None, verbose: bool = True) -> DataFrame
Retrieve statistics for specific hex IDs.
- Parameters:
hex_ids (List[str]) – List of H3 hexagon IDs to query
fields (List[str]) – List of field names to retrieve from the statistics table
geometry (Optional[Literal["polygon", "point"]]) – Specifies if the H3 geometries should be included in the response.
verbose (bool) – Whether to display progress messages (default: True)
- Returns:
A DataFrame with the requested fields for each H3 cell.
- Return type:
DataFrame
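When hex IDs come from user input or another dataset, it can help to tidy them before querying. The sketch below is one possible approach and rests on assumptions: H3 cell IDs are taken to be 15-character hexadecimal strings, lowercase is assumed to be the form the API expects, and `clean_hex_ids` and `summary_for_cells` are hypothetical helpers. `client` is any object exposing `get_summary_by_hexids` as documented above.

```python
import re
from typing import Iterable, List

# Shape check only: 15 hexadecimal characters (assumed H3 cell ID format).
_H3_SHAPE = re.compile(r"[0-9a-f]{15}")


def clean_hex_ids(hex_ids: Iterable[str]) -> List[str]:
    """Strip whitespace, lowercase, validate shape, and drop duplicates in order."""
    seen = set()
    cleaned = []
    for raw in hex_ids:
        hex_id = raw.strip().lower()
        if not _H3_SHAPE.fullmatch(hex_id):
            raise ValueError(f"Not a plausible H3 cell ID: {raw!r}")
        if hex_id not in seen:
            seen.add(hex_id)
            cleaned.append(hex_id)
    return cleaned


def summary_for_cells(client, hex_ids: Iterable[str], fields: List[str]):
    """Query statistics for the cleaned hex IDs without geometries."""
    return client.get_summary_by_hexids(
        hex_ids=clean_hex_ids(hex_ids),
        fields=fields,
        geometry=None,
        verbose=False,
    )
```

The shape check does not confirm that a cell exists in the Space2Stats tables; it only catches obviously malformed input early.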
- get_aggregate_by_hexids(hex_ids: List[str], fields: List[str], aggregation_type: Literal['sum', 'avg', 'count', 'max', 'min'], verbose: bool = True) -> DataFrame
Aggregate statistics for specific hex IDs.
- Parameters:
hex_ids (List[str]) – List of H3 hexagon IDs to aggregate
fields (List[str]) – List of field names to aggregate
aggregation_type (Literal["sum", "avg", "count", "max", "min"]) – Type of aggregation to perform
verbose (bool) – Whether to display progress messages (default: True)
- Returns:
A DataFrame with the aggregated statistics.
- Return type:
DataFrame
- get_timeseries_fields() -> List[str]
Get available fields from the timeseries table.
- Returns:
List of field names available in the timeseries table
- Return type:
List[str]
- Raises:
Exception – If the API request fails
- get_timeseries(gdf: GeoDataFrame, spatial_join_method: Literal['touches', 'centroid', 'within'], start_date: str | None = None, end_date: str | None = None, fields: List[str] | None = None, geometry: Literal['polygon', 'point'] | None = None, verbose: bool = True) -> DataFrame
Get timeseries data for areas of interest.
- Parameters:
gdf (GeoDataFrame) – The areas of interest (AOIs)
spatial_join_method (Literal["touches", "centroid", "within"]) – The method to use for performing the spatial join between the AOI and H3 cells:
- "touches": Includes H3 cells that touch the AOI
- "centroid": Includes H3 cells where the centroid falls within the AOI
- "within": Includes H3 cells entirely within the AOI
start_date (Optional[str]) – Start date for filtering data (format: ‘YYYY-MM-DD’)
end_date (Optional[str]) – End date for filtering data (format: ‘YYYY-MM-DD’)
fields (Optional[List[str]]) – List of fields to retrieve. If None, all available fields will be returned.
geometry (Optional[Literal["polygon", "point"]]) – Specifies if the H3 geometries should be included in the response.
verbose (bool) – Whether to display progress messages (default: True)
- Returns:
A DataFrame containing timeseries data for each hex ID and date
- Return type:
DataFrame
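Since `start_date` and `end_date` are passed as plain strings, a malformed date is easy to send by accident. The sketch below checks the documented ‘YYYY-MM-DD’ format before a call. `check_date_range` is a hypothetical helper, and the rule that the start must not come after the end is an assumption rather than documented API behaviour.

```python
from datetime import datetime
from typing import Optional, Tuple


def check_date_range(
    start_date: Optional[str], end_date: Optional[str]
) -> Tuple[Optional[str], Optional[str]]:
    """Validate optional 'YYYY-MM-DD' strings and return them unchanged."""
    parsed = []
    for label, value in (("start_date", start_date), ("end_date", end_date)):
        if value is None:
            parsed.append(None)
            continue
        try:
            day = datetime.strptime(value, "%Y-%m-%d").date()
        except ValueError as exc:
            raise ValueError(f"{label} must be YYYY-MM-DD, got {value!r}") from exc
        # strptime also accepts unpadded values such as '2020-1-5'; reject those.
        if day.isoformat() != value:
            raise ValueError(f"{label} must be zero-padded YYYY-MM-DD, got {value!r}")
        parsed.append(day)
    start, end = parsed
    if start is not None and end is not None and start > end:
        raise ValueError("start_date must not be later than end_date")
    return start_date, end_date
```

A typical use would be `start, end = check_date_range("2020-01-01", "2020-12-31")`, followed by passing both values to `get_timeseries` or `get_timeseries_by_hexids`.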
- get_timeseries_by_hexids(hex_ids: List[str], fields: List[str], start_date: str | None = None, end_date: str | None = None, geometry: Literal['polygon', 'point'] | None = None, verbose: bool = True) -> DataFrame
Get timeseries data for specific hex IDs.
- Parameters:
hex_ids (List[str]) – List of H3 hexagon IDs to query
fields (List[str]) – List of fields to retrieve from the timeseries table
start_date (Optional[str]) – Start date for filtering data (format: ‘YYYY-MM-DD’)
end_date (Optional[str]) – End date for filtering data (format: ‘YYYY-MM-DD’)
geometry (Optional[Literal["polygon", "point"]]) – Specifies if the H3 geometries should be included in the response.
verbose (bool) – Whether to display progress messages (default: True)
- Returns:
A DataFrame containing timeseries data for each hex ID and date
- Return type:
DataFrame
- WORLD_BANK_DDH_DATASETS = {'flood_exposure': 'DR0095355', 'nighttimelights': 'DR0095356', 'population': 'DR0095354', 'urbanization': 'DR0095357'}
- WORLD_BANK_DDH_BASE_URL = 'https://datacatalogapi.worldbank.org/ddhxext/v3/resources'
- get_adm2_summaries(dataset: Literal['urbanization', 'nighttimelights', 'population', 'flood_exposure'], iso3_filter: str | None = None, verbose: bool = True) -> DataFrame
Retrieve ADM2 summaries from the World Bank DDH API.
- Parameters:
dataset (Literal["urbanization", "nighttimelights", "population", "flood_exposure"]) – The dataset to retrieve:
- "urbanization": Urban and rural settlement data
- "nighttimelights": Nighttime lights intensity data
- "population": Population statistics
- "flood_exposure": Flood exposure risk data
iso3_filter (Optional[str]) – ISO3 country code to filter by (e.g., ‘AND’ for Andorra, ‘USA’ for United States)
verbose (bool) – Whether to display progress messages (default: True)
- Returns:
A DataFrame containing ADM2-level statistics records
- Return type:
DataFrame
- Raises:
ValueError – If an invalid dataset is specified
Exception – If the API request fails
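The documented `ValueError` for an invalid dataset can be reproduced locally with a lookup against the `WORLD_BANK_DDH_DATASETS` mapping listed above. This is a sketch of the idea only: `resolve_ddh_id` is a hypothetical helper, and the client's actual validation code is not shown in this reference.

```python
from typing import Dict

# Copied from the WORLD_BANK_DDH_DATASETS class attribute documented above.
WORLD_BANK_DDH_DATASETS: Dict[str, str] = {
    "flood_exposure": "DR0095355",
    "nighttimelights": "DR0095356",
    "population": "DR0095354",
    "urbanization": "DR0095357",
}


def resolve_ddh_id(dataset: str) -> str:
    """Map a dataset name to its DDH identifier, or raise ValueError."""
    try:
        return WORLD_BANK_DDH_DATASETS[dataset]
    except KeyError:
        valid = ", ".join(sorted(WORLD_BANK_DDH_DATASETS))
        raise ValueError(
            f"Invalid dataset {dataset!r}; expected one of: {valid}"
        ) from None


print(resolve_ddh_id("population"))
```

With a client instance, the corresponding call would be `client.get_adm2_summaries(dataset="population", iso3_filter="AND")`, using the ISO3 example given in the parameter description.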
- get_adm2_dataset_info() -> DataFrame
Get information about available ADM2 datasets.
- Returns:
A DataFrame with information about each available ADM2 dataset
- Return type:
DataFrame
Quick Start
Install the package using pip:
pip install space2stats-client
Use the client to access the Space2Stats API:
from space2stats_client import Space2StatsClient
import geopandas as gpd
# Initialize the client
client = Space2StatsClient()
# Get available topics/datasets
topics = client.get_topics()
print(topics)
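# Describe the variables in a specific dataset (item)
properties = client.get_properties("dataset_id")
print(properties)
# List every field name the API accepts (get_fields takes no arguments)
fields = client.get_fields()
print(fields)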
# Get data for an area of interest
gdf = gpd.read_file("path/to/your/area.geojson")
summary = client.get_summary(
    gdf=gdf,
    spatial_join_method="centroid",
    fields=["population", "gdp"],
)
# Get aggregated statistics
aggregates = client.get_aggregate(
    gdf=gdf,
    spatial_join_method="centroid",
    fields=["population", "gdp"],
    aggregation_type="sum",
)
Notebook Example
The following example demonstrates how to use the Space2Stats client in a Jupyter notebook: Flood Exposure Notebook