Usage#

Creating a ZonalStats object#

ZonalStats() is the main class to request temporal and zonal statistics using the GEE backend. The object can be initialized with parameters specifying data inputs and the type of aggregation.

class gee_zonal.zonalstats.ZonalStats(target_features, statistic_type, collection_id=None, ee_dataset=None, band=None, output_name=None, output_dir=None, frequency='original', temporal_stat=None, scale=250, min_threshold=None, mask=None, tile_scale=1, start_year=None, end_year=None, scale_factor=None, mapped=False)#

Python class to calculate zonal and temporal statistics from Earth Engine datasets (ee.ImageCollection or ee.Image) over vector shapes (ee.FeatureCollections).

Parameters

target_features (ee.FeatureCollection, gpd.GeoDataFrame, or str path to a shapefile/GeoJSON) – vector features
statistic_type (str - mean, max, median, min, sum, stddev, var, count, minmax, p75, p25, p95, all) – method to aggregate image pixels by zone
collection_id (str, default: None) – ID for Earth Engine dataset
ee_dataset (ee.Image or ee.ImageCollection, default: None) – input dataset if no collection ID is provided
band (str, default: None) – name of image band to use
output_name (str, default: None) – file name for output statistics if saved to Google Drive
output_dir (str, default: None) – directory name for output statistics if saved to Google Drive
frequency (str (monthly, annual, or original), default: original) – temporal frequency for aggregation
temporal_stat (str (mean, max, median, min, sum), default: None) – statistic for temporal aggregation
scale (int, default: 250) – scale for calculation in meters
min_threshold (int, default: None) – filter out values lower than threshold
mask (ee.Image, default: None) – filter out observations where mask is zero
tile_scale (int, default: 1) – tile scale factor for parallel processing
start_year (int, default: None) – specify start year for statistics
end_year (int, default: None) – specify end year for statistics
scale_factor (int, default: None) – scale factor to multiply ee.Image to get correct units
mapped (bool, default: False) – Boolean to indicate whether to use mapped or non-mapped version of zonal stats
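
To make the interplay of frequency, temporal_stat, and statistic_type concrete, the following is a minimal pure-Python sketch of the idea and not the library's implementation. It assumes that temporal aggregation is applied per pixel first and the zonal statistic second; the toy observations, the function name temporal_then_zonal, and the tuple layout are all hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy data: (zone, pixel_id, year, value) observations.
observations = [
    ("zone_a", 0, 2019, 0.25), ("zone_a", 0, 2019, 0.75),
    ("zone_a", 1, 2019, 0.5), ("zone_a", 1, 2019, 1.0),
    ("zone_b", 2, 2019, 0.125), ("zone_b", 2, 2019, 0.375),
]


def temporal_then_zonal(obs, temporal_stat=mean, zonal_stat=mean):
    """Illustrative only: annual temporal aggregation, then a zonal statistic."""
    # Step 1 (like frequency="annual", temporal_stat="mean"):
    # collapse each pixel's observations within a year to one value.
    per_pixel = defaultdict(list)
    for zone, pixel, year, value in obs:
        per_pixel[(zone, pixel, year)].append(value)

    # Step 2 (like statistic_type="mean"):
    # aggregate the per-pixel values that fall inside each zone.
    per_zone = defaultdict(list)
    for (zone, _pixel, year), values in per_pixel.items():
        per_zone[(zone, year)].append(temporal_stat(values))

    return {key: zonal_stat(values) for key, values in per_zone.items()}


result = temporal_then_zonal(observations)
print(result)
```

Passing max or min for either argument shows how the two statistics act independently: one reduces over time, the other over space.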

Input target features can be referenced directly as a GEE asset, or can be supplied as a geopandas.GeoDataFrame, or a path to a shapefile/GeoJSON (will be automatically converted to ee.FeatureCollection).

ZonalStats.runZonalStats()#

Run zonal statistics aggregation

Returns: tabular statistics
Return type: DataFrame, or a dict with the EE task status if output_name/output_dir is specified
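
Because the return type depends on how the object was configured, calling code may want to branch on it. The helper below is a hypothetical sketch, not part of gee_zonal; it assumes the task-status dict carries a "state" key, as Earth Engine task status dicts commonly do, and it treats anything else as a table with a length.

```python
def describe_result(result):
    """Hypothetical helper: tell a finished table apart from a task-status dict."""
    if isinstance(result, dict):
        # Assumption: the status dict exposes a "state" entry.
        return "task submitted; status: " + str(result.get("state", "unknown"))
    # Otherwise assume a table-like object (e.g. a DataFrame) that supports len().
    return "table with {} rows".format(len(result))


print(describe_result({"state": "READY"}))
print(describe_result([1, 2, 3]))  # a list stands in for a DataFrame here
```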

Retrieving output table#

1. Retrieve output table directly#

Statistics can be accessed as the result of ZonalStats.runZonalStats(). The result is computed within the Python Earth Engine environment.

import ee  # needed for ee.FeatureCollection below
from gee_zonal import ZonalStats
AOIs = ee.FeatureCollection('users/afche18/Ethiopia_AOI') # ID of ee.FeatureCollection
zs = ZonalStats(
  collection_id = 'LANDSAT/LC08/C01/T1_8DAY_NDVI',
  target_features = AOIs,
  statistic_type = "all", # all includes min, max, mean, and stddev
  frequency = "annual",
  temporal_stat = "mean"
)
df = zs.runZonalStats()
df

2. Submit an EE Task#

Alternatively, a task can be submitted to the Earth Engine servers by specifying an output_name and output_dir.

This option is recommended when running statistics for large areas or for a large number of collections. The output table will be saved in the specified directory in Google Drive.

import ee
from gee_zonal import ZonalStats
# AOIs is the ee.FeatureCollection defined in the previous example
zs = ZonalStats(
  collection_id = 'LANDSAT/LC08/C01/T1_8DAY_NDVI',
  target_features = AOIs,
  statistic_type = "all", # all includes min, max, mean, and stddev
  frequency = "annual",
  temporal_stat = "mean",
  output_dir = "pretty_folder",
  output_name= "pretty_file"
)
zs.runZonalStats()

The status of the task can be monitored with ZonalStats.reportRunTime():

>>> zs.reportRunTime()
Completed
Runtime: 0 minutes and 2 seconds

Searching the EE catalog#

The Earth Engine Data Catalog is an archive of public datasets available via Google Earth Engine. The Catalog() class provides a quick way to search for datasets by tags, title, and year / time period.

Initialize Catalog Object#

The catalog object contains a datasets variable, a DataFrame containing a copy of the Earth Engine data catalog.

from gee_zonal import Catalog
cat = Catalog()
cat.datasets

Search functions#

results = cat.search_tags("ndvi")
results = results.search_by_period(1985, 2021)
results = results.search_title("landsat")
print(results)

class gee_zonal.catalog.Catalog(datasets=None, redownload=False)#

Inventory of public Earth Engine datasets, saved as a DataFrame under the datasets variable

search_by_period(start, end)#

Get all datasets that intersect a time period: start of dataset <= end year and end of dataset >= start year

search_by_year(year)#

Get all datasets from a particular year: dataset start <= year <= dataset end
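
The two date conditions above can be written out as plain predicates. This is a sketch of the stated inequalities only; the function names intersects_period and covers_year are hypothetical and are not part of the Catalog class.

```python
def intersects_period(ds_start, ds_end, start, end):
    """True when the dataset's [ds_start, ds_end] range overlaps [start, end]."""
    return ds_start <= end and ds_end >= start


def covers_year(ds_start, ds_end, year):
    """True when the given year falls inside the dataset's range."""
    return ds_start <= year <= ds_end


# A dataset spanning 2013-2020 overlaps 1985-2021 but not 1985-2000.
print(intersects_period(2013, 2020, 1985, 2021))
print(intersects_period(2013, 2020, 1985, 2000))
print(covers_year(2013, 2020, 2015))
```

A single-year search is the special case of the period test where start and end are equal.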

search_tags(keyword)#: search for keyword in tags

search_title(keyword)#: search for keyword in title
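
Since each search call in the example above is chained on the previous result, the pattern can be illustrated with a small in-memory stand-in. The class ToyCatalog, its list-of-dicts storage, the made-up dataset entries, and the case-insensitive substring matching are all assumptions for illustration; the real Catalog stores a DataFrame and may match differently.

```python
class ToyCatalog:
    """Hypothetical stand-in for Catalog, backed by a list of dicts."""

    def __init__(self, datasets):
        self.datasets = datasets

    def _filter(self, field, keyword):
        # Assumption: case-insensitive substring match on the chosen field.
        keyword = keyword.lower()
        kept = [d for d in self.datasets if keyword in d[field].lower()]
        # Return a new catalog so that searches can be chained.
        return ToyCatalog(kept)

    def search_tags(self, keyword):
        return self._filter("tags", keyword)

    def search_title(self, keyword):
        return self._filter("title", keyword)


# Made-up entries, not real catalog records.
toy = ToyCatalog([
    {"title": "Landsat 8 NDVI composite", "tags": "landsat, ndvi, vegetation"},
    {"title": "MODIS NDVI", "tags": "modis, ndvi"},
    {"title": "Landsat 8 surface reflectance", "tags": "landsat, reflectance"},
])
hits = toy.search_tags("ndvi").search_title("landsat")
print([d["title"] for d in hits.datasets])
```

Returning a new object from every search keeps the original catalog untouched, so several independent queries can start from the same instance.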
