Usage#
Creating a ZonalStats object#
ZonalStats() is the main class to request temporal and zonal statistics using the GEE backend. The object can be initialized with parameters specifying data inputs and the type of aggregation.
- class gee_zonal.zonalstats.ZonalStats(target_features, statistic_type, collection_id=None, ee_dataset=None, band=None, output_name=None, output_dir=None, frequency='original', temporal_stat=None, scale=250, min_threshold=None, mask=None, tile_scale=1, start_year=None, end_year=None, scale_factor=None, mapped=False)#
Python class to calculate zonal and temporal statistics from Earth Engine datasets (ee.ImageCollection or ee.Image) over vector shapes (ee.FeatureCollections).
- Parameters
target_features (ee.FeatureCollection, gpd.GeoDataFrame, or str path to a shapefile/GeoJSON) – vector features
statistic_type (str - mean, max, median, min, sum, stddev, var, count, minmax, p75, p25, p95, all) – method to aggregate image pixels by zone
collection_id (str, default: None) – ID for Earth Engine dataset
ee_dataset (ee.Image or ee.ImageCollection, default: None) – input dataset if no collection ID is provided
band (str, default: None) – name of image band to use
output_name (str, default: None) – file name for output statistics if saved to Google Drive
output_dir (str, default: None) – directory name for output statistics if saved to Google Drive
frequency (str (monthly, annual, or original), default: original) – temporal frequency for aggregation
temporal_stat (str (mean, max, median, min, sum), default: None) – statistic for temporal aggregation
scale (int, default: 250) – scale for calculation, in meters
min_threshold (int, default: None) – filter out values lower than the threshold
mask (ee.Image, default: None) – filter out observations where mask is zero
tile_scale (int, default: 1) – tile scale factor for parallel processing
start_year (int, default: None) – specify start year for statistics
end_year (int, default: None) – specify end year for statistics
scale_factor (int, default: None) – scale factor to multiply ee.Image to get correct units
mapped (bool, default: False) – Boolean to indicate whether to use mapped or non-mapped version of zonal stats
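To make the frequency and temporal_stat parameters concrete, the following is a conceptual sketch in plain Python of what an annual frequency combined with a mean temporal statistic does to a series of dated values. It is not the library's implementation, which runs on the Earth Engine servers; the function name aggregate_annual and the sample observations are hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def aggregate_annual(observations, reducer=mean):
    """Group (date, value) pairs by calendar year and reduce each group.

    Hypothetical helper for illustration; not part of gee_zonal.
    """
    by_year = defaultdict(list)
    for obs_date, value in observations:
        by_year[obs_date.year].append(value)
    # One reduced value per year, in chronological order.
    return {year: reducer(values) for year, values in sorted(by_year.items())}


# Made-up observations standing in for per-image zonal values.
observations = [
    (date(2019, 1, 1), 0.25),
    (date(2019, 7, 1), 0.75),
    (date(2020, 1, 1), 0.5),
    (date(2020, 7, 1), 1.0),
]

annual_means = aggregate_annual(observations)
print(annual_means)
```

Passing a different reducer, such as max or sum, corresponds to choosing another temporal_stat value.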
Input target features can be referenced directly as a GEE asset, or can be supplied as a geopandas.GeoDataFrame or a path to a shapefile/GeoJSON (which will be automatically converted to ee.FeatureCollection).
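As a rough sketch of the file-path option, the helper below passes a path string straight to ZonalStats. The function name run_from_file and the example path are hypothetical; the keyword arguments are those listed in the class signature, and the collection ID is the one used in the examples in this document. It assumes gee_zonal is installed and an Earth Engine session is already authenticated.

```python
def run_from_file(path, collection_id="LANDSAT/LC08/C01/T1_8DAY_NDVI"):
    """Run annual mean zonal statistics for the zones stored in a vector file.

    Hypothetical helper for illustration; not part of gee_zonal.
    """
    # Imported lazily so this module loads even where gee_zonal is not installed.
    from gee_zonal import ZonalStats

    zs = ZonalStats(
        target_features=path,  # str path to a shapefile/GeoJSON
        collection_id=collection_id,
        statistic_type="mean",
        frequency="annual",
        temporal_stat="mean",
    )
    return zs.runZonalStats()


# Example call (hypothetical path; needs an authenticated Earth Engine session):
# df = run_from_file("data/zones.geojson")
```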
- ZonalStats.runZonalStats()#
Run zonal statistics aggregation
- Returns
tabular statistics
- Return type
DataFrame, or a dict with the EE task status if output_name/output_dir is specified
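Because the return type depends on whether an output location was given, calling code may want to branch on it. The sketch below shows one way to do that. The helper name summarize_result is hypothetical, the contents of the task-status dict are not documented here, and the table branch only assumes an object with a pandas-style shape attribute.

```python
def summarize_result(result):
    """Describe what runZonalStats() returned.

    Hypothetical helper for illustration; not part of gee_zonal.
    """
    if isinstance(result, dict):
        # Task mode: a dict describing the submitted Earth Engine task.
        return f"task submitted; status keys: {sorted(result)}"
    # Direct mode: assume a table-like object exposing .shape, as a DataFrame does.
    rows, cols = result.shape
    return f"table with {rows} rows and {cols} columns"
```

With the objects from the examples in this document, summarize_result(zs.runZonalStats()) would report either the table size or the keys of the task-status dict.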
Retrieving output table#
1. Retrieve output table directly#
Statistics can be accessed as the result of ZonalStats.runZonalStats().
The result is computed within the Python Earth Engine environment.
import ee
from gee_zonal import ZonalStats

AOIs = ee.FeatureCollection('users/afche18/Ethiopia_AOI')  # ID of ee.FeatureCollection
zs = ZonalStats(
    collection_id='LANDSAT/LC08/C01/T1_8DAY_NDVI',
    target_features=AOIs,
    statistic_type="all",  # all includes min, max, mean, and stddev
    frequency="annual",
    temporal_stat="mean"
)
df = zs.runZonalStats()
df
2. Submit an EE Task#
Alternatively, a task can be submitted to the Earth Engine servers by specifying an output_name and output_dir.
This option is recommended when running statistics for large areas or for a large number of collections. The output table will be saved to the specified directory in Google Drive.
import ee
from gee_zonal import ZonalStats

AOIs = ee.FeatureCollection('users/afche18/Ethiopia_AOI')  # ID of ee.FeatureCollection
zs = ZonalStats(
    collection_id='LANDSAT/LC08/C01/T1_8DAY_NDVI',
    target_features=AOIs,
    statistic_type="all",  # all includes min, max, mean, and stddev
    frequency="annual",
    temporal_stat="mean",
    output_dir="pretty_folder",  # Google Drive directory
    output_name="pretty_file"  # output file name
)
zs.runZonalStats()
The status of the task can be monitored with ZonalStats.reportRunTime():
>>> zs.reportRunTime()
Completed
Runtime: 0 minutes and 2 seconds
Searching the EE catalog#
The Earth Engine Data Catalog is an archive of public datasets available via Google Earth Engine. The Catalog() class provides a quick way to search for datasets by tags, title, and year / time period.
Initialize Catalog Object#
The catalog object contains a datasets variable, a DataFrame containing a copy of the Earth Engine data catalog.
from gee_zonal import Catalog
cat = Catalog()
cat.datasets
Search functions#
results = cat.search_tags("ndvi")
results = results.search_by_period(1985, 2021)
results = results.search_title("landsat")
print(results)
- class gee_zonal.catalog.Catalog(datasets=None, redownload=False)#
Inventory of Earth Engine public datasets, saved as a DataFrame under the datasets variable
- search_by_period(start, end)#
get all datasets that intersect a time period: start of dataset <= end year and end of dataset >= start year
- search_by_year(year)#
get all datasets from a particular year: dataset start <= year <= dataset end
- search_tags(keyword)#
search for keyword in tags
- search_title(keyword)#
search for keyword in title
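The two date conditions above can be written out as plain predicates. This is a minimal sketch of the stated inequalities only, not the Catalog implementation; the function names and the sample records are hypothetical.

```python
def intersects_period(dataset_start, dataset_end, start, end):
    """True if the dataset's year range overlaps the period [start, end]."""
    return dataset_start <= end and dataset_end >= start


def covers_year(dataset_start, dataset_end, year):
    """True if the given year falls within the dataset's year range."""
    return dataset_start <= year <= dataset_end


# Made-up (title, start_year, end_year) records, for illustration only.
records = [
    ("dataset A", 1984, 2012),
    ("dataset B", 2013, 2021),
    ("dataset C", 1970, 1980),
]

matches = [
    title
    for title, first, last in records
    if intersects_period(first, last, 1985, 2021)
]
print(matches)
```

A dataset that only partly overlaps the requested period, such as the first record, still matches, because the test is for intersection rather than containment.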