Open In Colab

4. Cell statistics and basic band math (10 min)

In the previous sub-module, Conditional operations (10 min), we created masks using boolean operations that were applied to individual pixels in the raster file (remember, think cells in a matrix).

In this tutorial we’ll expand on the idea of calculating per-pixel and per-band statistics with some simple mathmatical operations.

We’re going to do this by cleaning an image by subtracting the mean and dividing by the standard deviation.

Our tasks in this exercise:

  1. Clip a scene of the 2017 median VIIRS-DNB to Timor-Leste

  2. Brief overview of scaling

  3. Use the Reduce region to get the mean and standard deviation of the Image

  4. Create a cleaned layer

4.1. Clip a scene of 2017 VIIRS-DNB for Timor-Leste

4.1.1. Get the geometry from Google Earth Engine’s Global Adminitive Unit Layer and clip a composite of of VIIRS-DNB from December 2017 to the geometry

Timor-Leste is a country in the Southeast Asia / Pacific region that shared an island with Indonesia.

# reminder that if you are installing libraries in a Google Colab instance you will be prompted to restart your kernal

try:
    import geemap, ee
except ModuleNotFoundError:
    if 'google.colab' in str(get_ipython()):
        print("package not found, installing w/ pip in Google Colab...")
        !pip install geemap
    else:
        print("package not found, installing w/ conda...")
        !conda install mamba -c conda-forge -y
        !mamba install geemap -c conda-forge -y
    import geemap, ee
try:
        ee.Initialize()
except Exception as e:
        ee.Authenticate()
        ee.Initialize()

# get December image, we're using the "avg_rad" band
viirs2017_12 = ee.ImageCollection("NOAA/VIIRS/DNB/MONTHLY_V1/VCMSLCFG").filterDate("2017-12-01","2017-12-31").select('avg_rad').first()

# get the geometry for Timor-Leste from GEE's tagged datasets
tls = ee.Feature(ee.FeatureCollection("FAO/GAUL/2015/level0").filter(ee.Filter.eq('ADM0_NAME', 'Timor-Leste')).first()).geometry()

# clip our VIIRS image to Timor-Leste
ntl_tls = viirs2017_12.clip(tls)


# initialize our map and center it on Timor-Leste
map1 = geemap.Map()
map1.centerObject(tls, zoom=8)
map1.add_basemap('SATELLITE')
map1.addLayer(ntl_tls, {}, "VIIRS-DNB Dec 2017")
map1.addLayerControl()
map1

4.1.2. A brief overview of scaling

Timor-Leste is a developing nation and relative to its neighbor, has lower-levels of light. As such, background light that the VIIRS instrument is sensitive to can influence interpretation. As we look at this scene, you can see the relatively high levels of “noise” present.

As discussed earlier, one approach to increase the signal / noise ratio would be to reduce data over time.

But if the noise levels persist throughout the time period, that may not reduce the noise much. And what if your analysis is specifically to look at December 2017?

Or what if you’re looking to conduct comparative analysis on these data or use them as inputs for a model for statistical inference?

In this case, you will very likely want to reduce the noise levels in your data in order for your algorithm to learn your data without over-fitting (in other words, a more sensitive model might “learn” the noise…which is generally bad). Additionally, many loss functions are subject to “exploding” or “vanishing” gradients if your data are not close to zero and scaled.

4.1.3. Standardizing the image: zero mean and unit variance

That’s all important for data science and statistical inference, but out of scope for our tutorial here.

In any case, there are a few ways to scale an image or matrix and prepare it for analysis or modeling and one common one is to standardize the data to be “zero-centered” and have unit variance.

This is done simply by subtracting the mean of the entire image from each pixel’s value and then dividing that by the standard deviation of the entire image.

\[X' = \frac{X - \mu}{\sigma}\]

We’ll conduct this statistical operation on our Timor-Leste scene.

4.2. Using the reduceRegion function to get the mean and standard deviation of a region

There is more documention on this here here, but the key argument we use here is:

  • reducer: the function used to reduce data (mean or standard deviation in our case).

mu = ntl_tls.reduceRegion(reducer=ee.Reducer.mean())
std = ntl_tls.reduceRegion(reducer=ee.Reducer.stdDev())

# we'll cast these to native ee Numbers using the ee.Number constructor
mu = ee.Number(mu.get('avg_rad'))
std = ee.Number(std.get('avg_rad'))

# in order to confirm, we have to evaluate this to print client size (i.e. our computer)
print(f"the mean avg radiance (per month) for the pixels in our Timor-Leste region is: {mu.getInfo():.4f}")
print(f"the std dev is: {std.getInfo():.4f}")
the mean avg radiance (per month) for the pixels in our Timor-Leste region is: 0.2980
the std dev is: 0.6565

4.2.1. Now we can create a clean layer by subtracting the mean from our initial image and then dividing by the st. dev.

ntl_tls_std = ntl_tls.subtract(mu).divide(std)

Now we can add it to a map object with a slide panel and compare it to our original layer!

# create a split panel map
left_layer = geemap.ee_tile_layer(ntl_tls, {},'VIIRS-DNB 2017')
right_layer = geemap.ee_tile_layer(ntl_tls_std, {},'VIIRS-DNB 2017 (zero-centered and unit variance)')

map2 = geemap.Map()
map2.centerObject(tls, zoom=8)
map2.add_basemap('SATELLITE')
map2.split_map(left_layer=left_layer, right_layer=right_layer)
map2.addLayerControl()
map2

You can see that although the overal information is reduced, you do have a “cleaner” layer which can aid your analysis or modeling effort.

4.2.2. Band math

Since we’re dealing with nighttime light images, which are single-band (they’re panchromatic images as noted in the first module), we have not had to deal with band selection much.

But one very common math operation with satellite data involves creating normalized differences (such as the Normalized Difference Vegetation Index or NDVI) that, as suggests, gets the difference in value between particular bands (which in this case represent different segments of the visible electromagnetic spectrum).

We’ll talk more about this when we discuss getting image differences.

For now, we’ve learned how to conduct simple mathmatical operations on a single band. We’ve also learned about scaling images and getting simple regional statistics, such as the mean and the standard deviation.

4.3. References: