Assessing Quality of Nighttime Lights Data#

Overview #

The quality of nighttime lights data can be impacted by a number of factors, particularly cloud cover. To facilitate analysis using high quality data, Black Marble (1) marks the quality of each pixel and (2) in some cases of poor quality pixels, Black Marble will use data from a previous date to fill the value—using a temporally-gap filled NTL value. Below we illustrate how to examine the quality of nighttime lights data.

Setup #

We first load packages and obtain a polygon for a region of interest; for this example, we use Switzerland.

Hide code cell source
import os

import colorcet as cc
import contextily as cx
import geopandas
import matplotlib.colors as mcolors
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
from rasterstats import zonal_stats

from blackmarble.extract import bm_extract
from blackmarble.raster import bm_raster, transform

plt.rcParams["figure.figsize"] = (15, 5)
%load_ext autoreload
%autoreload 2
bearer = os.getenv("BLACKMARBLE_TOKEN")
gdf = geopandas.read_file(
    "https://geodata.ucdavis.edu/gadm/gadm4.1/json/gadm41_CHE_0.json"
)
gdf.explore()
Make this Notebook Trusted to load map: File -> Trust Notebook

Daily Data #

Below shows an example examining quality for daily data (VNP46A2).

Nighttime Lights #

We download data for January 1st, 2023. When the variable parameter is not specified, bm_raster creates a raster using the Gap_Filled_DNB_BRDF-Corrected_NTL variable for daily data.

ntl_r = bm_raster(
    gdf,
    product_id="VNP46A2",
    date_range="2023-01-01",
    bearer=bearer,
    variable="Gap_Filled_DNB_BRDF-Corrected_NTL",
)
fig, ax = plt.subplots()

ntl_r["Gap_Filled_DNB_BRDF-Corrected_NTL"].sel(time="2023-01-01").plot(
    robust=True, cmap="cividis"
)
cx.add_basemap(ax, crs=gdf.crs.to_string(), source=cx.providers.CartoDB.Positron)

plt.axis("off")
plt.tight_layout()
../_images/fc8e702cf747bf75ac850c8a86e4d1abf82d26b3b535819a76971daff7f5b828.png

We notice that a number of observations are missing, that are poor quality and are not gap-filled. To understand the extent of missing date, we can use the following code to determine (1) the total number of pixels that cover Switzerland, (2) the total number of non-NA nighttime light pixels, and (3) the proportion of non-NA pixels.

ntl_np = ntl_r["Gap_Filled_DNB_BRDF-Corrected_NTL"].sel(time="2023-01-01")

n_pixel = zonal_stats(
    gdf,
    ntl_np.values,
    affine=transform(ntl_np),
    nodata=np.nan,
    masked=False,
    stats=["count", "nan"],
)
## Number of pixels
n_pixel[0]["count"] + n_pixel[0]["nan"]
280710.0
## Number of non nan pixels
n_pixel[0]["count"]
120565
## Proportion of non nan pixels
n_pixel[0]["count"] / (n_pixel[0]["count"] + n_pixel[0]["nan"])
0.4295001959317445

By default, bm_extract computes the mean; we can easily also compute the number of pixels and number of nan pixels.

date_range = pd.date_range(
    "2023-01-01",
    "2023-01-10",
    freq="D",
)
ntl_df = bm_extract(
    roi=gdf,
    product_id="VNP46A2",
    date_range=date_range,
    bearer=bearer,
    variable="Gap_Filled_DNB_BRDF-Corrected_NTL",
    aggfunc=["mean", "count", "nan"],
)
[2024-02-15 14:08:40 - backoff:105 - INFO] Backing off get_url(...) for 0.7s (httpx.ReadTimeout)
[2024-02-15 14:08:40 - backoff:105 - INFO] Backing off get_url(...) for 0.2s (httpx.ReadTimeout)
ntl_df["prop_non_na_pixels"] = ntl_df["ntl_count"] / (
    ntl_df["ntl_count"] + ntl_df["ntl_nan"]
)
ntl_df
GID_0 COUNTRY geometry ntl_mean ntl_count ntl_nan date prop_non_na_pixels
0 CHE Switzerland MULTIPOLYGON (((7.04510 45.95730, 7.02940 45.9... 1.511901 120565 160145.0 2023-01-01 0.429500
0 CHE Switzerland MULTIPOLYGON (((7.04510 45.95730, 7.02940 45.9... 1.352877 133176 147534.0 2023-01-02 0.474426
0 CHE Switzerland MULTIPOLYGON (((7.04510 45.95730, 7.02940 45.9... 0.699065 9732 270978.0 2023-01-03 0.034669
0 CHE Switzerland MULTIPOLYGON (((7.04510 45.95730, 7.02940 45.9... 1.132924 125537 155173.0 2023-01-04 0.447212
0 CHE Switzerland MULTIPOLYGON (((7.04510 45.95730, 7.02940 45.9... 0.369400 7245 273465.0 2023-01-05 0.025810
0 CHE Switzerland MULTIPOLYGON (((7.04510 45.95730, 7.02940 45.9... 1.200263 124697 156013.0 2023-01-06 0.444220
0 CHE Switzerland MULTIPOLYGON (((7.04510 45.95730, 7.02940 45.9... 1.021223 88498 192212.0 2023-01-07 0.315265
0 CHE Switzerland MULTIPOLYGON (((7.04510 45.95730, 7.02940 45.9... 1.158678 120015 160695.0 2023-01-08 0.427541
0 CHE Switzerland MULTIPOLYGON (((7.04510 45.95730, 7.02940 45.9... 1.643992 25566 255144.0 2023-01-09 0.091076
0 CHE Switzerland MULTIPOLYGON (((7.04510 45.95730, 7.02940 45.9... 1.679899 40913 239797.0 2023-01-10 0.145748

The below figure shows trends in average nighttime lights (left) and the proportion of the country with a value for nighttime lights (right). For some days, low number of pixels corresponds to low nighttime lights (eg, January 3 and 5th); however, for other days, low number of pixels corresponds to higher nighttime lights (eg, January 9 and 10). On January 3 and 5, missing pixels could have been over typically high-lit areas (eg, cities)—while on January 9 and 10, missing pixels could have been over typically lower-lit areas.

Hide code cell source
fig, axs = plt.subplots(
    1, 2, figsize=(12, 5)
)  # 1 row, 2 columns for side-by-side plots

# Left plot (date vs ntl_mean)
axs[0].plot(ntl_df["date"], ntl_df["ntl_mean"], marker="o", linestyle="-")
axs[0].set_title("ntl_mean")
axs[0].set_xlabel("Date")
axs[0].set_ylabel("ntl_mean")
axs[0].tick_params(axis="x", rotation=90)

# Right plot (date vs prop_na)
axs[1].plot(
    ntl_df["date"],
    ntl_df["prop_non_na_pixels"],
    marker="o",
    linestyle="-",
    color="orange",
)
axs[1].set_title("prop_non_na_pixels")
axs[1].set_xlabel("Date")
axs[1].set_ylabel("prop_non_na_pixels")
axs[1].tick_params(axis="x", rotation=90)

# Adjust layout for better spacing
plt.tight_layout()

# Display the plots
plt.show()
../_images/de504de2a3c56a62db8e880dfef13242517804c958a4924aac5e48e7911c58cf.png

Quality #

For daily data, the quality values are:

  • 0: High-quality, Persistent nighttime lights

  • 1: High-quality, Ephemeral nighttime Lights

  • 2: Poor-quality, Outlier, potential cloud contamination, or other issues

  • 255: No retrieval, Fill value (masked out on ingestion)

We can map quality by using the Mandatory_Quality_Flag variable.

quality_r = bm_raster(
    gdf,
    product_id="VNP46A2",
    date_range="2023-01-01",
    bearer=bearer,
    variable="Mandatory_Quality_Flag",
)
Hide code cell source
fig, ax = plt.subplots()

# Create a discrete colormap
cmap = mcolors.ListedColormap(["blue", "green", "yellow", "red"])
# Plot
quality_r["Mandatory_Quality_Flag"].sel(time="2023-01-01").plot(ax=ax, cmap=cmap)
cx.add_basemap(ax, crs=gdf.crs.to_string(), source=cx.providers.CartoDB.Positron)

plt.axis("off")
plt.tight_layout();
../_images/b7befd4af6883fab7d4c3ddfa4fb6c967efcd327aea7381e551c2da6c4dc06e8.png

Nighttime lights for good quality observations #

The quality_flag_rm parameter determines which pixels are set to NA based on the quality indicator. By default, only pixels with a value of 255 are filtered out. However, if we only want data for good quality pixels, we can adjust the quality_flag_rm parameter.

ntl_good_qual_r = bm_raster(
    gdf,
    product_id="VNP46A2",
    date_range="2023-01-01",
    bearer=bearer,
    variable="Gap_Filled_DNB_BRDF-Corrected_NTL",
    quality_flag_rm=[2, 255],
)
Hide code cell source
fig, ax = plt.subplots()

# Plot
ntl_good_qual_r["Gap_Filled_DNB_BRDF-Corrected_NTL"].sel(time="2023-01-01").plot(
    ax=ax, cmap=cc.cm.bmy, robust=True
)
cx.add_basemap(ax, crs=gdf.crs.to_string(), source=cx.providers.CartoDB.Positron)

plt.axis("off")
plt.tight_layout()
../_images/97649ed24e124d912058b60aaa5e3959634103e474b474022545a397fd234aac.png

Nighttime lights for good quality observations, without gap filling #

By default, the bm_raster function uses the Gap_Filled_DNB_BRDF-Corrected_NTL variable for daily data. Gap filling indicates that some poor quality pixels use data from a previous date; the Latest_High_Quality_Retrieval indicates the date the nighttime lights value came from.

ntl_tmp_gap_r = bm_raster(
    gdf,
    product_id="VNP46A2",
    date_range="2023-01-01",
    bearer=bearer,
    variable="Latest_High_Quality_Retrieval",
)
r_np = ntl_tmp_gap_r["Latest_High_Quality_Retrieval"].sel(time="2023-01-01")
r_np = np.where(r_np == 255, np.nan, r_np)

plt.imshow(r_np, cmap=cc.cm.fire)
plt.axis("off")
plt.colorbar(
    label="Temporal\nGap\n(Days)"
)  # Add a colorbar with ticks for discrete values
plt.tight_layout()
../_images/78349f4634c9fc79491cad9fb7b3fea7d6ca7b977984b6246ec768481c50a389.png

Instead of using Gap_Filled_DNB_BRDF-Corrected_NTL, we could ignore gap filled observations—using the DNB_BRDF-Corrected_NTL variable. Here, we also remove poor quality pixels.

ntl_r = bm_raster(
    gdf,
    product_id="VNP46A2",
    date_range="2023-01-01",
    bearer=bearer,
    variable="DNB_BRDF-Corrected_NTL",
    quality_flag_rm=[2, 255],
)
Hide code cell source
fig, ax = plt.subplots()

# Plot
ntl_r["DNB_BRDF-Corrected_NTL"].sel(time="2023-01-01").plot(
    ax=ax, cmap=cc.cm.bmy, robust=True
)
cx.add_basemap(ax, crs=gdf.crs.to_string(), source=cx.providers.CartoDB.Positron)

plt.axis("off")
plt.tight_layout()
../_images/cc72ea75c7e334949cfac24ac72e9f50c30373aa2829cc246188406394c6abcd.png

Monthly/Annual Data #

Below shows an example examining quality for monthly data (VNP46A3). The same approach can be used for annual data (VNP46A4); the variables are the same for both monthly and annual data.

Nighttime Lights #

We download data for January 2023. When the variable parameter is not specified, bm_raster creates a raster using the NearNadir_Composite_Snow_Free variable for monthly and annual data—which is nighttime lights, removing effects from snow cover.

ntl_r = bm_raster(
    gdf,
    product_id="VNP46A3",
    date_range="2023-01-01",
    bearer=bearer,
    variable="NearNadir_Composite_Snow_Free",
)
fig, ax = plt.subplots()

ntl_r["NearNadir_Composite_Snow_Free"].sel(time="2023-01-01").plot(
    robust=True, cmap="cividis"
)
cx.add_basemap(ax, crs=gdf.crs.to_string(), source=cx.providers.CartoDB.Positron)

plt.axis("off")
plt.tight_layout()
../_images/5e2915a2e624b5d38fa7cbc25d02b5a4ac8791bbf291143fec1d25a868b5f969.png

Number of Observations #

Black Marble removes poor quality observations, such as pixels covered by clouds. To determine the number of observations used to generate nighttime light values for each pixel, we add _Num to the variable name.

cf_r = bm_raster(
    gdf,
    product_id="VNP46A3",
    date_range="2023-01-01",
    bearer=bearer,
    variable="NearNadir_Composite_Snow_Free_Num",
)
Hide code cell source
fig, ax = plt.subplots()

cf_r["NearNadir_Composite_Snow_Free_Num"].sel(time="2023-01-01").plot(
    robust=True, cmap=cc.cm.bmy
)
cx.add_basemap(ax, crs=gdf.crs.to_string(), source=cx.providers.CartoDB.Positron)

plt.axis("off")
plt.tight_layout()
../_images/e05e347706acc17aa80cfd55a6bb25c1d132f76d5d1ba17247e9036dad1d6cb1.png

Quality #

For monthly and annual data, the quality values are:

  • 0: Good-quality, The number of observations used for the composite is larger than 3

  • 1: Poor-quality, The number of observations used for the composite is less than or equal to 3

  • 2: Gap filled NTL based on historical data

  • 255: Fill value

We can map quality by adding _Quality to the variable name.

quality_r = bm_raster(
    gdf,
    product_id="VNP46A3",
    date_range="2023-01-01",
    bearer=bearer,
    variable="NearNadir_Composite_Snow_Free_Quality",
)
Hide code cell source
fig, ax = plt.subplots()

# Create a discrete colormap
cmap = mcolors.ListedColormap(["blue", "green", "yellow", "red"])
# Plot
quality_r["NearNadir_Composite_Snow_Free_Quality"].sel(time="2023-01-01").plot(
    ax=ax, cmap=cmap
)
cx.add_basemap(ax, crs=gdf.crs.to_string(), source=cx.providers.CartoDB.Positron)

plt.axis("off")
plt.tight_layout();
../_images/387b10dcefc1c9c92e357959dcd42fa3b439911a035e5313f5f1838fcf5c633f.png

Nighttime lights for good quality observations #

The quality_flag_rm parameter determines which pixels are set to NA based on the quality indicator. By default, only pixels with a value of 255 are filtered out. However, if we also want to remove poor quality pixels, we can adjust the quality_flag_rm parameter.

ntl_good_qual_r = bm_raster(
    gdf,
    product_id="VNP46A3",
    date_range="2023-01-01",
    bearer=bearer,
    variable="NearNadir_Composite_Snow_Free",
    quality_flag_rm=[1, 255],
)
Hide code cell source
fig, ax = plt.subplots()

# Plot
ntl_r["NearNadir_Composite_Snow_Free"].sel(time="2023-01-01").plot(
    ax=ax, cmap=cc.cm.bmy, robust=True
)
cx.add_basemap(ax, crs=gdf.crs.to_string(), source=cx.providers.CartoDB.Positron)

plt.axis("off")
plt.tight_layout()
../_images/5679db4f85ba9e4c3ad832ef601b676b0c1ee95aa84858160ea43c59a1595100.png