Vanuatu

Authors: David Newhouse, Andres Chamorro

This notebook presents the results of a poverty mapping exercise for Vanuatu at the council level. The main objective is to assess whether small area estimation with geospatial features is a feasible alternative to the traditional method, which relies on a census.

Data Sources

  • National Sustainable Development Baseline Survey (2019-2020)

  • Population Grid (WorldPop 2020)

  • Geospatial Features (listed below)

|    | Description                          | Source                             |
|----|--------------------------------------|------------------------------------|
| 0  | Population and demographic structure | WorldPop unconstrained             |
| 1  | Elevation, DEM, 30 meters            | COPERNICUS 2010                    |
| 2  | Distance to any road                 | OpenStreetMap                      |
| 3  | Land cover shares                    | ESA Worldcover 10m 2020            |
| 4  | Built-up area                        | World Settlement Footprint 2019    |
| 5  | Nighttime lights 2020 composite      | VIIRS                              |
| 6  | Electrification rate                 | High Resolution Energy Access 2019 |
| 7  | Building heights                     | WSF 3D 2019                        |
| 8  | Count of buildings                   | OpenStreetMap                      |
| 9  | Count of cell towers                 | OpenCelliD                         |
| 10 | Precipitation                        | CHIRPS                             |
| 11 | Temperature                          | TerraClimate (4km)                 |
| 12 | Drought Index                        | TerraClimate (4km)                 |
| 13 | NDVI                                 | MODIS (250m)                       |

Methodology

First, we create a uniform grid to aggregate population and the various geospatial statistics. We use the H3 hexagonal grid at level 6, at which each grid cell covers roughly 36 sq. km.
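As a rough check on the cell size quoted above, the average H3 cell area can be approximated from the number of cells at each resolution. The sketch below assumes the standard H3 cell count of 2 + 120 × 7^r at resolution r and an approximate Earth surface area of 510 million sq. km; the function names are illustrative and are not part of the notebook.

```python
# Approximate surface area of the Earth in square kilometres (assumed constant).
EARTH_SURFACE_KM2 = 510_065_622


def h3_cell_count(resolution: int) -> int:
    """Number of H3 cells at a resolution: 12 pentagons plus hexagons, 2 + 120 * 7**r in total."""
    return 2 + 120 * 7 ** resolution


def h3_average_cell_area_km2(resolution: int) -> float:
    """Average cell area, obtained by dividing the Earth's surface evenly across all cells."""
    return EARTH_SURFACE_KM2 / h3_cell_count(resolution)


if __name__ == "__main__":
    for res in (5, 6, 7):
        print(f"resolution {res}: {h3_average_cell_area_km2(res):.1f} sq. km per cell")
```

Under these assumptions, resolution 6 gives an average of about 36 sq. km per cell, while each step to a finer resolution shrinks the average cell by a factor of about seven (roughly 5 sq. km at resolution 7).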

The map below shows the grid with a subset of statistics.

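# Imports for this cell; out_dir is assumed to be defined in an earlier setup cell.
from os.path import join

import folium as flm
import geopandas as gpd
import pandas as pd

# Load the hexagonal grid and the per-cell geospatial statistics.
grid = gpd.read_file(join(out_dir, "h3_res7.geojson"))
stats = pd.read_csv(join(out_dir, "david", "grid_constrained.csv"))
# Rescale cropland by 100 so it reads as a percentage (see the '% Cropland' caption).
stats["cropland"] = stats["cropland"] * 100

# Keep only the grid cells that have statistics.
grid = grid.merge(stats, on="hex_id", how="inner")

# Base layer: population per grid cell.
m = grid.explore(
    column="pop_sum",
    tooltip="pop_sum",
    cmap="YlGnBu",
    scheme="naturalbreaks",
    legend_kwds=dict(colorbar=True, caption="Population", interval=True),
    name="Population, World Pop",
    # tiles="Stamen Terrain"
)

# Additional layers are drawn on the same map by passing m=m.
m = grid.explore(
    m=m,
    column="cropland",
    tooltip="cropland",
    cmap="Greens",
    scheme="naturalbreaks",
    # Values are already scaled to percentages above, so format them as plain numbers
    # ("{:.0%}" would multiply by 100 a second time).
    legend_kwds=dict(colorbar=True, caption="% Cropland", interval=True, fmt="{:.0f}"),
    name="Cropland",
)

m = grid.explore(
    m=m,
    column="viirs_sum",
    tooltip="viirs_sum",
    cmap="viridis",
    scheme="naturalbreaks",
    legend_kwds=dict(colorbar=True, caption="Sum of Lights", interval=True),
    name="Nighttime Lights",
)

m = grid.explore(
    m=m,
    column="building_heights_mean",
    tooltip="building_heights_mean",
    cmap="Reds",
    scheme="naturalbreaks",
    legend_kwds=dict(colorbar=True, caption="Average Building Height", interval=True),
    name="Building Heights",
)

# Add a control to toggle the layers, then display the map.
flm.LayerControl("topright", collapsed=False).add_to(m)
m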

We then join the spatial grid to the administrative boundaries at which we want to predict poverty, in this case councils (admin-2). This results in a simulated census file with estimated population (from WorldPop), admin-2 attributes, and geospatial covariates.
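A minimal sketch of this kind of grid-to-council join is shown below, using toy geometries in place of the real H3 grid and boundary files. It assigns each cell to the council that contains a representative point of the cell, which is one possible rule and an assumption here, since the notebook does not state how cells straddling a boundary are handled. The P-codes and values are made up, and the `predicate` argument assumes geopandas 0.10 or later.

```python
import geopandas as gpd
from shapely.geometry import box

# Toy grid cells with a population column (illustrative values only).
grid = gpd.GeoDataFrame(
    {"hex_id": ["a", "b", "c"], "pop_sum": [120.0, 40.0, 75.0]},
    geometry=[box(0, 0, 1, 1), box(1, 0, 2, 1), box(2, 0, 3, 1)],
    crs="EPSG:4326",
)

# Toy admin-2 polygons with hypothetical P-codes.
adm2 = gpd.GeoDataFrame(
    {"ADM2_PCODE": ["X01", "X02"]},
    geometry=[box(0, 0, 2, 1), box(2, 0, 3, 1)],
    crs="EPSG:4326",
)

# Represent each cell by a point guaranteed to lie inside it, so that every
# cell matches at most one council.
cell_points = grid.set_geometry(grid.geometry.representative_point())

# Attach the council attributes to each cell.
joined = gpd.sjoin(cell_points, adm2, how="left", predicate="within")

# One row per cell: the skeleton of a simulated census file.
simulated_census = joined[["hex_id", "ADM2_PCODE", "pop_sum"]]
print(simulated_census)
```

In a real workflow the geospatial covariates would be carried along as extra columns on the grid, so they end up in the same table.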

The second input into our model is the set of geocoded household survey responses, to which we also merge all of the geospatial features.

With these two files, we run a number of Small Area Estimation models in Stata to impute household welfare into the population grid, resulting in small area estimates of poverty at the council level.
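The models themselves are run in Stata and are not reproduced here. The sketch below only illustrates the final aggregation arithmetic: once each grid cell has a predicted share of poor population, a council-level head count rate is the population-weighted average of its cells. The column names `pred_poor_share` and `adm2_pcode` and all values are hypothetical.

```python
import pandas as pd

# Toy grid cells with a council code, population, and a predicted poverty share.
cells = pd.DataFrame(
    {
        "hex_id": ["a", "b", "c", "d"],
        "adm2_pcode": ["X01", "X01", "X02", "X02"],
        "pop_sum": [100.0, 300.0, 50.0, 150.0],
        "pred_poor_share": [0.10, 0.30, 0.20, 0.40],
    }
)

# Expected number of poor people in each cell.
cells["poor_pop"] = cells["pop_sum"] * cells["pred_poor_share"]

# Sum poor and total population by council, then take the ratio.
council = cells.groupby("adm2_pcode")[["poor_pop", "pop_sum"]].sum()
council["head_count"] = council["poor_pop"] / council["pop_sum"]
print(council)
```

Weighting by population matters because sparsely populated cells would otherwise count as much as dense ones.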

Results

The map below shows the poverty rates estimated with an Empirical Best Predictor (EBP) model with benchmarking.

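# gpd, pd, join, pac_dir and out_dir are assumed to be defined in earlier cells.
# Load the admin-2 (council) boundaries and the poverty estimates exported from Stata.
adm2 = gpd.read_file(join(pac_dir, "admin", "adm2", "vut_admbnda_adm2_spc_20180824.shp"))
pov = pd.read_stata(join(out_dir, "Vanuatu_geospatial_poverty_estimates.dta"))
# Attach the estimates to the council polygons by admin-2 P-code.
adm2 = adm2.merge(pov, left_on="ADM2_PCODE", right_on="adm2_pcode")

# Map the benchmarked EBP head count rate, binned into equal intervals.
m = adm2.explore(
    column="Head_Count_ebp_bmark",
    tooltip="Head_Count_ebp_bmark",
    cmap="YlOrRd",
    scheme="equalinterval",
    popup=True,
    legend_kwds=dict(colorbar=False, caption="Poverty Rate, EBP", fmt="{:.0%}", interval=True),
)
# folium.TileLayer('Stamen Toner', control=True).add_to(m)
m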
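The notebook does not say which benchmarking procedure is applied. One common choice is ratio benchmarking, in which all area estimates are rescaled by a single factor so that their population-weighted average matches a trusted higher-level figure, such as a direct survey estimate of the national rate. The sketch below illustrates that idea with made-up numbers; the function name and inputs are hypothetical.

```python
import numpy as np


def ratio_benchmark(estimates, population, target_rate):
    """Rescale area estimates so their population-weighted mean equals target_rate."""
    estimates = np.asarray(estimates, dtype=float)
    population = np.asarray(population, dtype=float)
    weights = population / population.sum()
    # Aggregate rate implied by the unadjusted model estimates.
    implied_rate = float(np.sum(weights * estimates))
    return estimates * (target_rate / implied_rate)


# Toy inputs: three councils, their populations, and an assumed national rate.
model_rates = [0.10, 0.20, 0.40]
population = [5000, 3000, 2000]
adjusted = ratio_benchmark(model_rates, population, target_rate=0.18)
print(adjusted)
```

A single multiplicative factor preserves the ranking of areas, but it can push a high rate above 1, so a production implementation would need to handle that case.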

Comparison with Direct Estimates

We examine the correlation between our estimates and direct estimates computed from official census data. The table below reports the correlation between each model and the census-based direct estimate.

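| Method                  | Total | Sampled Area Councils only |
|-------------------------|-------|----------------------------|
| EBP                     | 0.921 | 0.920                      |
| EBP with benchmarking   | 0.926 | 0.926                      |
| MERF                    | 0.838 | 0.840                      |
| MERF with benchmarking  | 0.883 | 0.894                      |
| Direct survey estimates | N/A   | 0.798                      |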
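A comparison of this kind can be computed as sketched below, assuming a plain (unweighted) Pearson correlation across councils; the notebook does not state whether the reported figures are weighted. The arrays hold made-up values for five hypothetical councils and are not the Vanuatu results.

```python
import numpy as np

# Toy council-level poverty rates (illustrative values only).
model_rates = np.array([0.05, 0.12, 0.20, 0.31, 0.44])
census_rates = np.array([0.07, 0.10, 0.24, 0.28, 0.47])
# Flag for councils that were covered by the household survey sample.
sampled = np.array([True, True, False, True, True])

# Pearson correlation across all councils.
r_total = np.corrcoef(model_rates, census_rates)[0, 1]

# Same statistic restricted to the sampled councils.
r_sampled = np.corrcoef(model_rates[sampled], census_rates[sampled])[0, 1]

print(f"all councils: {r_total:.3f}, sampled only: {r_sampled:.3f}")
```

Restricting to sampled councils is what makes the direct survey estimates comparable, since they are only defined where the survey collected data.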

We find that the geospatial estimates track the census-based estimates closely: the EBP models reach correlations above 0.92, compared with 0.798 for the direct survey estimates. This is promising, as we might be able to update these estimates in non-census years, or estimate poverty in other areas where a census is not available.