Meta Mobile Network Connectivity (Illustrative)

15.4. Meta Mobile Network Connectivity (Illustrative)#

The Network Coverage Maps show the number of points (~600m) that are likely to not have mobile network coverage compared to a 30-day prior baseline. This dataset is calculated by Meta for each particular crisis. This data is then aggregated by administrative level to find out how connectivity changed after the event of interest and how it recovered over time. The following code shows how to derive indicators for this dataset.

import pandas as pd
import geopandas as gpd
from shapely import Point
import glob
from datetime import datetime

# For plotting
from bokeh.plotting import figure, show, output_notebook
from bokeh.models import Span
import bokeh
from bokeh.layouts import column
from bokeh.models import Legend, TabPanel, Tabs
from bokeh.core.validation.warnings import EMPTY_LAYOUT, MISSING_RENDERERS
bokeh.core.validation.silence(EMPTY_LAYOUT, True)
bokeh.core.validation.silence(MISSING_RENDERERS, True)

{Warning(code=1000, name='MISSING_RENDERERS', description='Plot has no renderers'),
 Warning(code=1002, name='EMPTY_LAYOUT', description='Layout has no children')}

# Read the data
all_files = glob.glob("../../data/internet-connectivity/proprietary/meta-network-connectivity/network_undetected/*.csv")
networkCoverage = pd.DataFrame()

li = []

for file in all_files: # Read daily files
    df1 = pd.read_csv(file)
    df1["date"] = datetime.strptime(file.split("_")[2], "%Y-%m-%d") # Assign a date based on the date from the file name
    li.append(df1)

networkCoverage = pd.concat(li, axis=0) # Concatenate daily files
networkCoverage = networkCoverage[networkCoverage["country"] == "TR"] # Make sure to read only data for the desired country

# Available time spam
networkCoverage.date.min(), networkCoverage.date.max()

(Timestamp('2023-02-05 00:00:00'), Timestamp('2023-03-25 00:00:00'))

def convert_to_gdf(df):
    geometry = [Point(xy) for xy in zip(df.lon, df.lat)]
    gdf = gpd.GeoDataFrame(df, crs="EPSG:4326", geometry=geometry)
    return gdf

def aggregate_data(data):
    df = convert_to_gdf(data)
    df = turkey_adm2.sjoin(df, how="left")
    df = (
        df.groupby(["adm2_en", "date"])
        .sum("no_coverage")
        .reset_index()[["adm2_en", "date", "no_coverage"]]
    ) #aggregate by administrative level 2
    
    df = (
        df.merge(turkey_adm2[["adm2_en", "adm1_en"]], on=["adm2_en"], how="right")
        .sort_values(by=["adm2_en", "date"])
        .reset_index(drop=True)
    ) #add administrative level 1 information
    return df
    

# Read turkey shapefile and areas of interest
turkey_adm2 = gpd.read_file(
    "../../data/internet-connectivity/turkey_administrativelevels0_1_2/tur_polbna_adm2.shp")

# Aoi is an area of interest around the earthquake event
aoi = gpd.read_file("../../data/internet-connectivity/aoi/SYRTUR_tessellation.gpkg")
affected_adm1 = list(
    turkey_adm2[turkey_adm2["adm1"].isin(aoi["ADM1_PCODE"].unique())][
        "adm1_en"
    ].unique()
) # Extract admin 1 boundaries in the aoi
affected_adm2 = list(
    turkey_adm2[turkey_adm2["pcode"].isin(aoi["ADM2_PCODE"].unique())][
        "adm2_en"
    ].unique()
) # Extract admin 2 boundaries in the aoi

# Aggregate data by administrative level 2
networkCoverage = aggregate_data(networkCoverage)

# Aggregate data weekly
week = (
    networkCoverage.groupby(
        [pd.Grouper(key="date", freq="W-MON"), "adm2_en", "adm1_en"]
    )
    .mean()
    .reset_index()
)

15.4.1. Plot the number of points without coverage by administrative level 2, aggregated by administrative level 1#

This plot will allow to detect how each administrative level 2 boundary was affected differently by the crisis event and how they recovered from it. Adding a tab by administrative level 1 boundary makes the plot easier for reading.

bokeh.core.validation.silence(EMPTY_LAYOUT, True)

def get_line_plot(networkCoverage, title, source, earthquakes=False, subtitle=None):
    p2 = figure(x_axis_type="datetime", width=800, height=400, toolbar_location="above")
    p2.add_layout(Legend(), "right")

    # Add a line for each administrative level 2
    for id, adm2 in enumerate(networkCoverage["adm2_en"].unique()):
        df = networkCoverage[networkCoverage["adm2_en"] == adm2][
            ["date", "no_coverage"]
        ].reset_index(drop=True)
        p2.line(
            df["date"],
            df["no_coverage"],
            line_width=2,
            line_color=color_palette[id],
            legend_label=adm2,
        )

    p2.legend.click_policy = "hide"
    if subtitle is not None:
        p2.title = subtitle

    title_fig = figure(
        title=title,
        toolbar_location=None,
        width=800,
        height=40,
    )
    title_fig.title.align = "left"
    title_fig.title.text_font_size = "20pt"
    title_fig.border_fill_alpha = 0
    title_fig.outline_line_width = 0

    sub_title = figure(
        title=source,
        toolbar_location=None,
        width=800,
        height=40,
    )
    sub_title.title.align = "left"
    sub_title.title.text_font_size = "10pt"
    sub_title.title.text_font_style = "normal"
    sub_title.border_fill_alpha = 0
    sub_title.outline_line_width = 0

    layout = column(title_fig, p2, sub_title)

    if earthquakes:
        p2.renderers.extend(
            [
                Span(
                    location=datetime(2023, 2, 6),
                    dimension="height",
                    line_color="#7C7C7C",
                    line_width=2,
                    line_dash=(4, 4),
                ),
                Span(
                    location=datetime(2023, 2, 20),
                    dimension="height",
                    line_color="#7C7C7C",
                    line_width=2,
                    line_dash=(4, 4),
                ),
            ]
        )

    return layout

# Color palette can be defined for each project
color_palette = [
    "#4E79A7",  # Blue
    "#F28E2B",  # Orange
    "#E15759",  # Red
    "#76B7B2",  # Teal
    "#59A14F",  # Green
    "#EDC948",  # Yellow
    "#B07AA1",  # Purple
    "#FF9DA7",  # Pink
    "#9C755F",  # Brown
    "#BAB0AC",  # Gray
    "#7C7C7C",  # Dark gray
    "#6B4C9A",  # Violet
    "#D55E00",  # Orange-red
    "#CC61B0",  # Magenta
    "#0072B2",  # Bright blue
    "#329262",  # Peacock green
    "#9E5B5A",  # Brick red
    "#636363",  # Medium gray
    "#CD9C00",  # Gold
    "#5D69B1",  # Medium blue
]

# Create a plot that has a tab for each admin level 1 boundary in which it can be observed the data 
# disaggregated by admin level 2 boundaries
output_notebook() # For printing the image in the notebook
tabs = [] # Add tabs to divide data by admin boundary
for adm in affected_adm1:
    df = networkCoverage[networkCoverage["adm1_en"] == adm]

    tabs.append(
        TabPanel(
            child=get_line_plot(
                df,
                "Areas with Undetected Network post earthquake",
                "Source: Data for Good Meta",
                earthquakes=True,
                subtitle="~600m areas without network post earthquake compared to 30 day prior baseline",
            ),
            title=adm.capitalize(),
        )
    )

tabs = Tabs(tabs=tabs, sizing_mode="scale_both")
show(tabs, warn_on_missing_glyphs=False)

Loading BokehJS ...

15.4.2. Choropleth with time slider#

This step plots the network coverage data aggregated at administrative level 2 with a time slider to quickly visualize how the network coverage changed across the days after the earthquake.

import json
import folium
from folium.plugins import TimestampedGeoJson, TimeSliderChoropleth
import branca

turkey_adm2.set_index('OBJECTID', inplace = True)
turkey_adm2_json = json.loads(turkey_adm2.to_json())

networkCoverage = pd.merge(networkCoverage, turkey_adm2.reset_index()[['adm1_en', 'adm2_en', 'OBJECTID']], on = ['adm2_en', 'adm1_en'], how = 'right')

networkCoverage['unixtime'] = networkCoverage.date.apply(lambda x: int(x.timestamp()))
all_values = list(networkCoverage['no_coverage'].unique())
min_value = min(all_values)
max_value = max(all_values)
colormap = branca.colormap.linear.YlOrRd_09.scale(min_value, max_value)
colormap.caption = 'Undetected Network Points'
networkCoverage['color'] = networkCoverage['no_coverage'].apply(lambda x: colormap(x))
networkCoverage['opacity'] = 0.8

styledata = {}
for idx in turkey_adm2.index:
    df = networkCoverage[(networkCoverage['OBJECTID']==idx)].copy()
    df.set_index('unixtime', inplace = True)
    styledata[idx] = df.drop('date', axis = 1)

styledict = {
    str(idx): data.to_dict(orient="index") for idx, data in styledata.items()
}

# Create the Folium map
m = folium.Map(location=[37.385911, 37.331974], zoom_start=6)

# Add the time slider
TimeSliderChoropleth(
    turkey_adm2_json,
    styledict=styledict,
).add_to(m)
colormap.add_to(m)
m

Make this Notebook Trusted to load map: File -> Trust Notebook

15.4.3. Plot the weekly average number of points without network coverage by administrative level 2, aggregated by administrative level 1#

output_notebook()

tabs = []

for adm in affected_adm1:
    df = week[week["adm1_en"] == adm]

    tabs.append(
        TabPanel(
            child=get_line_plot(
                df,
                "Weekly trends for Areas with Undetected Network post earthquake",
                "Source: Data for Good Meta",
                earthquakes=True,
                subtitle="~600m areas without network post earthquake compared to 30 day prior baseline",
            ),
            title=adm.capitalize(),
        )
    )

tabs = Tabs(tabs=tabs, sizing_mode="scale_both")
show(tabs, warn_on_missing_glyphs=False)

Loading BokehJS ...

15.4.4. Limitations#

In areas affected by a crisis, it is very difficult to distinguish between a nonoperational cell site and one that didn’t receive signals because users have left the area. This methodology relies on a phone’s reporting of cell IDs, and has found that different phone models occasionally report different cell IDs for the same physical base station. This leads to overcounting the number of cell sites covering a grid tile.

In addition, those incorrect cell IDs lead to fewer location samples for the correct cell ID and create additional unreliable coverage polygons for the incorrectly reported cell IDs. As the method relies on “last known location” for coverage area estimates, discrepancies between the actual location and the reported location of a user’s device may be included in the samples for any given cell site, especially when the cell site is close to an airport.