Movement Analysis#
Movement analysis studies population movement patterns during crises to identify trends, disruptions, and responses. This includes analyzing travel distances, displacement, and migration to inform disaster response and recovery strategies.
This analysis uses several datasets provided by Meta through the Development Data Partnership, under the proposal "Using Alternative Data to Understand Crisis Impact on Poverty in Bangladesh, Sri Lanka, and India".
Movement Distribution#
The Movement Distribution dataset provides daily insights into mobility patterns, helping answer questions such as:
How far do people travel from home on average?
How does mobility vary with changes in public health messaging, policy, or crisis events?
This dataset spans December 1, 2022, to December 1, 2024, capturing population movement through Facebook mobile app data.
The dataset relies on Facebook users with enabled Location Services. While it offers timely large-scale mobility insights, it may not represent the entire population evenly.
Data Aggregation#
Spatial Aggregation: Data is aggregated to GADM administrative boundaries at level 2 (~county-level), falling back to level 1 (~state-level) where level 2 data is unavailable.
Temporal Aggregation: Aggregated over 24-hour periods in Pacific Time.
Distance Categories: Movement is grouped into 4 distance ranges:
0 km
>0 km but <10 km
10 km to <100 km
100+ km
Privacy Measures: Noise is added, and regions with fewer than 10 users are excluded.
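As a minimal sketch of the four distance ranges listed above, the function below maps a home-to-visit distance in kilometres to a range label. The label strings are hypothetical placeholders, not the values the dataset actually stores in its category column.

```python
def distance_category(distance_km: float) -> str:
    """Map a home-to-visit distance (km) to one of the four distance ranges.

    The returned labels are illustrative assumptions, not the dataset's own.
    """
    if distance_km < 0:
        raise ValueError("distance must be non-negative")
    if distance_km == 0:
        return "0"
    if distance_km < 10:
        return "(0, 10)"
    if distance_km < 100:
        return "[10, 100)"
    return "100+"
```

Note that the boundaries are half-open: exactly 10 km falls in the third range and exactly 100 km in the fourth, matching the "10 km to <100 km" and "100+ km" wording.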
Key Variables#
GADM ID (gadm_id): Unique identifier for polygons.
Polygon Name (polygon_1_name): Admin level 2 region names.
Country (country): 2-letter ISO code.
Distance Category (home_to_ping_distance_category): Travel distance ranges.
Ping Fraction (distance_category_ping_fraction): Fraction of users in each distance category (privacy-adjusted).
Date (ds): Reporting date.
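To illustrate how these variables fit together, here is a small sketch that sums the ping fractions of the longer distance ranges per region and day. The column names come from the list above; the sample rows, the region identifier, and the category labels are made up for illustration and are assumptions about the file layout.

```python
import csv
import io
from collections import defaultdict

# Illustrative rows only: values, region id, and category labels are invented.
SAMPLE = """gadm_id,polygon_1_name,country,home_to_ping_distance_category,distance_category_ping_fraction,ds
REGION_A,Example Region,BD,0,0.40,2023-01-01
REGION_A,Example Region,BD,"(0, 10)",0.35,2023-01-01
REGION_A,Example Region,BD,"[10, 100)",0.20,2023-01-01
REGION_A,Example Region,BD,100+,0.05,2023-01-01
"""

# Hypothetical labels for the two ranges covering 10 km or more.
LONG_RANGE = {"[10, 100)", "100+"}


def long_range_share(csv_text: str) -> dict:
    """Return {(gadm_id, ds): summed fraction of users seen 10+ km from home}."""
    shares = defaultdict(float)
    for row in csv.DictReader(io.StringIO(csv_text)):
        if row["home_to_ping_distance_category"] in LONG_RANGE:
            key = (row["gadm_id"], row["ds"])
            shares[key] += float(row["distance_category_ping_fraction"])
    return dict(shares)
```

Calling `long_range_share(SAMPLE)` gives one entry per region and date, which can then be plotted as a daily time series.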
Methodology#
Data Collection:
Nighttime pings (8 PM - 6 AM) identify home locations.
Daytime pings represent “visit” locations.
Distance Calculation:
Compute distances between home and visit tiles.
Data Aggregation:
Aggregate percentages by distance category with privacy adjustments.
Key Metric:
Classifies users based on daily travel distances.
More details are available on the Range of Motion Maps page.
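The collection and distance steps above can be sketched for a single user as follows. This is a simplified illustration, not Meta's implementation: it assumes pings arrive as hypothetical `(hour, lat, lon)` tuples, takes the most frequent nighttime location as home, and uses the haversine formula on raw coordinates instead of tile centers.

```python
import math
from collections import Counter


def is_night(hour: int) -> bool:
    """True for the 8 PM - 6 AM window used to identify home locations."""
    return hour >= 20 or hour < 6


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km on a sphere of radius 6371 km."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def home_location(pings):
    """Most frequent nighttime (lat, lon), or None if there are no night pings."""
    night = [(lat, lon) for hour, lat, lon in pings if is_night(hour)]
    if not night:
        return None
    return Counter(night).most_common(1)[0][0]


def visit_distances_km(pings):
    """Distance from home to each daytime ("visit") ping."""
    home = home_location(pings)
    if home is None:
        return []
    return [
        haversine_km(home[0], home[1], lat, lon)
        for hour, lat, lon in pings
        if not is_night(hour)
    ]
```

Each resulting distance would then be assigned to a distance category and aggregated across users, with the privacy adjustments applied at the aggregate level.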
Facebook Population During Crisis#
The Facebook Population During Crisis dataset provides insights into population movements and density changes during crisis events. It relies on aggregated and privacy-preserved location data from Facebook users with Location Services enabled.
Data Aggregation#
Temporal Aggregation: Tracks changes in population density over fixed 8-hour time windows (Pacific Time).
Spatial Aggregation: Data is reported at Bing tile level 14 (roughly 2.4 km × 2.4 km at the equator) for precise localization.
Dynamic Monitoring: Highlights population increases or decreases during crises, aiding evacuation and service placement planning.
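For readers unfamiliar with Bing tiles, the sketch below shows how a coordinate maps to a tile identifier (quadkey) and how large a tile is at a given zoom level. It follows the publicly documented Bing Maps tile system in simplified form (it skips the pixel-rounding step), and the function names are our own.

```python
import math

EARTH_RADIUS_M = 6378137  # WGS84 equatorial radius used by Web Mercator


def tile_to_quadkey(tile_x: int, tile_y: int, level: int) -> str:
    """Interleave the bits of tile X/Y into a base-4 quadkey string."""
    digits = []
    for i in range(level, 0, -1):
        mask = 1 << (i - 1)
        digit = 0
        if tile_x & mask:
            digit += 1
        if tile_y & mask:
            digit += 2
        digits.append(str(digit))
    return "".join(digits)


def latlon_to_quadkey(lat: float, lon: float, level: int) -> str:
    """Quadkey of the tile containing (lat, lon) at the given level."""
    lat = max(min(lat, 85.05112878), -85.05112878)
    lon = max(min(lon, 180.0), -180.0)
    x = (lon + 180.0) / 360.0
    sin_lat = math.sin(math.radians(lat))
    y = 0.5 - math.log((1 + sin_lat) / (1 - sin_lat)) / (4 * math.pi)
    n = 1 << level
    tile_x = min(max(int(x * n), 0), n - 1)
    tile_y = min(max(int(y * n), 0), n - 1)
    return tile_to_quadkey(tile_x, tile_y, level)


def tile_side_m(lat: float, level: int) -> float:
    """Approximate ground length of one tile edge, in metres."""
    circumference = 2 * math.pi * EARTH_RADIUS_M
    return math.cos(math.radians(lat)) * circumference / (1 << level)
```

At level 14 a tile edge is about 2,446 m at the equator and shrinks with the cosine of the latitude, so tiles over Bangladesh or Sri Lanka are somewhat smaller on the ground.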
Applications#
Evacuation Patterns: Understand where populations are relocating during disasters.
Service Positioning: Identify regions requiring immediate relief efforts.
Connectivity Insights: Detect potential connectivity issues based on sudden population decreases.
Key Variables#
Latitude/Longitude: Tile center coordinates.
Quadkey: Unique tile identifier in the Bing system.
Date/Time: Start of the reporting period in Pacific Time.
Baseline (n_baseline): Average number of users in the 90 days prior to the crisis.
Users During Crisis (n_crisis): Active users during the crisis period.
Percent Change (percent_change): Relative change in user counts from baseline.
Z-score: Highlights significant deviations between baseline and crisis values.
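The two derived metrics can be illustrated with textbook formulas. This is a sketch under assumptions: the exact formulas Meta uses may include additional terms, and `baseline_std` (the spread of the baseline counts) is a hypothetical input that is not among the variables listed above.

```python
def percent_change(n_baseline: float, n_crisis: float):
    """Relative change from baseline, in percent; None if baseline is zero."""
    if n_baseline == 0:
        return None
    return 100.0 * (n_crisis - n_baseline) / n_baseline


def z_score(n_baseline: float, n_crisis: float, baseline_std: float):
    """Deviation from baseline in units of baseline standard deviation.

    baseline_std is an assumed input; returns None when it is zero.
    """
    if baseline_std == 0:
        return None
    return (n_crisis - n_baseline) / baseline_std
```

Reading the two together is useful: a large percent change on a tile with a noisy baseline can still have a small z-score, which suggests the change is within normal variation.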
Methodology#
Baseline Calculation: Baseline values are derived from the average number of users in the 90 days preceding the crisis. These are segmented by day of the week and time window. Winsorization is used to manage outliers in the baseline data.
Crisis Period Counts: During the crisis period, the dataset records the number of users present in each tile for each 8-hour time window. If counts are below the privacy threshold of 10 users, they are nullified or dropped.
To ensure user privacy, the dataset incorporates:
Random Noise: Adds noise to counts to obscure exact numbers.
Spatial Smoothing: Averages counts with nearby locations.
Minimum Thresholds: Excludes rows with fewer than 10 users.
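A rough sketch of the baseline and threshold logic described above is given below. It assumes observations arrive as hypothetical `(weekday, window, count)` tuples, and the 5th/95th percentile winsorization cutoffs are illustrative choices, since the source does not state which limits are used. Noise addition and spatial smoothing are not shown.

```python
from collections import defaultdict
from statistics import mean


def winsorize(values, lower=0.05, upper=0.95):
    """Clamp values to empirical quantiles (cutoffs here are assumptions)."""
    ordered = sorted(values)
    n = len(ordered)
    lo = ordered[int(lower * (n - 1))]
    hi = ordered[int(upper * (n - 1))]
    return [min(max(v, lo), hi) for v in values]


def baseline_by_slot(observations):
    """Average winsorized counts per (weekday, 8-hour window) slot."""
    slots = defaultdict(list)
    for weekday, window, count in observations:
        slots[(weekday, window)].append(count)
    return {slot: mean(winsorize(counts)) for slot, counts in slots.items()}


def apply_threshold(count, minimum=10):
    """Suppress counts below the privacy threshold of 10 users."""
    return None if count < minimum else count
```

Segmenting by weekday and time window means a Monday-morning crisis count is compared with earlier Monday mornings, not with a weekend night.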
Facebook Colocation Dataset#
The Facebook Colocation Dataset provides estimates of how frequently individuals from different geographic regions are located in the same area at the same time (colocated), enabling analysis of population interaction patterns. This data is particularly useful for modeling population dynamics during crises, such as predicting the spread of infectious diseases.
Data Aggregation#
Temporal Aggregation:
Colocation Events: Defined using 5-minute intervals.
Weekly Aggregation: Daily colocation rates are averaged over 7 consecutive days to calculate the weekly colocation rate.
Spatial Aggregation:
Colocation rates are reported at administrative level 2 (e.g., districts in Bangladesh) where privacy thresholds are met.
Falls back to administrative level 1 (e.g., divisions in Bangladesh) if sufficient data is unavailable.
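The weekly aggregation step amounts to a simple mean over seven consecutive daily rates. A minimal sketch follows, assuming the daily rates for a pair of regions are already available as a list ordered by date; the function names are illustrative.

```python
def weekly_colocation_rate(daily_rates):
    """Mean of exactly 7 consecutive daily colocation rates."""
    if len(daily_rates) != 7:
        raise ValueError("expected 7 daily rates")
    return sum(daily_rates) / 7


def weekly_rates(daily_rates):
    """Split a date-ordered series into full 7-day blocks and average each.

    Trailing days that do not fill a complete week are ignored.
    """
    full_weeks = len(daily_rates) // 7
    return [
        weekly_colocation_rate(daily_rates[i * 7:(i + 1) * 7])
        for i in range(full_weeks)
    ]
```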
Key Variables#
Weekly Measured Colocation Rate: The rate at which people from two regions are colocated within the same Bing tile (level 16, roughly 600 m × 600 m at the equator) during a randomly chosen 5-minute period.
Weekly Colocation Rate: The adjusted colocation rate accounting for the fraction of time when individuals from two regions could be observed simultaneously.
Limitations#
Spatial Boundaries: Colocation rates are limited to regions within the same country; cross-country colocation is not reported.
Data Representation: The dataset reflects only Facebook users who have Location Services enabled, which may not represent the entire population.
Distance Granularity: GPS-derived data cannot reliably differentiate colocation at short distances (e.g., 6 feet for COVID-19 modeling).
Movement Between Places During Crisis#
(To be added: Overview, variables, methodology)
Travel Patterns#
(To be added: Overview, variables, methodology)
General Implementation#
The data was sourced from the Meta Data For Good portal and aligned with UNOCHA shapefiles for consistency. Each dataset was processed to extract insights, with visualizations presented in the attached notebook.
Limitations#
The methodology presented is an investigative pilot that aims to shed light on the economic situation in Syria and Türkiye by leveraging alternative data when confronted with the absence of traditional data and methods.
Caution
Beyond the pending peer review, the limitations can be summarized as follows.
The methodology relies on private intent data in the form of mobility data. In other words, the input data was not produced or collected with the primary goal of analyzing the population of interest or addressing the research question, but was repurposed for the public good. The benefits and caveats of using private intent data have been extensively discussed in the World Development Report 2021 [WorldBank21].
On the one hand, the mobility data panel is spatially and temporally granular and readily available. On the other hand, it is a convenience sample, which constitutes an important source of bias. The panel composition is not entirely known and is susceptible to change, and neither the data collection nor the composition of the panel can be controlled.
In summary, the results cannot be generalized to the movement of the entire population, but they can potentially provide information on movement panels to inform the Syrian economic situation, considering time constraints and the scarcity of traditional data sources in the context of Syria.