Mapping and Monitoring Conflict

5. Mapping and Monitoring Conflict#

Using ACLED data.

5.1. Summary#

ACLED collects reported information on the type, agents, location, date, and other characteristics of political violence events, demonstration events, and other select non-violent, politically-relevant developments in every country and territory in the world. ACLED focuses on tracking a range of violent and non-violent actions by or affecting political agents, including governments, rebels, militias, identity groups, political parties, external forces, rioters, protesters, and civilians. Source.

This notebook will teach the student how to access ACLED data, analyze it and produce insightful visualizations with it.

5.2. Learning Objectives#

5.2.1. Overall goals#

The main goal of this class is to teach students to work with ACLED data to monitor conflicts in an area of interest.

5.2.2. Specific goals#

At the end of this notebook, you should have gained an understanding and appreciation of the following:

ACLED data:
- How to download the data with the Data Export Tool.
- How to make an API request to the ACLED server.
- How the ACLED data is structured.
Visualizing ACLED data:
- Conflict events over time.
- Conflict heatmaps.
- Maps of events with fatalities.

5.3. Get the ACLED data#

Accessing this dataset requires registering on the ACLED Access Portal. Once the account is approved, you can create a key for retrieving data.

Data can be obtained with the ACLED Data Export Tool or through an API call.

5.3.1. Using the Data Export Tool#

The Data Export Tool can be accessed here.
Set up any necessary filters. This example studies Syria.
Export the data.

Fig. 5.1 Filters available in the ACLED Data Export Tool (../../_images/filters_acled.png).

5.3.2. Using the API#

5.3.2.1. What is an API call?#

An Application Programming Interface (API) allows applications to communicate with each other and share information. This example uses API calls to retrieve information from the ACLED database.

5.3.2.2. Example using API call#

To make the API call, you first need the URL from which the data will be downloaded. ACLED offers an API User Guide that explains how to build this URL, here.

import os
import requests
import pandas as pd

key = os.environ.get('acled') # Retrieve the access key from an environment variable
email = os.environ.get('wb_email') # Use the email address you used to generate the key
if not key or not email:
    raise RuntimeError("Set the 'acled' and 'wb_email' environment variables first")

# Filters to apply. This example uses the same filters that were used to download the data with the Data Export Tool
iso_country_code = 760 # 760 is the ISO numeric code for Syria
start_date = '2017-01-01'
end_date = '2024-05-20'

url = 'https://api.acleddata.com/acled/read?key={}&email={}&iso={}&event_date={}|{}&event_date_where=BETWEEN&limit=0'.format(key, email, iso_country_code, start_date, end_date)

response = requests.get(url)
response.raise_for_status() # Stop on an HTTP error instead of failing later while parsing
data = pd.DataFrame(response.json()['data'])

data.head(3)

	event_id_cnty	event_date	year	time_precision	disorder_type	event_type	sub_event_type	actor1	inter1	...	location	latitude	longitude	geo_precision	source	source_scale	notes	fatalities	timestamp
0	SYR128321	2024-05-20	2024	1	Political violence	Violence against civilians	Abduction/forced disappearance	QSD: Syrian Democratic Forces	Rebel group	...	Al-Hawayij	35.0574	40.4924	1	Facebook; SHAAM	New media-National	On 20 May 2024, QSD forces detained four civil...	0	1720477752
1	SYR128365	2024-05-20	2024	1	Political violence	Battles	Armed clash	Unidentified Armed Group (Syria)	Political militia	...	Talilah	34.5258	38.5272	2	Facebook	New media	On 20 May 2024, an unidentified armed group at...	3	1720477752
2	SYR128369	2024-05-20	2024	1	Strategic developments	Strategic developments	Arrests	Police Forces of Syria (2000-)	State forces	...	Madaya	33.6900	36.0963	1	SOHR	Other	On 20 May 2024, Syrian police forces arrested ...	0	1720477752

3 rows × 31 columns
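
As an alternative to formatting the URL by hand, the query can be kept as a dictionary of parameters and encoded with the standard library. The sketch below is illustrative only: `build_acled_params` is a hypothetical helper, the key and email are placeholders, and the parameter names are simply the ones that appear in the URL above.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.acleddata.com/acled/read"  # Endpoint used in the example above


def build_acled_params(key, email, iso, start_date, end_date):
    """Hypothetical helper: collect the query parameters used in the URL above."""
    return {
        "key": key,
        "email": email,
        "iso": iso,
        # A date range is written as 'start|end' together with event_date_where=BETWEEN
        "event_date": f"{start_date}|{end_date}",
        "event_date_where": "BETWEEN",
        "limit": 0,
    }


# Placeholder credentials, for illustration only
params = build_acled_params("MY_KEY", "me@example.org", 760, "2017-01-01", "2024-05-20")

# urlencode percent-encodes reserved characters such as '|' and '@'
query_url = f"{BASE_URL}?{urlencode(params)}"
print(query_url)
```

The same dictionary can also be passed as `requests.get(BASE_URL, params=params)`, in which case `requests` performs the encoding itself.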

5.3.2.3. Which method should be used to download the data?#

It depends on the application. If you are not familiar with API calls and do not need to download data frequently, the Data Export Tool may be easier. If up-to-date data is essential to the project, the API is the better choice. Since the goal of this course is to monitor crises, the API makes the most sense here. For example, a crisis-monitoring dashboard could refresh its ACLED data daily through the API, with no one having to run the download manually.
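
A daily refresh usually has to combine newly downloaded rows with the rows already stored. The following is a minimal sketch, assuming that `event_id_cnty` identifies an event and that a larger `timestamp` marks a more recent version of a record. Both columns appear in the data above, but this reading of them is an assumption, and `merge_updates` is a hypothetical helper name.

```python
import pandas as pd


def merge_updates(existing, fresh):
    """Hypothetical helper: append new rows and keep the latest version of each event."""
    combined = pd.concat([existing, fresh], ignore_index=True)
    # Sort so that, for each event, the row with the largest timestamp comes last
    combined = combined.sort_values("timestamp", kind="stable")
    # Keep only the last (most recent) row per event identifier
    deduplicated = combined.drop_duplicates(subset="event_id_cnty", keep="last")
    return deduplicated.reset_index(drop=True)


# Toy data, for illustration only
stored = pd.DataFrame({
    "event_id_cnty": ["SYR1", "SYR2"],
    "timestamp": [100, 100],
    "fatalities": [0, 3],
})
downloaded = pd.DataFrame({
    "event_id_cnty": ["SYR2", "SYR3"],
    "timestamp": [200, 200],
    "fatalities": [4, 1],
})

refreshed = merge_updates(stored, downloaded)
print(refreshed)
```

In this toy case the revised record for `SYR2` replaces the stored one, and `SYR3` is added as a new event.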

5.4. Data Analysis#

First, we will open the data and visualize the information.

5.4.1. Load the data and necessary functions#

import pandas as pd
import geopandas as gpd
from shapely.geometry import Point

import folium
from folium.plugins import HeatMap
from datetime import datetime

import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)
pd.set_option('display.max_columns', 100)

def convert_to_gdf(df, lat_col, lon_col, crs = "EPSG:4326"):
    '''Take a dataframe that has latitude and longitude columns and transform it into a geodataframe'''
    geometry = [Point(xy) for xy in zip(df[lon_col], df[lat_col])]
    gdf = gpd.GeoDataFrame(df, crs=crs, geometry=geometry)
    return gdf

# Define the color palette (make sure this has enough colors for the categories you will be using)
color_palette = ["#1f77b4", "#ff7f0e", "#2ca02c", "#d62728", "#9467bd", "#8c564b"]

import bokeh
from bokeh.layouts import column
from bokeh.models import ColumnDataSource, Legend, TabPanel, Tabs
from bokeh.plotting import figure, show
from bokeh.io import output_notebook
from bokeh.core.validation import silence
from bokeh.core.validation.warnings import EMPTY_LAYOUT, MISSING_RENDERERS

# Silence validation warnings, e.g. from the renderer-less figures used below as title and source labels
silence(EMPTY_LAYOUT, True)
silence(MISSING_RENDERERS, True)

def get_line_plot(dataframe, title, source, x_axis_label, y_axis_label, subtitle=None, measure="measure", 
                  category="category", color_code=None):
    # Initialize the figure
    p = figure(x_axis_type="datetime", width=1000, height=400, toolbar_location="above", 
               x_axis_label=x_axis_label, y_axis_label=y_axis_label)
    p.add_layout(Legend(), "right")

    # Loop through each unique category and plot the line
    for idx, unique_category in enumerate(dataframe[category].unique()):
        # Filter the DataFrame for each category
        category_df = dataframe[dataframe[category] == unique_category].copy()
        category_source = ColumnDataSource(category_df)
        # Use the caller's color if given; otherwise cycle through the palette
        line_color = color_code or color_palette[idx % len(color_palette)]
        
        # Plot the line
        p.line(
            x="event_date",
            y=measure,
            source=category_source,
            color=line_color,
            legend_label=str(unique_category)
        )

    # Configure legend
    p.legend.click_policy = "hide" # What happens when clicking in the legend category
    p.legend.location = "top_right"

    # Set the subtitle as the title of the plot if it exists
    if subtitle:
        p.title.text = subtitle

    # Create title and subtitle text using separate figures
    title_fig = figure(title=title, toolbar_location=None, width=800, height=40)
    title_fig.title.align = "left"
    title_fig.title.text_font_size = "20pt"
    title_fig.border_fill_alpha = 0
    title_fig.outline_line_color = None

    sub_title_fig = figure(title=source, toolbar_location=None, width=800, height=40)
    sub_title_fig.title.align = "left"
    sub_title_fig.title.text_font_size = "10pt"
    sub_title_fig.title.text_font_style = "normal"
    sub_title_fig.border_fill_alpha = 0
    sub_title_fig.outline_line_color = None

    # Combine the title, plot, and subtitle into a single layout
    layout = column(title_fig, p, sub_title_fig)

    return layout

# Data manipulation
data['year'] = data['year'].astype('int')
data['fatalities'] = data['fatalities'].astype('int')
# Restrict the data to the time span needed; copy so later assignments do not act on a view
data = data[data['year'].isin([2022, 2023, 2024])].copy()
# Parse the string dates into datetime values
data['event_date'] = pd.to_datetime(data['event_date'], format='%Y-%m-%d')

data.head(3)

	event_id_cnty	event_date	year	time_precision	disorder_type	event_type	sub_event_type	actor1	inter1	actor2	inter2	interaction	civilian_targeting	iso	region	country	admin1	admin2	admin3	location	latitude	longitude	geo_precision	source	source_scale	notes	fatalities	timestamp
0	SYR128321	2024-05-20	2024	1	Political violence	Violence against civilians	Abduction/forced disappearance	QSD: Syrian Democratic Forces	Rebel group	Civilians (Syria)	Civilians	Rebel group-Civilians	Civilian targeting	760	Middle East	Syria	Deir ez Zor	Al Mayadin	Thiban	Al-Hawayij	35.0574	40.4924	1	Facebook; SHAAM	New media-National	On 20 May 2024, QSD forces detained four civil...	0	1720477752
1	SYR128365	2024-05-20	2024	1	Political violence	Battles	Armed clash	Unidentified Armed Group (Syria)	Political militia	Military Forces of Syria (2000-)	State forces	State forces-Political militia		760	Middle East	Syria	Homs	Tadmor	Tadmor	Talilah	34.5258	38.5272	2	Facebook	New media	On 20 May 2024, an unidentified armed group at...	3	1720477752
2	SYR128369	2024-05-20	2024	1	Strategic developments	Strategic developments	Arrests	Police Forces of Syria (2000-)	State forces	Civilians (Syria)	Civilians	State forces-Civilians		760	Middle East	Syria	Rural Damascus	Az Zabdani	Madaya	Madaya	33.6900	36.0963	1	SOHR	Other	On 20 May 2024, Syrian police forces arrested ...	0	1720477752

5.4.2. Trends - Absolute#

The goal of this section is to create a line chart of the absolute number of conflict events by type over time (aggregated monthly), with one tab per first-level administrative division (admin1).

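# Note: with freq="ME" and closed='left', each bin starts on the last day of the previous month (its label here)
# and excludes the last day of the current month, whose events fall into the next bin
events_by_type_month = data.groupby([pd.Grouper(key="event_date", freq="ME", label = 'left', closed = 'left')] + ['event_type','admin1'])["fatalities"].agg(["sum", "count"]).reset_index()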
events_by_type_month.rename(columns={"sum": "fatalities", "count": "nrEvents"}, inplace=True)
events_by_type_month.head()

	event_date	event_type	admin1	fatalities	nrEvents
0	2021-12-31	Battles	Al Hasakeh	203	24
1	2021-12-31	Battles	Aleppo	8	32
2	2021-12-31	Battles	Ar Raqqa	52	22
3	2021-12-31	Battles	As Sweida	1	2
4	2021-12-31	Battles	Dara	9	12
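
To check how the monthly bins are formed, it can help to run the same kind of aggregation on a few hand-made rows. The sketch below uses toy data and `freq="MS"` (month start), an alternative to the `"ME"` setting used above, so that each bin covers one calendar month and is labeled with its first day. It is an illustration under those assumptions, not the notebook's own configuration.

```python
import pandas as pd

# Toy events: two in January (one on the last day of the month) and one in February
toy = pd.DataFrame({
    "event_date": pd.to_datetime(["2022-01-15", "2022-01-31", "2022-02-10"]),
    "event_type": ["Battles", "Battles", "Battles"],
    "fatalities": [2, 1, 0],
})

# Group by calendar month (bins start on the first day of each month) and event type
monthly = (
    toy.groupby([pd.Grouper(key="event_date", freq="MS"), "event_type"])["fatalities"]
    .agg(fatalities="sum", nrEvents="count")
    .reset_index()
)
print(monthly)
```

Here both January events, including the one dated 31 January, are counted in the bin labeled 2022-01-01.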

from bokeh.io import output_notebook
from bokeh.resources import INLINE

# Load BokehJS from inline resources so the plots render inside the notebook
output_notebook(INLINE)
bokeh.core.validation.silence(EMPTY_LAYOUT, True)
bokeh.core.validation.silence(MISSING_RENDERERS, True)

tabs = []
events_by_type_month = events_by_type_month.sort_values(by=['admin1', 'event_date'], ascending=True)
for admin in events_by_type_month.admin1.unique():
    plot_data = events_by_type_month[events_by_type_month['admin1']==admin]
    tabs.append(
        TabPanel(
            child=get_line_plot(
                plot_data,
                "Monthly Trend in Number of Events for {}".format(admin),
                "Source: ACLED",
                "date",
                "number of events",
                subtitle="",
                category="event_type",
                measure='nrEvents',
                color_code=None,
            ),
            title=admin, # Keep the region name's casing; capitalize() would turn "Al Hasakeh" into "Al hasakeh"
        )
    )
from bokeh.io import output_file

tabs = Tabs(tabs=tabs, sizing_mode="scale_both")
output_file("debug_tabs.html")
show(tabs)


5.4.3. Trends - Percentage change#

events_by_type_month = data.groupby([pd.Grouper(key="event_date", freq="ME", label = 'right', closed = 'left')] + ['event_type','admin1'])["fatalities"].agg(["sum", "count"]).reset_index()
events_by_type_month.rename(columns={"sum": "fatalities", "count": "nrEvents"}, inplace=True)
events_by_type_month.head()

	event_date	event_type	admin1	fatalities	nrEvents
0	2022-01-31	Battles	Al Hasakeh	203	24
1	2022-01-31	Battles	Aleppo	8	32
2	2022-01-31	Battles	Ar Raqqa	52	22
3	2022-01-31	Battles	As Sweida	1	2
4	2022-01-31	Battles	Dara	9	12

# Add the number of events from the same month of the previous year, if available
events_by_type_month['period'] = events_by_type_month['event_date'].apply(lambda x: pd.Period(year=x.year, month=x.month, freq='M'))
events_by_type_month['last_year'] = events_by_type_month['period'] - 12
events_by_type_month.set_index(['last_year', 'event_type', 'admin1'], inplace = True)
events_by_type_month['nrEvents_last_year'] = (events_by_type_month.copy().reset_index()).set_index(['period', 'event_type', 'admin1'])['nrEvents']
events_by_type_month.reset_index(inplace = True)
# Change relative to the previous year, as a fraction: (current - previous) / previous.
# Rows without a previous-year value stay NaN, because the change is undefined without a baseline.
events_by_type_month['percentage_change'] = (events_by_type_month['nrEvents']-events_by_type_month['nrEvents_last_year'])/events_by_type_month['nrEvents_last_year']
events_by_type_month.head()

	last_year	event_type	admin1	event_date	fatalities	nrEvents	period	nrEvents_last_year	percentage_change
0	2021-01	Battles	Al Hasakeh	2022-01-31	203	24	2022-01	NaN	NaN
1	2021-01	Battles	Aleppo	2022-01-31	8	32	2022-01	NaN	NaN
2	2021-01	Battles	Ar Raqqa	2022-01-31	52	22	2022-01	NaN	NaN
3	2021-01	Battles	As Sweida	2022-01-31	1	2	2022-01	NaN	NaN
4	2021-01	Battles	Dara	2022-01-31	9	12	2022-01	NaN	NaN
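
The year-over-year comparison can be checked on a small example. This is a minimal sketch with made-up counts, assuming the usual definition of relative change, (current - previous) / previous. The toy `year` and `month` columns and the variable names are illustrative only.

```python
import pandas as pd

# Made-up monthly event counts for two consecutive years
counts = pd.DataFrame({
    "year": [2022, 2022, 2023, 2023],
    "month": [1, 2, 1, 2],
    "nrEvents": [20, 10, 30, 5],
})

# Move each row one year forward so it lines up with the same month of the following year
previous = counts.assign(year=counts["year"] + 1).rename(
    columns={"nrEvents": "nrEvents_last_year"}
)

# A left merge keeps every current row; months without a baseline get NaN
yoy = counts.merge(previous, on=["year", "month"], how="left")

# Relative change with respect to the same month of the previous year
yoy["change"] = (yoy["nrEvents"] - yoy["nrEvents_last_year"]) / yoy["nrEvents_last_year"]
print(yoy)
```

With these numbers, January 2023 rises from 20 to 30 events (a change of 0.5, or +50%) and February 2023 falls from 10 to 5 (a change of -0.5, or -50%), while the 2022 rows have no baseline and remain NaN.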

# Keep only the periods from January 2023 onward, which have a previous-year baseline in the data
events_by_type_month = events_by_type_month[events_by_type_month['period']>=pd.Period(year=2023, month=1, freq='M')].copy()

output_notebook()
bokeh.core.validation.silence(EMPTY_LAYOUT, True)
bokeh.core.validation.silence(MISSING_RENDERERS, True)

tabs = []
events_by_type_month = events_by_type_month.sort_values(by=['admin1', 'event_date'], ascending=True)
for admin in events_by_type_month.admin1.unique():
    plot_data = events_by_type_month[events_by_type_month['admin1']==admin]
    tabs.append(
        TabPanel(
            child=get_line_plot(
                plot_data,
                "Percentage change with respect to previous year for {}".format(admin),
                "Source: ACLED",
                x_axis_label = 'date',
                y_axis_label = 'Percentage change in number of events',
                subtitle="",
                category="event_type",
                measure='percentage_change',
                color_code=None,
            ),
            title=admin, # Keep the region name's casing; capitalize() would turn "Al Hasakeh" into "Al hasakeh"
        )
    )

tabs = Tabs(tabs=tabs, sizing_mode="scale_both")
show(tabs, warn_on_missing_glyphs=False)