Part 1 - Query the API: indicators and locations

R

Date: 10/2/2019
Updated: 11/16/2020

Data source
Developer Information
GitHub

This is first part of a two-part series on accessing the International Debt Statistics (IDS) database through the World Bank Data API. In this part, we will query the World Bank Data API to retrieve indicator names and location codes.

The following code in this guide will show step-by-step how to:

  1. Setup up your environment with the needed R packages
  2. Create Indicator API Queries
  3. Create Location API Queries (for both debtor and creditor locations)

1. Setup

To start, make sure you have the following packages installed on your machine. To install an R package, type install.packages(“wbstats”) with the correct package name into R. You can also visit each of the linked packages for reference.

Then, open up your preferred mode of writing R, like R Studio. Now follow the rest of the steps below to query the World Bank Data API to find your indicator and location codes.

library(wbstats)
library(httr)
library(jsonlite)
library(tidyverse)

2. Indicator API Queries

To get a data series from the World Bank Data’s API, you first need to use an “indicator code.” For example, the indicator code for long-term external debt stock is “DT.DOD.DLXF.CD.” These indicator codes can be found in the World Bank Data Catalog, but you can also use the API to explore and select indicators. In this section, we will guide you through going from a debt related idea to using the World Bank Data API to select an indictor code.

More information on indicator API queries can also be found through the Data Help Desk.

Getting the Source ID for International Debt Statistics

To find the indicator code, we will first need to specify the API source. The World Bank Data API has numerous sources, including one specific to International Debt Statistics.

The following GET request will get us every source from the World Bank Data API. However, the request is returned in a json format that is difficult to read as is, so we will include some code to parse through it. Then we can see each source name and ID available in the World Bank Data API to find IDS.

# Get all the source names from the World Bank API
sourceRequest <- GET(url = "http://api.worldbank.org/v2/sources?per_page=100&format=json")
sourceResponse <- content(sourceRequest, as = "text", encoding = "UTF-8")

# Parse the JSON content and convert it to a data frame.
sourceJSON <- fromJSON(sourceResponse, flatten = TRUE) %>%
  data.frame()

# Print the data frame and review the source names and ids
cols <- c("id","name")
#sourceJSON[,cols] # to view all sources, remove the # at the beginning of the line
sourceJSON[1:10,cols]
##    id                              name
## 1   1                    Doing Business
## 2   2      World Development Indicators
## 3   3   Worldwide Governance Indicators
## 4   5 Subnational Malnutrition Database
## 5   6     International Debt Statistics
## 6  11     Africa Development Indicators
## 7  12              Education Statistics
## 8  13                Enterprise Surveys
## 9  14                 Gender Statistics
## 10 15           Global Economic Monitor

From the above result, you can see that “International Debt Statistics” has a source ID of 6. We will add that source ID to the end of our API queries.

Getting the Indicator code

Now that we have the source ID for International Debt Statistics. we can make another request to the World Bank API to receive all of the indicator names and codes associated with that source.

In this API response, the default number of results on each page is 50, so we will set the “per_page” parameter to 500. This will allow us to view all the results with one query, instead of having to request multiple pages. Then we can parse through the result to explore the different indicator Names and corresponding IDs.

# make request to World Bank API
indicatorRequest <- GET(url = "http://api.worldbank.org/v2/indicator?per_page=500&format=json&source=6")
indicatorResponse <- content(indicatorRequest, as = "text", encoding = "UTF-8")

# Parse the JSON content and convert it to a data frame.
indicatorsJSON <- fromJSON(indicatorResponse, flatten = TRUE) %>%
  data.frame()

# Print and review the indicator names and ids from the dataframe
cols <- c("id","name")
#indicatorsJSON[,cols] # to view all the indicator names, remove the # at the beginning of this line. Please note that there are over 500 indicators.

The result gives us all of the External Debt indicators and their codes. You can also view the IDS indicators and codes in their hierarchical order on our data tables. The Analytical view shows the select indicators from the IDS publication and the Standard view shows all available indicators.

The indicator and code we want is “DT.DOD.DLXF.CD External debt stocks, long-term (DOD, current US$).” Before using this data, we need to understand its full definition. You can use an API query to get that information as well.

# use the indicator code to define the "indicator" variable
indicator <- "DT.DOD.DLXF.CD"

# use the above indicator code to find associate "sourceNote" or definition
definition <- which(indicatorsJSON$id == indicator)
print(indicatorsJSON$sourceNote[definition])
## [1] "Long-term debt is debt that has an original or extended maturity of more than one year. It has three components: public, publicly guaranteed, and private nonguaranteed debt. Data are in current U.S. dollars."

By using the API queries above, we were able sort through the World Bank Data API sources to find IDS, find “External Debt,” explore the corresponding indicators, and then select one indicator of interest and find its ID and definition.

3. Location API Queries*

*The 11/16/20 update to this guide added how to also get the full list of creditor locations

Now that we have the indicator code, we need to select debtor and creditor locations. To select a location by country, region, or income level category you will need to know its 3 letter code. This section will show you how to use similar API queries to pull the location names and codes.

For more information on location API queries visit the Data Help Desk.

When pulling the list of debtor locations, I will set the per_page result to 300.

# Make request to World Bank API
dlocationRequest <- GET(url = "http://api.worldbank.org/v2/sources/6/country?per_page=300&format=JSON")
dlocationResponse <- content(dlocationRequest, as = "text", encoding = "UTF-8")

# Parse the JSON content
dlocationsJSON <- fromJSON(dlocationResponse, flatten = TRUE)

# Create a dataframe with the location codes and names
dlocationList <- dlocationsJSON[["source"]][["concept"]][[1]][["variable"]]%>%
  data.frame()

#We can view the first 25 entries below
dlocationList[1:25,]
##     id                    value
## 1  AFG              Afghanistan
## 2  AGO                   Angola
## 3  ALB                  Albania
## 4  ARG                Argentina
## 5  ARM                  Armenia
## 6  AZE               Azerbaijan
## 7  BDI                  Burundi
## 8  BEN                    Benin
## 9  BFA             Burkina Faso
## 10 BGD               Bangladesh
## 11 BGR                 Bulgaria
## 12 BIH   Bosnia and Herzegovina
## 13 BLR                  Belarus
## 14 BLZ                   Belize
## 15 BOL                  Bolivia
## 16 BRA                   Brazil
## 17 BTN                   Bhutan
## 18 BWA                 Botswana
## 19 CAF Central African Republic
## 20 CHN                    China
## 21 CIV            Cote d'Ivoire
## 22 CMR                 Cameroon
## 23 COD         Congo, Dem. Rep.
## 24 COG              Congo, Rep.
## 25 COL                 Colombia

IDS 2021 added a 4th dimension to the data - creditor country. You can read more about that on the World Bank Data Blog. To get the full list of the creditor codes and names, you use a query like as the one above, but in the get request you replace “country” with “counterpart-area.”

# Make request to World Bank API
clocationRequest <- GET(url = "http://api.worldbank.org/v2/sources/6/counterpart-area?per_page=300&format=JSON")
clocationResponse <- content(clocationRequest, as = "text", encoding = "UTF-8")

# Parse the JSON content and convert it to a data frame.
clocationsJSON <- fromJSON(clocationResponse, flatten = TRUE)

# Create a dataframe with the location codes and names
clocationList <- clocationsJSON[["source"]][["concept"]][[1]][["variable"]]%>%
  data.frame()

#We can view the first 25 entries below
clocationList[1:25,]
##     id                value
## 1  001              Austria
## 2  002              Belgium
## 3  003              Denmark
## 4  004               France
## 5  005 Germany, Fed.Rep. Of
## 6  006                Italy
## 7  007          Netherlands
## 8  008               Norway
## 9  009             Portugal
## 10 010               Sweden
## 11 011          Switzerland
## 12 012       United Kingdom
## 13 018              Finland
## 14 020              Iceland
## 15 021              Ireland
## 16 022           Luxembourg
## 17 025              Andorra
## 18 026               Monaco
## 19 030               Cyprus
## 20 035            Gibraltar
## 21 040               Greece
## 22 044               Kosovo
## 23 045                Malta
## 24 050                Spain
## 25 055               Turkey

The results from these two queries are also saved as location-codes csv*.

*The location-codes csv was created using the API queries: http://api.worldbank.org/v2/sources/6/country?per_page=500&format=JSON & http://api.worldbank.org/v2/sources/6/counterpart-area?per_page=500&format=JSON