Getting Started with the World Development Indicators (WDI) data
This notebook walks through the steps to get started with the WDI data, which we retrieve through the WDI API.
We will be using the data/ folder to store the data. You can change the location of the data folder by changing the data_dir parameter in the code below. Make sure to refer to the correct location of the data folder in the rest of the notebook.
After the data is collected, we will store it in a SQLite database. With the data in a database, we can then use LLM4Data to query the data using natural language.
Downloading the data
If the data is not yet available in the data folder, we will download the data from the WDI API.
poetry run python -m scripts.scrapers.indicators.wdi --data_dir=data/indicators/wdi --force
This will scrape the data from the WDI API and store it in the data/indicators/wdi folder. Each indicator will be stored in a separate file.
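Before loading the files into a database, it can help to confirm that the download produced what you expect. The following is a minimal sketch, assuming the scraper writes one *.json file per indicator into the data folder (the later wdi_jsons_dir parameter suggests JSON, but the file layout is not documented here). The helper name summarize_json_files is hypothetical and not part of LLM4Data.

```python
import json
from pathlib import Path


def summarize_json_files(data_dir):
    """Return (file name, top-level JSON type, item count) for each *.json file.

    Hypothetical helper for a quick sanity check; it makes no assumption
    about the internal structure of the downloaded indicator files.
    """
    directory = Path(data_dir)
    if not directory.is_dir():
        return []  # nothing has been downloaded to this location yet

    summary = []
    for path in sorted(directory.glob("*.json")):
        with path.open(encoding="utf-8") as f:
            payload = json.load(f)
        # len() covers both JSON objects (dict) and arrays (list);
        # scalars are counted as a single item.
        size = len(payload) if isinstance(payload, (dict, list)) else 1
        summary.append((path.name, type(payload).__name__, size))
    return summary


if __name__ == "__main__":
    # Assumed location: the same folder passed to --data_dir above.
    for name, kind, size in summarize_json_files("data/indicators/wdi"):
        print(f"{name}: top-level {kind} with {size} entries")
```

If the list comes back empty, the download step has probably not run yet, or data_dir points somewhere else.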
Storing the data to a database
After the data is downloaded, we will store it in a database. We will use SQLite for this example. You can use other databases as well, as long as you have the appropriate drivers installed.
Please review the "Setting up the environment" section for instructions on how to update the relevant environment variables.
You can then run the following command to store the data in a database:
poetry run python -m scripts.scrapers.indicators.wdi_db --wdi_jsons_dir=data/indicators/wdi
This will create a database file in the data/indicators/wdi folder. The database file will be named based on the information you specified in the environment variables.
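To check that the database was populated, you can inspect it with Python's built-in sqlite3 module. This is a minimal sketch, not part of LLM4Data: the helper name table_row_counts and the file name wdi.db are hypothetical, since the actual file name depends on your environment variables and the table names depend on the loader.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def table_row_counts(db_path):
    """Return {table name: row count} for every table in a SQLite file."""
    # closing() makes sure the connection is released when the block exits.
    with closing(sqlite3.connect(db_path)) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier, doubling any embedded double quotes.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        return counts


if __name__ == "__main__":
    # Hypothetical path: replace with the file name set in your environment.
    db_file = Path("data/indicators/wdi/wdi.db")
    if db_file.is_file():
        for table, rows in table_row_counts(db_file).items():
            print(f"{table}: {rows} rows")
    else:
        print(f"No database found at {db_file}")
```

The existence check matters because sqlite3.connect would otherwise create a new, empty database file at a path that does not exist yet.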
Alternatively, you can run the cells below to store the data in a database.
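from llm4data.llm.indicators.wdi_sql import WDISQL

# Folder containing the downloaded WDI files (the --data_dir used above).
wdi_jsons_dir = "data/indicators/wdi"

# Store the downloaded data in the database.
WDISQL.load_wdi_jsons(wdi_jsons_dir)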