Stage 2 and 3: Define Data Needs
Once information needs have been defined, we can decide which data items we need. There have been several attempts to compile a complete list of possible data layers for use in humanitarian situations: Nuala Cowan (2013; HGF model), HDX (HDX data grids), van den Homberg et al. (2018), scientific lead of Red Cross NL 510, an interviewee from the World Bank, the data models in use by UNHCR for refugee sites, and the taxonomy from ACAPS made for the Coordinated Data Scramble (CDS). We merged these assessments into a single list by collating and reducing them; the result is given in Annex 2 - Geospatial Data Items. The unified list contains 233 data items, and for each data item we identified a potential primary source. This identification is based on the data sources suggested by ACAPS in its taxonomy of data for the CDS, and on our review of the geospatial data items we found on HDX. We list an organization as a potential source when it either had contributed many layers of the type in question to HDX, or stated in its self-description on HDX that it holds and curates such data.
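The collation and reduction step can be illustrated with a small sketch. It is not the procedure used to build Annex 2; it only shows one plausible way to merge several candidate lists while remembering which list named each item. The list names and data items below are hypothetical, and the matching rule (case and whitespace normalization) is an assumption; real lists would also need manual review of synonyms.

```python
from collections import defaultdict


def normalize(name: str) -> str:
    """Lower-case a data item name and collapse whitespace so near-duplicates match."""
    return " ".join(name.lower().split())


def collate(lists_by_source: dict) -> dict:
    """Merge several lists into one, mapping each unique item to the lists that name it."""
    unified = defaultdict(set)
    for source, items in lists_by_source.items():
        for item in items:
            unified[normalize(item)].add(source)
    return dict(unified)


# Hypothetical sample entries, for illustration only.
sample_lists = {
    "list_a": ["Administrative boundaries", "Health facilities", "Roads"],
    "list_b": ["health  facilities", "Population density"],
    "list_c": ["ROADS", "Administrative Boundaries", "Schools"],
}

unified = collate(sample_lists)
for item in sorted(unified):
    print(f"{item}: named in {sorted(unified[item])}")
```

Keeping the set of originating lists per item makes it easy to see which data items several assessments agree on and which appear in only one.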
Stage 3: Identify Data Sources
The third step in the process is data assembly and gap analysis: check each layer for geographic completeness, geometric accuracy, attribute correctness, and timeliness. A useful protocol for this process has been provided by MapAction and can be accessed here - protocol. In many humanitarian and development projects, the first source for data is OpenStreetMap (OSM); Table 1 provides a breakdown of what data can be found in OSM, and what can be mapped from satellite imagery into OSM.
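Three of the four checks named above can be sketched in a few lines. This is a minimal illustration, not the MapAction protocol: the feature records, the `admin_unit` field, the required attributes, and the 180-day freshness threshold are all hypothetical assumptions. Geometric accuracy is left out because it needs a trusted reference layer to compare against.

```python
from datetime import date


def geographic_completeness(features, expected_units):
    """Share of expected administrative units that contain at least one feature."""
    covered = {f["admin_unit"] for f in features} & set(expected_units)
    return len(covered) / len(expected_units)


def attribute_completeness(features, required):
    """Share of features whose required attributes are all present and non-empty."""
    if not features:
        return 0.0
    ok = sum(all(f.get(key) not in (None, "") for key in required) for f in features)
    return ok / len(features)


def is_timely(last_updated, today, max_age_days):
    """True if the layer was updated within the allowed number of days."""
    return (today - last_updated).days <= max_age_days


# Hypothetical health facility layer, for illustration only.
layer = [
    {"admin_unit": "A", "name": "Clinic 1", "type": "clinic"},
    {"admin_unit": "A", "name": "", "type": "hospital"},
    {"admin_unit": "B", "name": "Clinic 2", "type": "clinic"},
    {"admin_unit": "C", "name": "Hospital 3", "type": "hospital"},
]

coverage = geographic_completeness(layer, ["A", "B", "C", "D"])
attributes = attribute_completeness(layer, ["name", "type"])
timely = is_timely(date(2023, 1, 15), date(2024, 1, 15), max_age_days=180)

print(f"geographic completeness: {coverage:.2f}")
print(f"attribute completeness:  {attributes:.2f}")
print(f"timely: {timely}")
```

Reporting each check as a separate score, instead of a single pass or fail, keeps the gap analysis transparent: a layer may cover the whole area of interest and still be too old to use.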
While OSM is an important source of geospatial data, and should be a primary repository for disseminating data, many datasets are not suitable for OSM, or are of low quality. For such data, Annex 2 lists potential resources. The hierarchy of available geospatial resources varies by project, agency, and location. However, we propose three principal sources for geospatial data:
Authoritative government/local sources: whenever possible, working with local government and local experts is the most desirable solution. Such data can be found by searching existing open data portals, or by leveraging existing relationships and connections.
OpenStreetMap: get all data you might need from OpenStreetMap. You can use the data mirrored from OSM into HDX, the HOT OSM export tool, or other providers like Geofabrik.de.
Alternative sources: check other data portals, such as the World Bank’s Development Data Hub or HDX. Annex 2 gives a hint as to where such layers might be found, but further searching is likely necessary. An extensive list of online repositories for geospatial data can be found here. The list was initially compiled by Payne et al. (2012) and updated until 2019.
Following the data survey, you should have a set of data that needs to be assessed, and a list of datasets that could not be located.
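The outcome of the data survey can be recorded with a short sketch like the one below. It treats the order of the three sources above as a priority order, which is an assumption and not something the text prescribes. The catalog contents and data item names are hypothetical.

```python
# Assumed priority order, following the order of the three sources above.
SOURCE_PRIORITY = ["authoritative", "osm", "alternative"]


def survey(needed_items, catalog):
    """Split needed items into those located (with their best source) and those missing."""
    located, missing = {}, []
    for item in needed_items:
        source = next(
            (s for s in SOURCE_PRIORITY if item in catalog.get(s, set())), None
        )
        if source is None:
            missing.append(item)
        else:
            located[item] = source
    return located, missing


# Hypothetical survey results, for illustration only.
needed = ["admin boundaries", "roads", "health facilities", "flood extent"]
catalog = {
    "authoritative": {"admin boundaries"},
    "osm": {"admin boundaries", "roads", "health facilities"},
    "alternative": {"health facilities"},
}

located, missing = survey(needed, catalog)
print("to assess:", located)
print("not located:", missing)
```

The `located` mapping is the set of data that still needs to be assessed, and `missing` is the list of gaps to carry into the next stage.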