Tools & Materials
Theory of Change
Applications
A theory of change is a detailed description of the mechanisms through which a change is expected to occur in a particular situation. In the context of data in development, the focus is on identifying the insights that are most likely to lead to a desired result or change in behavior. This type of thinking helps identify and prioritize the data, systems, programs and other resources that are required to drive the most impactful change.
Prototyping is a great way to validate a Theory of Change. The Data in Action canvas is used to develop ideas and conceptual prototypes that put data into action. It is most useful when used by teams composed of both domain and data specialists. The iterative process begins with a question or problem, and works through how data can be transformed into insights, results and outcomes.
Worksheet
Operation and Business Model
Applications
An operation and business model is critical to keeping data projects organized. It assists analysts in aligning activities and modeling potential trade-offs. It also settles questions of accountability by making clear what a project is trying to achieve, who owns which activities, and how to engage various stakeholders.
Worksheet
Action Plan
Applications
Data innovation projects benefit greatly from multi-stakeholder perspectives, but such projects can be complex to manage. For this reason, the World Bank's data innovation program has used highly structured roadmaps and action plans during the ideation and planning stages (World Bank approved projects use formal documentation). These tools help ensure that data products are informed by diverse perspectives from governments and by empathy for the people whose lives can be improved by data innovation. This is particularly important in development, where it is often difficult to clearly grasp on-the-ground realities, and the final product is rarely what was envisioned at the beginning of the project.
Worksheet
Data Project Scoping
Applications
The early and proper scoping of data projects is critical to their success. This includes clearly defining goals; identifying actions the project should inform; outlining data requirements; and sizing technical scope and effort. Over the years, we've often referenced a guide from the Center for Data Science & Public Policy and Data Science for Social Good Fellowship program for this work, which has been field tested and refined over hundreds of applied projects.
The suggested scoping process is iterative: the scope is refined both during the scoping process and during the project itself:
- Step 1: Goals – Define the goal(s) of the project
- Step 2: Actions – What actions/interventions do you have that this project will inform?
- Step 3: Data – What data do you have access to internally? What data do you need? What can you augment from external and/or public sources?
- Step 4: Analysis – What analysis needs to be done? Does it involve description, detection, prediction, or behavior change? How will the analysis be validated?
- Ethical Considerations: How have you thought through privacy, transparency, discrimination/equity, and accountability issues around this project? See the Ethics section for more on this topic.
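One way to keep track of the steps above is to record the answers in a small structure and check which steps are still unanswered. The sketch below is only an illustration: the class name `ProjectScope`, its field names, and the example entries are hypothetical and do not come from the scoping guide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProjectScope:
    """Hypothetical container with one list of notes per scoping step."""
    goals: List[str] = field(default_factory=list)
    actions: List[str] = field(default_factory=list)
    data_sources: List[str] = field(default_factory=list)
    analyses: List[str] = field(default_factory=list)
    ethical_notes: List[str] = field(default_factory=list)

    def open_items(self) -> List[str]:
        """Return the names of the steps that have no entries yet."""
        return [name for name, entries in vars(self).items() if not entries]


# Made-up example: goals and actions are drafted, the rest is still open.
scope = ProjectScope(
    goals=["Reduce the share of overdue facility inspections"],
    actions=["Prioritize which facilities inspectors visit first"],
)
print(scope.open_items())
```

Because scoping is iterative, a structure like this would be revisited and refilled as the project evolves rather than completed once.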
Below, we include a worksheet to work through the more detailed questions of this framework.
Worksheet
Data Innovation Cycle
Applications
In large organizations, smaller engagements at the unit or practice level help promote change. Activities that reinforce one another, accelerating knowledge flows and delivering a critical mass of projects, knowledge, and talent within the practice or unit, are more likely to build and sustain capabilities in big data.
Practitioners who pursue large data development programs typically follow three phases.
| 1. Discovery | 2. Incubation | 3. Scale |
|---|---|---|
Worksheet
Big Data Sources
Big Data Sources and Applications
In the context of international development, big data increases the value of survey data, and vice versa. This relationship underpins the opportunity to create new forms of value by carefully combining traditional and new data into data products that are more robust and higher resolution than either traditional or new data can deliver on their own. This is possible because each type of data has relative strengths and weaknesses.
Big data is used in a wide range of applications in every sector in development (urban, environment, transport, health, governance, etc.). The 3I's Big Data Systematic Map illustrates the scope and scale of how big data is used in impact evaluations, systematic reviews and development applications.
- Digital Exhaust – Passively collected transactional data from people's use of digital or government services mediated through mobile phones, purchases, web searches, online forms, social network interactions, etc.
- Digital Content – Web published content such as news, multi-media, digital reports, blogs, websites and community portals
- Sensing Machine Generated Data – Remotely sensed data from satellites, including optical imagery and other spectrum data used to detect changes in built environment, nighttime light, and environmental changes; in-situ cameras; in-situ sensors detecting ambient conditions (noise, humidity, temperature, movement, etc.)
Three basic approaches stand out among general applications of big data in development: (1) predicting or nowcasting development conditions, for example using news and social media to predict macro-conditions (GDP, employment); (2) measuring development conditions at higher spatial and temporal resolution, for example small area estimates of poverty ([Engstrom; Newhouse 2017](https://openknowledge.worldbank.org/handle/10986/29075)) and population; and (3) measuring human mobility using GPS data from smartphones, call detail records and other platforms ([Matekenya 2020](https://documents.worldbank.org/en/publication/documents-reports/documentdetail/224761611175801192/using-mobile-data-to-understand-urban-mobility-patterns-in-freetown-sierra-leone)). These general applications extend to data products that improve measurement for the SDGs and official statistics, that monitor projects and programs to inform decisions, and that are used within a single institution, such as a government, to streamline or improve services.
The future of data in development will include more hybrid data products that carefully combine multiple survey and new data sources. These hybrid products can produce more timely, accurate and granular insights into development conditions than survey-based or big data measurements alone.
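As a rough illustration of why a hybrid product can outperform either source alone, the sketch below combines a survey estimate and a big-data model estimate of the same quantity using precision (inverse-variance) weighting. This is a textbook technique chosen for illustration, not the method used in the studies cited above. It assumes both estimates are unbiased and independent, which often does not hold for big data sources. The function name and all numbers are hypothetical.

```python
def combine_estimates(survey_est, survey_var, model_est, model_var):
    """Precision-weighted average of two estimates of the same quantity.

    Assumes (for illustration only) that both estimates are unbiased and
    statistically independent. Returns the combined estimate and its variance.
    """
    survey_precision = 1.0 / survey_var
    model_precision = 1.0 / model_var
    total_precision = survey_precision + model_precision

    # The noisier source receives the smaller weight.
    w_survey = survey_precision / total_precision
    combined = w_survey * survey_est + (1.0 - w_survey) * model_est
    combined_var = 1.0 / total_precision
    return combined, combined_var


# Made-up poverty rates for one small area: a noisy survey estimate and a
# more precise satellite-based model estimate.
estimate, variance = combine_estimates(
    survey_est=0.30, survey_var=0.004,
    model_est=0.24, model_var=0.001,
)
print(round(estimate, 3), round(variance, 4))
```

Under these assumptions the combined variance is smaller than the variance of either input, which is the sense in which the combination is more robust than each source on its own.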
Worksheet
Ethics
Applications
As development practitioners, it's easy to get lost in the techniques and methods of our trade. In doing so, we can forget to ask important questions: Who will be affected by our work? How are we ensuring that by doing ‘good’ for one group, we are not inadvertently harming another? A good way to remain cognizant of these issues is to reflect on the implications of our work throughout the key stages of a project. Ethical considerations often get kicked to the sidelines in data science projects. However, if they are not properly incorporated at every stage, the project may not be implemented at all or, even worse, may be implemented with adverse consequences. By considering questions like those above, we can improve our chances of delivering projects with truly positive social impact. The worksheet below, which is based on work done at the Data Science for Social Good Fellowship, offers a helpful checklist to work through these questions.
New, promising technologies are emerging to help with ethical challenges in data science. For example, new privacy-preserving methods and secure data infrastructure are helping keep sensitive, personal data more secure. The most basic techniques in this space involve obfuscating individual records and aggregating data to a minimum sample size. The World Bank has also supported projects to develop Trusted Execution Environments, like the “open algorithms (OPAL)” approach, that bring algorithms to the data and use special software and hardware modules to restrict the scope of data processing. Another approach uses privacy-preserving methods such as Differential Privacy and Homomorphic Encryption, which protect privacy by allowing broad statistical information to be gathered without revealing the specifics of individual records. Using trusted methods to work with micro data and big data enables practitioners to see when big data is useful on its own, what biases and flaws may exist, and when to support hybrid data product development to overcome these flaws. The UN Handbook for Privacy-Preserving Techniques included below offers a more thorough discussion of these topics.
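Two of the techniques mentioned above can be sketched in a few lines of Python: aggregating to a minimum cell size, and the Laplace mechanism, a standard building block of differential privacy. This is a teaching sketch under stated assumptions, not a production implementation. The function names, the threshold of 10, the value of epsilon, and the sample records are all hypothetical, and real deployments should rely on a vetted privacy library rather than hand-rolled noise.

```python
import random
from collections import Counter


def suppressed_counts(records, key, min_cell_size=10):
    """Count records per group and withhold groups below the minimum size."""
    counts = Counter(record[key] for record in records)
    return {group: (n if n >= min_cell_size else None)
            for group, n in counts.items()}


def laplace_noise(scale, rng):
    """Draw Laplace(0, scale) noise as the difference of two exponentials."""
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def dp_count(true_count, epsilon, rng):
    """Laplace mechanism for a counting query, whose sensitivity is 1."""
    return true_count + laplace_noise(1.0 / epsilon, rng)


# Made-up records: district "A" has 12 people, district "B" only 3.
records = [{"district": "A"}] * 12 + [{"district": "B"}] * 3
safe_counts = suppressed_counts(records, "district", min_cell_size=10)
print(safe_counts)  # the count for district "B" is withheld (None)

rng = random.Random(0)  # seeded only to make the illustration repeatable
noisy_count = dp_count(true_count=12, epsilon=1.0, rng=rng)
print(noisy_count)
```

A smaller epsilon adds more noise and gives stronger privacy protection, at the cost of less accurate published statistics.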