DISCLAIMER: This is not the most recent version of the methodological handbook. You can find the most recent version here: https://worldbank.github.io/PIP-Methodology.

Chapter 2 Constructing welfare aggregates

Several different concepts of poverty exist, building on different notions of what a good life – or in the language of economists, welfare – entails. Poverty, in short, is the absence of welfare, and individuals are considered poor when their welfare levels fall short of a certain threshold. The Poverty and Inequality Platform (PIP) primarily utilizes a monetary measure of poverty. Monetary poverty is estimated from an aggregation of households’ income or from the monetary value of their consumption. We refer to such aggregates jointly as welfare aggregates. This section explains how welfare aggregates are constructed. For a more detailed exposition of the construction of welfare aggregates, see Deaton and Zaidi (2002).

2.1 Income or consumption

Monetary poverty estimates are based on either income or consumption aggregates. Consumption aggregates typically capture household expenditure on a set of items over a given period of time. These usually include purchased, own-produced, exchanged, and gifted food and non-food items (for example clothing, housing—including imputed rent—and the use value of durable consumer goods). Income aggregates capture the value of monetary inflow a household receives or earns over a given period of time. Household surveys usually provide information on labor income (salaries, own-business, and self-employment income), as well as non-labor income coming from pensions, subsidies, transfers, property income, scholarships, etc. Income aggregates in PIP aim to measure disposable income defined as the sum of labor and non-labor income (including transfers) less taxes and contributions. The exact definition and operationalization of income aggregates varies across different data sources.

Both income and consumption aggregates do not directly account for important non-monetary aspects of households’ welfare such as access to basic services, education, health-care, and infrastructure, which are better captured by other indicators or by measures of multidimensional poverty.

Both income and consumption approaches to measuring monetary poverty have advantages and disadvantages. Countries typically choose the concept that can be more accurately measured and that is more relevant to their context, while balancing concerns about respondent burden in surveys.

The consumption approach is arguably more directly connected to economic welfare. Yet, consumption aggregates require a wide range of questions, detailed price data, and often post-fieldwork adjustments. The design of consumption questionnaires varies widely and, as shown by numerous experiments, can have significant effects on final poverty estimates (see section on questionnaire design below).

Income aggregates, on the other hand, often rely on no more than a handful of questions and can, at times, be verified from other sources. Yet they are difficult to obtain when a large fraction of the population works in the informal sector or is self-employed, which is frequently the case in poorer economies. When households produce their own food with limited market interactions, it is harder to measure income than consumption. Income aggregates also suffer from the disadvantage that incomes might be very low—even negative—in a given period, whereas consumption is smoothed to safeguard against such shocks. Subsistence requires a minimum level of consumption, which is strictly above zero (World Bank 2018). PIP’s current practice is to drop observations with negative incomes, while zeros are included.

The differences between income and consumption matter for comparing trends and levels of poverty and inequality. Given that incomes can be very low or negative, poverty rates are typically higher when income is used rather than consumption. For a given poverty rate, poor households also tend to be further below the poverty line when income is used, as explained by the earlier point about very low incomes. Incomes are also more likely than consumption to be very high for a given year, which in conjunction with the very low values means that inequality often is higher when incomes are used rather than consumption.

2.2 Questionnaire design

Constructing a welfare aggregate, particularly a consumption aggregate, requires many specific questions to be asked in household surveys. There are many different ways to ask survey respondents about their consumption habits, and how one asks has a significant effect on how people respond. Consumption aggregates are sensitive to whether respondents recall consumption from memory or use consumption diaries, how many and how detailed consumption items are listed, and the time period interviewers ask respondents to recall. De Weerdt, Gibson, and Beegle (2020) provide a recent review of the impact these choices can have on the final poverty estimates.

To give one concrete example, the recall period affects reported consumption through two main channels: memory decay and telescoping. A longer recall period is better at encompassing expenditure on infrequently purchased items, but it can lead to underreporting if respondents forget about past purchases. Despite lower average consumption, measured poverty might be lower under the longer recall period because it captures the purchases of low frequency items of households in the lower parts of the distribution. Short recall periods can mitigate underreporting but can lead to telescoping, where respondents mistakenly report the consumption that took place outside of the reference period.

India provides one example of the relevance of the recall period for poverty estimation. The official 2004–05 poverty rate for India based on a uniform recall period of 30 days was 27.5 percent. Using a mixed recall period of 7 and 30 days on food and tobacco and a 365-day recall period for non-food items, the corresponding figure was 21.8 percent (World Bank 2018).

Changes in questionnaire design imply that poverty estimates within countries become incomparable. Whenever such changes occur, they are marked in the comparability database.

2.3 Imputed rent

During the process of constructing a welfare aggregate, housing is arguably one of the most important and difficult components to include. From a conceptual point of view, housing could be understood as any other durable good: Its present value is not relevant for the analysis of current welfare because purchasing a house is such a large and relatively rare expenditure that it should not be included in the welfare aggregate (Deaton and Zaidi 2002, 35). The objective is instead to measure the value of occupying the dwelling for a month or another limited duration.

Whether the welfare aggregate is based on consumption or income, accounting for housing is important. When the welfare aggregate is based on consumption, household surveys usually record the value of rent paid by market tenants. Since rent is the market value of occupying a house for a given period of time, it is feasible and empirically viable to estimate the flow of housing services. Yet, the same information is not available for household owners, even though they also “consume” housing services. By not including the value of housing consumed by owners, it would look like two household with the very same consumption patterns have different welfare just because one pays rent and the other does not. In fact, the market tenant would look better off than the household owner because the consumption of the former is higher than the one of the latter.

When the welfare aggregate is based on income the logic is similar. In this case, the welfare aggregate should be composed of the remuneration of all the assets of the household including labor, capital, and durable goods such as houses (Gasparini and Sosa Escudero 2004). Families that own their housing receive an implicit value that is equivalent to the amount of money they would have to pay in the market if they had to rent a dwelling similar to the one they are currently living in. Hence, to properly account for housing in welfare aggregates, one needs to impute rent for owners.

Several methods exist to impute rent (see Balcázar et al. (2017) for a review). Due to the data needs of these approaches, imputed rent is not always computed, leading housing to be excluded from the welfare aggregate. Across the surveys used in PIP, it varies between country and within countries over time whether imputed rent is included. Due to the value of housing services relative to total welfare, this can matter a great deal for poverty comparability over time.

2.4 Within-survey spatial/temporal deflation

When constructing a welfare aggregate, one is met with the challenge that prices differ geographically within a country and change over the time of the fieldwork. This means that the same level of income can have a different value for different households in the survey, and that households with the same consumption patterns can have very different consumption aggregates if the prices they face differ. To make the welfare aggregates comparable across time and space, spatial and temporal price deflation are needed. Whether or not such deflation is carried out depends on the availability of price data and country-specific practices. Whenever a country switches from not spatially deflating to spatially deflating, this might lead to incomparable poverty estimates over time.

2.4.1 Spatial deflation

Suppose a household pays $1 for a kilo of rice in an urban center, while a rural household in the same country only pays $0.5 for a similar quality and amount of rice. If one were to assess poverty based on the value of the goods and services consumed without accounting for these price differences, everything else equal, one would conclude that the rural household is poorer than the urban household. Yet, from a welfare perspective, they are equally well off. To properly compare the welfare levels of the two households one needs to account for the differences in price levels that the two households face.

There are several ways to account for spatial price differences. One way is to compare the price of a representative basket of goods across locations and convert household consumption expenditure into the prices of a reference location, such as the capital city or the national average price level.

Currently most surveys from East Asia & Pacific, Latin America & the Caribbean, and Europe & Central Asia are spatially deflated, while it is less common in Sub-Saharan Africa, South Asia, and the Middle East & North Africa. China, India, and Indonesia have rural/urban purchasing power parities (PPPs) that are used to deflate welfare aggregates to account for rural/urban price differences within these countries.

2.4.2 Temporal deflation

Household surveys are often carried out over several months, leading to the possibility that prices evolve notably over this period. If unaccounted for, this means that two households at the same location that have the same consumption patterns but are interviewed at the beginning and the end of the fieldwork may have different consumption aggregates. Accounting for such temporal deflation implies choosing a reference period and deflating all consumption aggregates to that period. This can be a specific year, quarter or month. The exact time chosen for each survey is listed in the appendix table of the most recent What’s New document. Temporal deflation within a survey is not carried out in all cases, often because the month of the fieldwork is not available.

Sometimes, a spatio-temporal price index is used to jointly adjust for both spatial and temporal price differences. The household welfare aggregate from the survey conducted in Ghana in 2016-17, for example, is evaluated at 2013 Greater Accra regional prices.

2.5 Equivalence scale

All welfare aggregates are converted into per capita terms by dividing the total household welfare with the number of household members. The use of a per capita normalization is standard in the literature on developing countries. Behind the per capita conversion lies an assumption that all household members have equal consumption levels or derive equal welfare from the total household income. This is based on the assumption that there is little scope for economies of size in consumption for poor people.

This assumption can be questioned (Lanjouw and Ravallion 1995). If household members have different needs, which could be the case if children, for example, require less consumption than adults, then using per capita welfare might be misleading. The assumption would also be violated if household members with the same needs do not get the same share of household income and consumption. Evidence has shown that females often get less than an equal part of household income and consumption (see chapter 5 of World Bank (2018) and references therein). An alternative assumption often used by countries when measuring poverty, is to use an adult equivalence scale. Such a scale defines the consumption need of each household member—often based on age and the household size—relative to a benchmark. Total household welfare is then converted into welfare per adult equivalent household member.

Despite of its compelling features, the lack of agreement on which equivalence scale to use across countries and the more difficult interpretation of the final poverty rates have prevented the World Bank from using such numbers for global poverty monitoring thus far. Per capita estimates have the advantage that they have a clear counterpart in national accounts, which matters for extrapolating and interpolating poverty estimates to a common reference year.

Though equivalence scales aren’t used, it is important to note that each household is weighted by the product of its sampling weight and the household size. The latter implies that the poverty rates show the share of the population living in poverty, not the share of households living in poverty.

2.6 Treatment of grouped data

A welfare aggregate is also needed to estimate poverty and inequality from grouped data. Grouped data are consumption expenditure or income organized in intervals or bins, such as deciles or percentiles. These bins are used to derive a continuous Lorenz curve, which plots the cumulative welfare share (on the y-axis) against the cumulative population share (on the x-axis). Together with information about mean welfare, the Lorenz curve can be used to construct a full distribution. Two approaches are used to derive a Lorenz function, the general quadratic (GQ) Lorenz function and the Beta Lorenz function (Datt 1998). Both functions are parameterized and estimated. The function that provides the best fit is selected for poverty and distributional statistics, conditional on passing normality and validity tests. The tests vary for poverty and distributional statistics (see here for the code on the poverty tests and here for the code on the distributional tests).

The GQ Lorenz function is estimated using the following specification:

\[ L(1-L) = a(p^2-L) + bL(p-1) + c(p-L), \]

where \(p\) is the cumulative proportion of the population, \(L\) is the cumulative proportion of consumption expenditure or income, and \(a\), \(b\), \(c\) are parameter estimates (Datt 1998). Poverty and inequality measures are based on the parameter estimates.

The Beta Lorenz function is estimated using the following specification:

\[ L(p) = p - \theta p^\gamma (1-p)^\delta, \] where \(\theta\), \(\gamma\), and \(\delta\) are parameter estimates (Datt 1998).

2.7 Comparability database

As countries frequently improve the questionnaire design of household surveys and the methodology for the construction of welfare aggregates, poverty estimates over time for a country are not always comparable to each other.

In order to guide users about when poverty can be compared over time within a country, PIP’s line charts only connect poverty estimates when these are comparable. In addition, a “Comparability Over Time” dataset has been put together. This dataset provides metadata on the comparability of poverty estimates within countries over time. Within a country, comparability of poverty estimates over time is assumed unless there is a known change to the survey instrument, survey methodology, measurement, or data structure. The assessment of comparability is country-dependent and relies on the accumulation of knowledge from past and current World Bank staff, as well as close dialogue with national data producers with knowledge of survey design and methodology. The complete documentation of this dataset is available in Atamanov et al. (2019).

The dataset can be accessed in different ways. It is available in the Development Data Hub of the World Bank and it is possible to access it directly from R, Stata, or Python by calling this file, "https://development-data-hub-s3-public.s3.amazonaws.com/ddhfiles/506801/povcalnet_comparability.csv%22.

R

link <- "https://development-data-hub-s3-public.s3.amazonaws.com/ddhfiles/506801/povcalnet_comparability.csv"
df   <- readr::read_csv(link)

Stata

local dh "https://development-data-hub-s3-public.s3.amazonaws.com/ddhfiles/506801/povcalnet_comparability.csv"
import delimited using "`dh'", clear

Python

import pandas as pd
link = "https://development-data-hub-s3-public.s3.amazonaws.com/ddhfiles/506801/povcalnet_comparability.csv"
df   = pd.read_csv(path)

Each observation of the comparability dataset represents a survey datapoint which is the combination of ISO3 country code (countrycode), year, survey acronym (survname), welfare type (i.e., income or consumption) (datatype), and survey coverage (coveragetype). The variable comparabiliy specifies the comparability over time with the following logic. The oldest comparable series in each country starts with the value zero (0) in the comparability variable. When comparability is broken, the value changes to one (1) for the year of the break and it goes on until the comparability is broken again in a subsequent year. The process repeats until the most recent survey datapoint available. Thus, to get the most recent comparable spell, you need the max(comparability) by countrycode, survname, datatype, coveragetype.

References

Atamanov, Aziz, R. Andres Castaneda Aguilar, Carolina Diaz-Bonilla, Dean Jolliffe, Christoph Lakner, Daniel Gerszon Mahler, Jose Montes, et al. 2019. “September 2019 PovcalNet Update: What’s New.” Global Poverty Monitoring Technical Note 10. https://openknowledge.worldbank.org/handle/10986/32478.
Balcázar, Carlos Felipe, Lidia Ceriani, Sergio Olivieri, and Marco Ranzani. 2017. “Rent-Imputation for Welfare Measurement: A Review of Methodologies and Empirical Findings.” Review of Income and Wealth 63 (4): 881–98. https://onlinelibrary.wiley.com/doi/abs/10.1111/roiw.12312.
Datt, Gaurav. 1998. “Computational Tools for Poverty Measurement and Analysis.” http://ebrary.ifpri.org/utils/getfile/collection/p15738coll2/id/125673/filename/125704.pdf.
De Weerdt, Joachim, John Gibson, and Kathleen Beegle. 2020. “What Can We Learn from Experimenting with Survey Methods?” Annual Review of Resource Economics 12 (1): 431–47. https://www.annualreviews.org/doi/10.1146/annurev-resource-103019-105958.
Deaton, Angus, and Salman Zaidi. 2002. “Guidelines for Constructing Consumption Aggregates for Welfare Analysis.” LSMS Working Paper, no. 135. https://openknowledge.worldbank.org/handle/10986/14101.
Gasparini, Leonardo, and Walter Sosa Escudero. 2004. “Implicit Rents from Own-Housing and Income Distribution: Econometric Estimates for Greater Buenos Aires.” Documentos de Trabajo Del CEDLAS no. 14. http://hdl.handle.net/10915/3576.
Lanjouw, Peter, and Martin Ravallion. 1995. “Poverty and Household Size.” The Economic Journal 105 (433): 1415–34. https://doi.org/10.2307/2235108.
World Bank. 2018. Poverty and Shared Prosperity 2018: Piecing Together the Poverty Puzzle. Washington, DC: World Bank. https://openknowledge.worldbank.org/handle/10986/30418.