Chapter 10 Availability, missing values, and zeros in SARMD


Figures 10.1, 10.2, and 10.3 provide three ways of visualizing the availability, percentage of missing values and frequency of zeros in the harmonized variables.

According to the SARMD protocols, if the raw data do not contain the information needed to harmonize a particular SARMD variable, the variable must still be included in the dataset as a vector of missing values. If a variable is absent from the dataset, there are two possible explanations: either the raw data contain the necessary information but the variable has not been harmonized yet, or the raw data contain no such information and previous harmonizers decided not to include the variable in the dataset as a vector of missing values. Because it is impossible to know which explanation applies, figure 10.1 shows all the variables that are absent from each dataset available in SARMD.

For example, all the variables of the assets category are absent from the Maldives datasets for 1997 and 2004. As another example, Pakistan 2015 lacks several variables of the assets category that were present in the surveys of previous years. Moreover, some variables, such as landphone, cellphone, and computer, are present in 2015 but were absent before 2015.

Figure 10.1: Absent variables
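
The distinction between an absent variable and a variable that is present but entirely missing can be sketched as follows. This is a minimal illustration in Python, not the code used to produce figure 10.1: the list `EXPECTED_VARS`, the function `classify_variables`, and the toy records are all hypothetical, and `None` is assumed to represent a missing value.

```python
# Hypothetical list of variables a harmonized dataset is expected to contain.
# It is illustrative only, not the official SARMD dictionary.
EXPECTED_VARS = ["welfare", "landphone", "cellphone", "computer"]


def classify_variables(rows, expected=EXPECTED_VARS):
    """Label each expected variable as 'absent', 'all missing', or 'has data'.

    `rows` is a list of dicts, one per observation; None marks a missing value.
    """
    # Collect every variable name that appears in at least one observation.
    present = set()
    for row in rows:
        present.update(row.keys())

    status = {}
    for var in expected:
        if var not in present:
            # The variable was never created in the dataset.
            status[var] = "absent"
        elif all(row.get(var) is None for row in rows):
            # The variable exists but is a vector of missing values.
            status[var] = "all missing"
        else:
            status[var] = "has data"
    return status


# Toy dataset: 'landphone' follows the protocol (all missing), 'computer' is absent.
toy = [
    {"welfare": 1200.0, "landphone": None, "cellphone": 1},
    {"welfare": 950.5, "landphone": None, "cellphone": 0},
]
status = classify_variables(toy)
print(status)
```

In this sketch only `computer` would be reported as absent, while `landphone` is present as a vector of missing values, which is the case the protocols require.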

For those variables that are present in the dataset, figure 10.2 shows the share of observations with missing values in each dataset. For example, the variable welfare, which is used to estimate poverty and inequality measures, is available for almost all the observations of the SARMD datasets. However, Nepal 2010 has an astonishing 18% of missing values in this variable. In other words, almost a fifth of the households surveyed in 2010 are not included in any socioeconomic indicator of Nepal.

Figure 10.2: Share of observations with missing values
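
The share of observations with missing values is the number of missing observations divided by the total number of observations. A minimal sketch in Python, assuming the data are a list of records with `None` for missing values; the function `missing_share` and the toy values are hypothetical and do not reproduce any actual SARMD figure.

```python
def missing_share(rows, var):
    """Return the share (0 to 1) of observations in which `var` is missing.

    A missing value is assumed to be None. An observation that lacks the
    key altogether is also counted as missing.
    """
    if not rows:
        raise ValueError("dataset has no observations")
    n_missing = sum(1 for row in rows if row.get(var) is None)
    return n_missing / len(rows)


# Toy dataset with five observations, one of them missing.
toy = [
    {"welfare": 10.0},
    {"welfare": None},
    {"welfare": 12.5},
    {"welfare": 8.0},
    {"welfare": 7.0},
]
share = missing_share(toy, "welfare")
print(f"{share:.0%}")
```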

Finally, figure 10.3 shows, for each variable, its proportion of missing values (as in figure 10.2), its proportion of zeros, and its mean.

Figure 10.3: % Missing, % Zeros, Mean
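
The three statistics can be computed together for a single variable, as in the sketch below. It rests on two assumptions that the text does not state: the percentage of zeros is taken over all observations (missing ones included in the denominator), and the mean is taken over non-missing observations only. The function `summarize` and the toy data are hypothetical.

```python
def summarize(rows, var):
    """Return % missing, % zeros, and mean of `var` across observations.

    Assumptions (not taken from the SARMD documentation):
    - None marks a missing value;
    - % zeros uses all observations as the denominator;
    - the mean ignores missing values.
    """
    n = len(rows)
    if n == 0:
        raise ValueError("dataset has no observations")
    values = [row.get(var) for row in rows]
    observed = [v for v in values if v is not None]
    n_zeros = sum(1 for v in observed if v == 0)
    return {
        "pct_missing": 100 * (n - len(observed)) / n,
        "pct_zeros": 100 * n_zeros / n,
        # The mean is undefined when every observation is missing.
        "mean": sum(observed) / len(observed) if observed else None,
    }


# Toy dataset: one missing value and two zeros out of four observations.
toy = [{"cellphone": 1}, {"cellphone": 0}, {"cellphone": None}, {"cellphone": 0}]
summary = summarize(toy, "cellphone")
print(summary)
```

Reading the three numbers side by side helps separate a variable that is mostly zeros from one that is mostly missing, since the two cases have very different implications for the mean.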