2 Coverage Analysis

While the 2018 paper looked at coverage gaps primarily in terms of most recent available values (MRVs), in this analysis we wanted to develop a more detailed approach to identify different types of coverage gaps over a broader time span. Accordingly, we developed the heat map shown in Figure 2.1. In this chart, discrete indicators are arranged along the Y axis while time is plotted on the X axis for the 2000-2018 period. Colors indicate the number of observations (i.e., countries) for the corresponding indicator and year. Darker colors in the purple part of the spectrum indicate relatively low-density coverage, while lighter colors in the yellow part of the spectrum represent high-density coverage, up to the maximum of 217 countries. Blank areas indicate no data for that particular indicator and year.

Figure 2.1: Number of countries per indicator over time
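
The counts behind a heat map such as Figure 2.1 can be sketched as a simple tally of distinct countries per indicator and year. The following is a minimal illustration, not the code used to produce the figure: the input layout (one tuple per non-missing value), the indicator and country codes, and the function name `coverage_matrix` are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical toy observations: (indicator code, country code, year).
# Each tuple stands for one non-missing value; none of this is real data.
observations = [
    ("IND.A", "AAA", 2016),
    ("IND.A", "BBB", 2016),
    ("IND.A", "AAA", 2017),
    ("IND.B", "AAA", 2016),
    ("IND.B", "BBB", 2016),
    ("IND.B", "CCC", 2016),
]

def coverage_matrix(obs):
    """Return {indicator: {year: number of distinct countries with a value}}."""
    members = defaultdict(set)
    for indicator, country, year in obs:
        members[(indicator, year)].add(country)
    matrix = defaultdict(dict)
    for (indicator, year), countries in members.items():
        matrix[indicator][year] = len(countries)
    return dict(matrix)

coverage = coverage_matrix(observations)
# Years absent from an indicator's inner dict correspond to blank cells.
```

Each inner dictionary corresponds to one row of the heat map, and the cell color would be driven by the stored count.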

Several patterns emerge from a visual assessment of Figure 2.1. These patterns are not necessarily mutually exclusive, and a visual assessment is not the only approach to identifying indicator clusters. Appendix 2 includes in-depth definitions of the patterns discussed in this section, along with several alternatives to them.

2.1 Consistently low coverage

One group of indicators is characterized by consistently low country coverage over the 2000-2018 time period. In this case we’ve defined “low coverage” as having values for no more than 100 countries in any given year. These indicators generally appear as steady and consistently dark horizontal lines towards the bottom of Figure 2.1.

Table 2.1: Indicators with consistently low coverage

Indicator | Code | Max. Countries
Children in employment, total (% of children ages 7-14) | SL.TLF.0714.ZS | 31
Unmet need for contraception (% of married women ages 15-49) | SP.UWT.TFRT | 37
Retirement Age | WBL | 42
Annualized average growth rate in per capita real survey mean consumption or income, total population (%) | SI.SPR.PCAP.ZG | 46
Value lost due to electrical outages (% of sales for affected firms) | IC.FRM.OUTG.ZS | 51
GHG net emissions/removals by LUCF (Mt of CO2 equivalent) | EN.CLC.GHGR.MT.CE | 58
Poverty headcount ratio at national poverty lines (% of population) | SI.POV.NAHC | 59
Technicians in R&D (per million people) | SP.POP.TECH.RD.P6 | 69
Literacy rate, adult total (% of people ages 15 and above) | SE.ADT.LITR.ZS | 71
Income share held by lowest 20% | SI.DST.FRST.20 | 84
Poverty headcount ratio at $1.90 a day (2011 PPP) (% of population) | SI.POV.DDAY | 84
GINI index (World Bank estimate) | SI.POV.GINI | 84
Researchers in R&D (per million people) | SP.POP.SCIE.RD.P6 | 84
People using safely managed sanitation services (% of population) | SH.STA.SMSS.ZS | 94
Research and development expenditure (% of GDP) | GB.XPD.RSDV.GD.ZS | 99
Incidence of malaria (per 1,000 population at risk) | SH.MLR.INCD.P3 | 99

It’s important to note, however, that while total country coverage may be low for these indicators in any given year, the country composition often varies from year to year for reasons discussed in the next section. For example, “Poverty headcount ratio at national poverty lines” is available for no more than 59 countries in any given year, but includes values for 135 countries across all years. By comparison, “Incidence of malaria” is available for 99 countries in nearly all years, with very little variation in country composition. These patterns may be important for ESG analysis if it is possible to extrapolate or impute missing values from prior years; indicators whose countries vary from year to year (and thus have larger coverage in the aggregate) may benefit to a greater degree.
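
The distinction between single-year coverage and aggregate coverage can be made concrete with a short sketch. It assumes the same hypothetical input layout of (indicator, country, year) tuples; the function name `summarize_coverage`, the indicator labels, and the toy data are illustrative only, while the threshold of 100 countries comes from the definition above.

```python
from collections import defaultdict

LOW_COVERAGE_MAX = 100  # from the text: no more than 100 countries in any year

def summarize_coverage(obs):
    """Compare the best single-year coverage with coverage across all years."""
    per_year = defaultdict(set)   # (indicator, year) -> countries in that year
    ever = defaultdict(set)       # indicator -> countries in any year
    for indicator, country, year in obs:
        per_year[(indicator, year)].add(country)
        ever[indicator].add(country)
    max_single = defaultdict(int)
    for (indicator, _year), countries in per_year.items():
        max_single[indicator] = max(max_single[indicator], len(countries))
    return {
        indicator: {
            "max_single_year": max_single[indicator],
            "distinct_countries": len(countries),
            "low_coverage": max_single[indicator] <= LOW_COVERAGE_MAX,
        }
        for indicator, countries in ever.items()
    }

# Hypothetical toy data: "ROTATING" covers different countries each year,
# "STABLE" covers the same two countries in both years.
toy = [
    ("ROTATING", "AAA", 2016), ("ROTATING", "BBB", 2016),
    ("ROTATING", "CCC", 2017), ("ROTATING", "DDD", 2017),
    ("STABLE", "AAA", 2016), ("STABLE", "BBB", 2016),
    ("STABLE", "AAA", 2017), ("STABLE", "BBB", 2017),
]
summary = summarize_coverage(toy)
```

In the toy data both indicators peak at two countries in a single year, but the rotating one reaches four distinct countries overall, which is the situation in which carrying forward prior-year values would add the most.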

2.2 Moderate to high coverage

By contrast, most indicators include values for at least 100 countries in at least one year. 99 indicators have single-year coverage of at least 100 countries, 78 indicators cover at least 150 countries, and 27 indicators cover at least 200 countries. In Figure 2.1 these indicators range from magenta to light yellow in the middle to upper sections of the heat map.

12 indicators in this cluster include values for 2018 or later for at least 90% of countries. These indicators appear at the top-most section of Figure 2.1 and correspond to the “perfect case” classification in the next chapter.

Among the remaining indicators, the year-to-year composition of coverage can vary, as it does for indicators in the “consistently low coverage” group, for methodological reasons discussed in the next section.

2.3 Measurable improvement

A handful of indicators demonstrate significant, measurable improvement in country coverage over time. We define “measurable improvement” by regressing each indicator’s country coverage on year (gap-filling for years where coverage is missing entirely). Indicators with a slope coefficient greater than 1, i.e., an average gain of more than one country per year, are shown in Table 2.2. In Figure 2.1 these indicators appear dark purple or magenta on the left side of their coverage, with increasingly light colors toward the right side.
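
As a rough illustration of this kind of test, the sketch below fits an ordinary least-squares line to yearly country counts and returns the slope. It is not the authors' implementation: the function name `coverage_slope` is hypothetical, and treating years with no data as zero coverage is only one possible reading of "gap-filling", so the fill value is exposed as a parameter.

```python
def coverage_slope(counts_by_year, first_year=2000, last_year=2018, fill=0):
    """OLS slope of country coverage regressed on year.

    counts_by_year maps year -> number of countries with a value.
    Years missing from the mapping are gap-filled with `fill`
    (an assumption; the source does not say which value is used).
    """
    years = list(range(first_year, last_year + 1))
    counts = [counts_by_year.get(y, fill) for y in years]
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(counts) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, counts))
    return sxy / sxx

# Hypothetical series: one gains three countries per year, one is flat.
improving = {y: 50 + 3 * (y - 2000) for y in range(2000, 2019)}
flat = {y: 120 for y in range(2000, 2019)}

improving_slope = coverage_slope(improving)
flat_slope = coverage_slope(flat)
```

Under the threshold described above, the first toy series (slope of about 3) would be flagged as improving and the second (slope of 0) would not.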

Table 2.2: Select indicators that improve over time

Indicator | Code
Current account balance (% of GDP) | BN.CAB.XOKA.GD.ZS
Access to electricity (% of population) | EG.ELC.ACCS.ZS
Enforcing contracts: Cost (% of claim) | ENF.CONT.COEN.COST.ZS
Outstanding international public debt securities to GDP (%) | GFDD.DM.06
Time required to start a business (days) | IC.REG.DURS
Total tax and contribution rate (% of profit) | IC.TAX.TOTL.CP.ZS
Patent applications, nonresidents | IP.PAT.NRES
Patent applications, residents | IP.PAT.RESD
Fixed broadband subscriptions (per 100 people) | IT.NET.BBND.P2
Literacy rate, adult total (% of people ages 15 and above) | SE.ADT.LITR.ZS
Proportion of seats held by women in national parliaments (%) | SG.GEN.PARL.ZS
Annualized average growth rate in per capita real survey mean consumption or income, total population (%) | SI.SPR.PCAP.ZG

These and similar indicators may warrant further study to better understand the factors behind the increases in country coverage. For instance, if country coverage improved as a result of better methodologies, increased production capacity, or broader demand, they may provide a model for improving country coverage for indicators that need it.

2.4 High coverage and sudden decline

A significant group of indicators has consistent coverage through most of the time period, but declining or no coverage in recent years. In Figure 2.1 these tend to appear as “truncated” series with no coloring for large portions of the right side of the chart. Table 2.3 summarizes indicators by the year of their most recent available value (MRV). As shown, over 50% of ESG indicators in the study period have no values for the most recent study year, and 13% of indicators have no values for the most recent four years or more.

Table 2.3: Indicators by Most Recent Available Year

Year of MRV | # Indicators
2018+ | 55
2017 | 25
2016 | 10
2015 | 10
<2015 | 15

As noted previously, MRV years are a significant factor in ESG data use, as older data is less relevant to investment decisions being made today and in the near future. Many important factors could explain the wide variance in MRV years, and this is the focus of the next chapter.
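
The shares quoted with Table 2.3 can be checked directly from its counts, and the MRV year itself is simply the latest year with any value. The sketch below does both; the function name `mrv_years`, the toy observations, and the (indicator, country, year) input layout are assumptions for illustration, while the counts are copied from Table 2.3.

```python
def mrv_years(obs):
    """Return {indicator: most recent year with at least one value}."""
    latest = {}
    for indicator, _country, year in obs:
        latest[indicator] = max(year, latest.get(indicator, year))
    return latest

# Hypothetical toy observations, not real data.
toy = [("IND.A", "AAA", 2014), ("IND.A", "BBB", 2017), ("IND.B", "AAA", 2018)]
latest = mrv_years(toy)

# Counts copied from Table 2.3 (number of indicators per MRV year).
mrv_counts = {"2018+": 55, "2017": 25, "2016": 10, "2015": 10, "<2015": 15}
total = sum(mrv_counts.values())

# Indicators whose MRV is earlier than 2018 have no value for the latest year.
share_missing_latest = (total - mrv_counts["2018+"]) / total
# Indicators whose MRV is earlier than 2015 lack 2015 through 2018.
share_stale = mrv_counts["<2015"] / total

print(f"{total} indicators, {share_missing_latest:.1%} without a 2018 value, "
      f"{share_stale:.1%} with an MRV before 2015")
```

The table's counts sum to 115 indicators, of which 60 (about 52%) have an MRV earlier than 2018 and 15 (about 13%) have an MRV earlier than 2015, consistent with the figures cited in the text.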

2.5 Intermittent coverage

A handful of indicators are available only at periodic intervals, with no values for the intervening years. These appear in Figure 2.1 as intermittent series resembling “dashed” lines; most (but not all) of them are environmental indicators. Appendix 2 provides a technical description of indicators in this category.
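
One way such series could be flagged programmatically is by the share of missing years between an indicator's first and last year with data. The sketch below is a hypothetical rule for illustration only and is not the technical definition given in Appendix 2; the function name `classify_series`, the labels it returns, and the 50% threshold are all assumptions.

```python
def classify_series(years_with_data, min_gap_share=0.5):
    """Label a series by how many interior years lack data.

    min_gap_share is an assumed cutoff: the share of years between the
    first and last observation that must be empty to call it intermittent.
    """
    years = sorted(set(years_with_data))
    if not years:
        return "no_data"
    if len(years) == 1:
        return "single_year"
    span = years[-1] - years[0] + 1          # years from first to last value
    gap_share = (span - len(years)) / span   # fraction of that span with no data
    return "intermittent" if gap_share >= min_gap_share else "continuous"

# Hypothetical series for illustration.
five_yearly = classify_series([2002, 2007, 2012, 2017])
annual = classify_series(range(2000, 2019))
one_off = classify_series([2018])
```

A series observed every five years is labeled intermittent, an annual series is labeled continuous, and a series with a single observation is reported separately, since it has no interior gaps to measure.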

Table 2.4: Indicators with intermittent coverage

Indicator | Code
Droughts, floods, extreme temperatures (% of population, average 1990-2009) | EN.CLC.MDAT.ZS
Mammal species, threatened | EN.MAM.THRD.NO
Population living in areas where elevation is below 5 meters (% of total population) | EN.POP.EL5M.ZS
Annual freshwater withdrawals, total (% of internal resources) | ER.H2O.FWTL.ZS
Renewable internal freshwater resources, total (billion cubic meters) | ER.H2O.INTR.K3
Renewable internal freshwater resources per capita (cubic meters) | ER.H2O.INTR.PC
Natural capital, subsoil assets: coal (constant 2014 US$) | NW.NCA.SACO.TO
Natural capital, subsoil assets: gas (constant 2014 US$) | NW.NCA.SAGA.TO
Natural capital, subsoil assets: oil (constant 2014 US$) | NW.NCA.SAOI.TO
Cause of death, by communicable diseases and maternal, prenatal and nutrition conditions (% of total) | SH.DTH.COMM.ZS
Diabetes prevalence (% of population ages 20 to 79) | SH.STA.DIAB.ZS
Net migration | SM.POP.NETM
Retirement Age | WBL

While not the primary focus of this analysis, there are a handful of factors that could explain the coverage characteristics of this group. The most obvious explanation is that indicators may simply not be designed as time-series data. This is the most likely explanation for Retirement Age and Threatened Mammal Species, which are available for only a single year. In other cases, there may not be resources to collect data on an annual basis, even if doing so would be useful. Other indicators may measure environmental or social phenomena that change gradually so that annual data collection would not be efficient. This last possibility is material to ESG data use because it implies that older data may still be relevant if properly understood.