LLM as a Judge Critic (PART 2)

Objective

This notebook demonstrates how to use structured outputs from OpenAI’s GPT-4o-mini model to label data in climate-related research papers. The task is to analyze academic texts to identify and classify mentions of datasets while keeping context consistent across pages.

Workflow

PDF Text Extraction:

Use PyMuPDF to extract pages from PDF documents.
Prefilter document pages using an HF-trained model.

Weakly Supervised Data Labeling:

Use the GPT-4o-mini model with a customized prompt for structured data extraction.

LLM as a Judge (Validation & Error Correction):

Use an LLM to validate extracted dataset mentions.
Correct or remove errors in dataset identification.
Keep only valid dataset mentions (valid: true), discarding invalid entries.

Autonomous Reasoning Agent:

Use a reasoning pipeline to validate the LLM-as-a-judge output.

Next Steps:

Scale this into batch processing of multiple files or a directory of research papers.
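
The batch step could be sketched roughly as follows: loop over a directory of extracted JSON files and hand each document to a per-document function. The `process_directory` name, the `process_fn` parameter, and the flat `*.json` directory layout are assumptions made for illustration; in this notebook `process_fn` would be something like the `process_judge_validation` function defined further down.

```python
import json
from pathlib import Path
from typing import Callable, Dict, List


def process_directory(
    input_dir: str, process_fn: Callable[[dict], object]
) -> Dict[str, List[str]]:
    """Run process_fn on every *.json file in input_dir (hypothetical helper).

    Returns the names of files that were processed and files that failed,
    so one bad document does not stop the whole batch.
    """
    report: Dict[str, List[str]] = {"processed": [], "failed": []}
    for path in sorted(Path(input_dir).glob("*.json")):
        try:
            with open(path, "r", encoding="utf-8") as infile:
                document = json.load(infile)
            process_fn(document)
            report["processed"].append(path.name)
        except Exception as exc:  # keep going and record the failure
            print(f"Skipping {path.name}: {exc}")
            report["failed"].append(path.name)
    return report
```

Passing the per-document function as an argument keeps the loop independent of the LLM client, so it can be dry-run with a stub before spending API calls.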

Install Required Packages

%%capture
!pip install pymupdf openai nltk scikit-learn python-dotenv

LLM-as-a-Judge for Quality Assessment

After the initial extraction of dataset mentions, we validate its output with an LLM-as-a-judge pipeline.

import json
import os
from tqdm.auto import tqdm
from pydantic import BaseModel, Field, ValidationError
from typing import List, Optional
from enum import Enum


# Define Enums for categorical fields
class Context(str, Enum):
    background = "background"
    supporting = "supporting"
    primary = "primary"


class Specificity(str, Enum):
    properly_named = "properly_named"
    descriptive_but_unnamed = "descriptive_but_unnamed"
    vague_generic = "vague_generic"


class Relevance(str, Enum):
    directly_relevant = "directly_relevant"
    indirectly_relevant = "indirectly_relevant"
    not_relevant = "not_relevant"


class DatasetEntry(BaseModel):
    raw_name: Optional[str] = Field(
        ..., description="The exact dataset name as it appears in the text."
    )
    harmonized_name: Optional[str] = Field(
        None, description="The standardized or full name of the dataset."
    )
    acronym: Optional[str] = Field(
        None, description="The short name or acronym associated with the dataset."
    )
    context: Context
    specificity: Specificity
    relevance: Relevance
    mentioned_in: Optional[str] = Field(
        None, description="The exact text excerpt where the dataset is mentioned."
    )
    producer: Optional[str] = Field(
        None, description="The organization responsible for producing the dataset."
    )
    data_type: Optional[str] = Field(
        None, description="The type of data represented by the dataset."
    )


class LabelledResponseFormat(BaseModel):
    dataset: List[DatasetEntry] = Field(
        ..., description="A list of datasets mentioned in the paper."
    )
    dataset_used: bool = Field(
        ..., description="A boolean indicating if a dataset is used in the paper."
    )


# Create a pydantic model for the judge response
from pydantic import model_validator


class JudgeDatasetEntry(BaseModel):
    raw_name: Optional[str] = Field(
        ..., description="The exact dataset name as it appears in the text."
    )
    harmonized_name: Optional[str] = Field(
        None, description="The standardized or full name of the dataset."
    )
    acronym: Optional[str] = Field(
        None, description="The short name or acronym associated with the dataset."
    )
    context: Context
    specificity: Specificity
    relevance: Relevance
    producer: Optional[str] = Field(
        None, description="The organization responsible for producing the dataset."
    )
    data_type: Optional[str] = Field(
        None, description="The type of data represented by the dataset."
    )
    year: Optional[str] = Field(
        None,
        description="The year associated with the dataset, if explicitly mentioned.",
    )
    valid: bool = Field(
        ..., description="True if the mention is valid, false otherwise."
    )
    invalid_reason: Optional[str] = Field(
        None, description="Reason why the mention was invalid (if applicable)."
    )
    sent: Optional[str] = Field(
        None, description="The exact sentence where the dataset is mentioned."
    )
    # entities: Optional[EmpiricalMention] = Field(None, description="Additional empirical context for the dataset.")

    # Validator to ensure valid and invalid_reason consistency
    @model_validator(mode="after")
    def check_validity(self):
        # "after" validators run on the constructed instance, so use self
        if not self.valid and not self.invalid_reason:
            raise ValueError("If 'valid' is False, 'invalid_reason' must be provided.")
        return self


class JudgeDatasetGroup(BaseModel):
    mentioned_in: Optional[str] = Field(
        None, description="The exact text excerpt where the dataset is mentioned."
    )
    datasets: List[JudgeDatasetEntry] = Field(
        ..., description="A list of validated datasets mentioned in the paper."
    )


class JudgeResponseFormat(BaseModel):
    page_number: int = Field(..., description="The page number in the document.")
    dataset_used: bool = Field(
        ...,
        description="Flag indicating whether a valid dataset is mentioned in the page.",
    )
    data_mentions: List[JudgeDatasetGroup] = Field(
        ...,
        description="A list of structured dataset information mentioned in the paper.",
    )
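
To make the `valid` / `invalid_reason` rule concrete, here is a small sketch that applies the same check to plain dictionaries, for example when inspecting saved JSON without instantiating the Pydantic models. The helper name `find_missing_reasons` and the sample payload are illustrative assumptions; the payload only mirrors the shape of `JudgeResponseFormat`, and its values are made up.

```python
from typing import List, Tuple


def find_missing_reasons(judge_page: dict) -> List[Tuple[int, str]]:
    """Return (page_number, raw_name) for entries marked invalid without a reason.

    A missing "valid" key is treated the same as valid == False.
    """
    problems = []
    for group in judge_page.get("data_mentions", []):
        for entry in group.get("datasets", []):
            if not entry.get("valid") and not entry.get("invalid_reason"):
                problems.append((judge_page.get("page_number"), entry.get("raw_name")))
    return problems


# Hypothetical payload in the shape of JudgeResponseFormat (values are made up).
sample_page = {
    "page_number": 1,
    "dataset_used": True,
    "data_mentions": [
        {
            "mentioned_in": "We use the Example Household Survey and a wealth score.",
            "datasets": [
                {"raw_name": "Example Household Survey", "valid": True, "invalid_reason": None},
                {"raw_name": "wealth score", "valid": False, "invalid_reason": None},
            ],
        }
    ],
}

print(find_missing_reasons(sample_page))
```

In this sample, the second entry is the kind of record the model validator above would reject.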

# judge prompt
JUDGE_PROMPT = """You are an expert in dataset validation. Your task is to assess whether each dataset mention is **valid, invalid, or requires clarification**, ensuring correctness and consistency based on the dataset's **empirical context**.

---

### **Dataset Validation Criteria**
A dataset is **valid** if:
1. **It is structured**—collected systematically for research, policy, or administrative purposes.
2. **It is reproducible**—meaning it consists of collected records rather than being derived purely from computations or models.

**Always Valid Datasets:**
- Government statistical and geospatial datasets (e.g., census, official land records).  
- Official surveys, administrative records, economic transaction data, and scientific research datasets.  

**Invalid Datasets:**
Set as invalid all `"raw_name"` that belong under the following classes.
- Derived indicators or computational constructs (e.g., "wealth score", "mine dummy", "district total production").  
- Standalone statistical metrics without a clear underlying dataset (e.g., "average income growth rate" without source data).  
- General organizations, reports, or methodologies (e.g., "World Bank", "UNDP Report", "machine learning model").  

**Uncertain Cases:**
- If a dataset is **vaguely named but potentially valid**, set it as valid but return: `"Potentially valid—needs dataset name confirmation."`  
- If a dataset reference is **too generic** (e.g., `"time-varying data on production"`), set it as valid but return: `"Needs clarification—dataset name is too generic."`  

---

### **Key Validation Rules**
1. **Consistency Check:**  
   - If a `"raw_name"` has been marked **valid earlier**, it **must remain valid** unless its meaning significantly differs in a new context.

2. **Context-Aware Inference:**  
   - If certain details are missing such as the **Year**, **Producer**, or **Data Type**, try to extract them from the `mentioned_in` field if available and correctly relate to the data.

3. **Data Type Classification (Flexible & Adaptive):**  
   - Infer the most appropriate `"data_type"` dynamically from context.  
   - Possible types: **Surveys, geospatial data, administrative records, financial reports, research datasets, climate observations, etc.**  
   - If **no predefined category fits**, create a **new `"data_type"` that best describes the dataset.**  

4. **Producer Identification:**  
   - If the **producer (organization/institution) is explicitly mentioned**, extract it.  
   - If not mentioned, **do not infer; set `"producer": null` instead.**  

---

### **JudgeResponseFormat Schema**
Each dataset assessment must conform strictly to the JudgeResponseFormat schema."""
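
The consistency rule in the prompt is only an instruction to the model, and each page is judged in a separate call, so it may be worth auditing the results afterwards. The sketch below is one possible audit, not part of the original pipeline: the name `find_validity_flips` and the case-insensitive name matching are assumptions. It expects pages in the post-validation shape (`page`, `data_mentions`, `datasets`, `valid`).

```python
from typing import Dict, List, Tuple


def find_validity_flips(pages: List[dict]) -> List[Tuple[str, int, int]]:
    """Flag raw_names judged valid earlier in the document and invalid later.

    Returns (raw_name, page_first_valid, page_marked_invalid) tuples. Names are
    compared case-insensitively after stripping whitespace (an assumption).
    """
    first_valid_page: Dict[str, int] = {}
    flips = []
    for page in pages:
        page_number = page.get("page")
        for group in page.get("data_mentions", []):
            for entry in group.get("datasets", []):
                name = (entry.get("raw_name") or "").strip().lower()
                if not name:
                    continue
                if entry.get("valid"):
                    # Remember only the first page where the name was valid
                    first_valid_page.setdefault(name, page_number)
                elif name in first_valid_page:
                    flips.append((entry["raw_name"], first_valid_page[name], page_number))
    return flips
```

A flagged name is not necessarily an error, since the prompt allows a change when the meaning differs in a new context; the list is a starting point for manual review.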

def validate_with_llm_judge(page):
    """
    Validate dataset mentions using LLM-as-a-judge with structured outputs.

    Parameters:
        page (dict): A single page's data from the extracted data JSON.

    Returns:
        dict: The page with validated dataset mentions or None if an error occurs.
    """
    # Prepare the input for LLM
    input_data = {
        "page_number": page.get("page"),
        "data_mentions": page.get("data_mentions", []),
    }

    # Skip validation if there are no data mentions
    if not input_data["data_mentions"]:
        return None

    # Prepare messages for the LLM
    messages = [
        {"role": "system", "content": JUDGE_PROMPT},
        {"role": "user", "content": json.dumps(input_data, indent=2)},
    ]

    try:
        completion = client.beta.chat.completions.parse(
            model=MODEL,
            messages=messages,
            temperature=0.2,
            response_format=JudgeResponseFormat,
        )

        # Validate and parse the LLM's structured response
        parsed_data = completion.choices[0].message.parsed

        # Update the page with validated mentions
        page["dataset_used"] = parsed_data.dataset_used
        page["data_mentions"] = [
            mention.model_dump() for mention in parsed_data.data_mentions
        ]

        return page

    except ValidationError as ve:
        print(f"Validation error on page {page.get('page')}: {ve}")
        return None
    except Exception as e:
        print(f"Error validating page {page.get('page')}: {e}")
        return None
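
Because `validate_with_llm_judge` returns `None` on any error, a transient API failure leaves the page unvalidated. One way to soften this, sketched here under assumptions, is a small retry wrapper with exponential backoff. The name `call_with_retries` and its parameters are hypothetical and not part of the notebook. Since the function also returns `None` for pages without mentions, the wrapper is only meant for pages that have some.

```python
import time
from typing import Callable, Optional


def call_with_retries(
    fn: Callable[[dict], Optional[dict]],
    page: dict,
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call fn(page) until it returns a non-None result or attempts run out.

    Waits base_delay, 2 * base_delay, 4 * base_delay, ... between tries.
    """
    for attempt in range(attempts):
        result = fn(page)
        if result is not None:
            return result
        if attempt < attempts - 1:
            sleep(base_delay * 2 ** attempt)
    return None
```

A call might look like `call_with_retries(validate_with_llm_judge, page)`; the `sleep` parameter exists only so the waiting can be replaced in tests.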

def process_judge_validation(input_json):
    """
    Process the entire JSON file with LLM-as-a-judge for validation.

    Parameters:
        input_json (dict): The JSON structure containing the source and pages.

    Returns:
        dict: The updated JSON structure with validated pages.
    """
    # Process each page in the JSON file
    pages = input_json.get("pages", [])
    # Process each page in the JSON file
    for page_idx, page in enumerate(tqdm(pages, desc="Processing pages")):
        # Only pages with data_mentions need validation
        if not page.get("data_mentions"):
            continue
        validated_page = validate_with_llm_judge(page)
        if validated_page:
            # page_idx comes from enumerate over the same list, so it is the right slot
            input_json["pages"][page_idx] = validated_page

    output_path = "./output/llm_judge_validation"
    os.makedirs(output_path, exist_ok=True)
    output_file_path = os.path.join(output_path, f"{input_json['source']}.json")
    # Save the updated JSON file with validated pages
    with open(output_file_path, "w") as outfile:
        json.dump(input_json, outfile, indent=4)

    # Return the updated structure, as the docstring promises
    return input_json
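
The workflow also calls for keeping only mentions with `valid: true`. A minimal sketch of that filter over the validated pages is shown below. The name `keep_valid_mentions` is hypothetical, and dropping empty groups and recomputing `dataset_used` are assumptions about the desired behaviour.

```python
import copy
from typing import List


def keep_valid_mentions(pages: List[dict]) -> List[dict]:
    """Return a copy of pages that keeps only datasets with valid == True.

    Groups left without datasets are dropped, and dataset_used is recomputed
    from what remains (an assumption about the desired behaviour).
    """
    filtered_pages = []
    for page in pages:
        new_page = copy.deepcopy(page)  # leave the input untouched
        groups = []
        for group in new_page.get("data_mentions", []):
            valid = [d for d in group.get("datasets", []) if d.get("valid") is True]
            if valid:
                group["datasets"] = valid
                groups.append(group)
        new_page["data_mentions"] = groups
        new_page["dataset_used"] = bool(groups)
        filtered_pages.append(new_page)
    return filtered_pages
```

Note that pages where the judge call failed still carry the original entries without a `valid` key, so this filter would drop them; it may be safer to retry or set those pages aside first.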

# Output from the previous step
input_file_path = "output/extracted_data/The-local-socioeconomic-effects-of-gold-mining-evidence-from-Ghana.json"
with open(input_file_path, "r") as infile:
    input_data = json.load(infile)

from openai import OpenAI

# Load environment variables from .env file
# load_dotenv()

API_KEY = "YOUR_API_KEY"
MODEL = "gpt-4o-mini"
client = OpenAI(api_key=API_KEY)  # initialize the client

# inspect input

input_data.get("pages")

[{'page': 4,
  'dataset_used': True,
  'data_mentions': [{'mentioned_in': 'We also allow for spillovers across \ndistricts, in a district-level analysis. We use two complementary geocoded household data sets \nto analyze outcomes in Ghana: the Demographic and Health Survey (DHS) and the Ghana \nLiving Standard Survey (GLSS), which provide information on a wide range of welfare \noutcomes. The paper contributes to the growing literature on the local effects of mining.',
    'datasets': [{'raw_name': 'Demographic and Health Survey (DHS)',
      'harmonized_name': 'Demographic and Health Survey (DHS)',
      'acronym': 'DHS',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Surveys & Census Data',
      'sent': 'We use two complementary geocoded household data sets \nto analyze outcomes in Ghana: the Demographic and Health Survey (DHS) and the Ghana \nLiving Standard Survey (GLSS), which provide information on a wide range of welfare \noutcomes.'},
     {'raw_name': 'Ghana Living Standard Survey (GLSS)',
      'harmonized_name': 'Ghana Living Standard Survey (GLSS)',
      'acronym': 'GLSS',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Surveys & Census Data',
      'sent': 'We use two complementary geocoded household data sets \nto analyze outcomes in Ghana: the Demographic and Health Survey (DHS) and the Ghana \nLiving Standard Survey (GLSS), which provide information on a wide range of welfare \noutcomes.'}]}]},
 {'page': 5,
  'dataset_used': True,
  'data_mentions': [{'mentioned_in': 'Mining is also associated with more economic \nactivity measured by nightlights (Benshaul-Tolonen, 2019; Mamo et al, 2019). Kotsadam and Tolonen (2016) use DHS data from Africa, and find that mine openings cause \nwomen to shift from agriculture to service production and that women become more likely to \nwork for cash and year-round as opposed to seasonally. Continuing this analysis, Benshaul-\nTolonen (2018) explores the links between mining and female empowerment in eight gold-\nproducing countries in East and West Africa, including Ghana.',
    'datasets': [{'raw_name': 'DHS data',
      'harmonized_name': 'Demographic and Health Surveys (DHS)',
      'acronym': 'DHS',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Health & Public Safety Data',
      'sent': 'Kotsadam and Tolonen (2016) use DHS data from Africa, and find that mine openings cause \nwomen to shift from agriculture to service production and that women become more likely to \nwork for cash and year-round as opposed to seasonally.'}]},
   {'mentioned_in': 'We explore the effects of mining activity on employment, earnings, expenditure, and children’s \nhealth outcomes in local communities and in districts with gold mining. We combine the DHS \nand GLSS with production data for 17 large-scale gold mines in Ghana. We find that a new \nlarge-scale gold mine changes economic outcomes, such as access to employment and cash \nearnings.',
    'datasets': [{'raw_name': 'GLSS',
      'harmonized_name': 'Ghana Living Standards Survey (GLSS)',
      'acronym': 'GLSS',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Economic & Trade Data',
      'sent': 'We combine the DHS \nand GLSS with production data for 17 large-scale gold mines in Ghana.'}]}]},
 {'page': 7, 'dataset_used': False, 'data_mentions': []},
 {'page': 8,
  'dataset_used': False,
  'data_mentions': [{'mentioned_in': '12 currently active mines dominate the sector, and there are an additional five suspended mines \nthat have been in production in recent decades. Table 1 presents a full list of the mines, the year \nthey opened, and their status as of December 2012. Company name and country are for the \nmain shareowner in the mine.',
    'datasets': [{'raw_name': 'Table 1 Gold Mines in Ghana',
      'harmonized_name': 'Gold Mines in Ghana',
      'acronym': None,
      'context': 'background',
      'specificity': 'properly_named',
      'relevance': 'indirectly_relevant',
      'producer': None,
      'data_type': 'Mining Operations Data',
      'sent': 'Table 1 presents a full list of the mines, the year \nthey opened, and their status as of December 2012.'}]},
   {'mentioned_in': 'Most are open-pit mines, although a few consist of a combination of open-pit \nand underground operations. Table 1 Gold Mines in Ghana \nName \nOpening \nyear \nClosing year \nCompany \nCountry \nAhafo \n2006 \nactive \nNewmont Mining Corp. \nUSA \nBibiani \n1998 \nactive \nNoble Mineral Resources \nAustralia \nBogoso Prestea \n1990 \nactive \nGolden Star Resources \nUSA \nChirano \n2005 \nactive \nKinross Gold \nCanada \nDamang \n1997 \nactive \nGold Fields Ghana Ltd. \nSouth Africa \nEdikan (Ayanfuri) \n1994 \nactive \nPerseus Mining \nAustralia \nIduapriem \n1992 \nactive \nAngloGold Ashanti \nSouth Africa \nJeni (Bonte) \n1998 \n2003 \nAkrokeri-Ashanti \nCanada \nKonongo \n1990 \nactive \nLionGold Corp. \nSingapore \nKwabeng \n1990 \n1993 \nAkrokeri-Ashanti \nCanada \nNzema \n2011 \nactive \nEndeavour \nCanada \nObotan \n1997 \n2001 \nPMI Gold \nCanada \nObuasi \n1990 \nactive \nAngloGold Ashanti \nSouth Africa \nPrestea Sankofa \n1990 \n2001 \nAnglogold Ashanti \nSouth Africa \nTarkwa \n1990 \nactive \nGold Fields Ghana Ltd. \nSouth Africa \nTeberebie \n1990 \n2005 \nAnglogold Ashanti \nSouth Africa \nWassa \n1999 \nactive \nGolden Star Resources \nUSA \nSource: InterraRMG 2013. Note: Active is production status as of December 2012, the last available data point.',
    'datasets': [{'raw_name': 'InterraRMG 2013',
      'harmonized_name': 'InterraRMG 2013',
      'acronym': None,
      'context': 'background',
      'specificity': 'properly_named',
      'relevance': 'indirectly_relevant',
      'producer': 'InterraRMG',
      'data_type': 'Mining Data',
      'sent': 'Table 1 Gold Mines in Ghana \nName \nOpening \nyear \nClosing year \nCompany \nCountry \nAhafo \n2006 \nactive \nNewmont Mining Corp. \nUSA \nBibiani \n1998 \nactive \nNoble Mineral Resources \nAustralia \nBogoso Prestea \n1990 \nactive \nGolden Star Resources \nUSA \nChirano \n2005 \nactive \nKinross Gold \nCanada \nDamang \n1997 \nactive \nGold Fields Ghana Ltd. \nSouth Africa \nEdikan (Ayanfuri) \n1994 \nactive \nPerseus Mining \nAustralia \nIduapriem \n1992 \nactive \nAngloGold Ashanti \nSouth Africa \nJeni (Bonte) \n1998 \n2003 \nAkrokeri-Ashanti \nCanada \nKonongo \n1990 \nactive \nLionGold Corp. \nSingapore \nKwabeng \n1990 \n1993 \nAkrokeri-Ashanti \nCanada \nNzema \n2011 \nactive \nEndeavour \nCanada \nObotan \n1997 \n2001 \nPMI Gold \nCanada \nObuasi \n1990 \nactive \nAngloGold Ashanti \nSouth Africa \nPrestea Sankofa \n1990 \n2001 \nAnglogold Ashanti \nSouth Africa \nTarkwa \n1990 \nactive \nGold Fields Ghana Ltd. \nSouth Africa \nTeberebie \n1990 \n2005 \nAnglogold Ashanti \nSouth Africa \nWassa \n1999 \nactive \nGolden Star Resources \nUSA \nSource: InterraRMG 2013.'}]}]},
 {'page': 9,
  'dataset_used': True,
  'data_mentions': [{'mentioned_in': '3 Data \nTo conduct this analysis, we combine different data sources using spatial analysis. The main \nmining data is a dataset from InterraRMG covering all large-scale mines in Ghana, explained \nin more detail in section 3.1. This dataset is linked to survey data from the DHS and GLSS, \nusing spatial information. Geographical coordinates of enumeration areas in GLSS are from \nGhana Statistical Services (GSS).2 Point coordinates (global positioning system [GPS]) for the \nsurveyed DHS clusters3 allow us to match all individuals to one or several mineral mines. We \ndo this in two ways.',
    'datasets': [{'raw_name': 'InterraRMG dataset',
      'harmonized_name': 'InterraRMG dataset',
      'acronym': None,
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Mining Data',
      'sent': 'The main \nmining data is a dataset from InterraRMG covering all large-scale mines in Ghana, explained \nin more detail in section 3.1.'},
     {'raw_name': 'DHS',
      'harmonized_name': 'Demographic and Health Surveys',
      'acronym': 'DHS',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Survey Data',
      'sent': 'This dataset is linked to survey data from the DHS and GLSS, \nusing spatial information.'},
     {'raw_name': 'GLSS',
      'harmonized_name': 'Ghana Living Standards Survey',
      'acronym': 'GLSS',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Survey Data',
      'sent': 'This dataset is linked to survey data from the DHS and GLSS, \nusing spatial information.'},
     {'raw_name': 'Ghana Statistical Services (GSS) geographical coordinates',
      'harmonized_name': 'Ghana Statistical Services geographical coordinates',
      'acronym': None,
      'context': 'primary',
      'specificity': 'descriptive_but_unnamed',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Geospatial Data',
      'sent': 'Geographical coordinates of enumeration areas in GLSS are from \nGhana Statistical Services (GSS).2 Point coordinates (global positioning system [GPS]) for the \nsurveyed DHS clusters3 allow us to match all individuals to one or several mineral mines.'}]}]},
 {'page': 10,
  'dataset_used': True,
  'data_mentions': [{'mentioned_in': '8\xa0\n\xa0\nwe use a cutoff distance of 20 km, we assume there is little economic footprint beyond that \ndistance. Of course, any such distance is arbitrarily chosen, which is why we try different \nspecifications to explore the spatial heterogeneity by varying this distance (using 10 km, 20 km, \nthrough 50 km) as well as a spatial lag structure (using 0 to 10 km, 10 to 20 km, through 40 to \n50 km distance bins).4  \nSecond, we collapse the DHS mining data at the district level.5 The number of districts has \nchanged over time in Ghana, because districts with high population growth have been split into \nsmaller districts. To avoid endogeneity concerns, we use the baseline number of districts that \nexisted at the start of our analysis period, which are 137.',
    'datasets': [{'raw_name': 'DHS mining data',
      'harmonized_name': 'Demographic and Health Surveys (DHS) mining data',
      'acronym': 'DHS',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': 'None',
      'data_type': 'Resource Data',
      'sent': 'Of course, any such distance is arbitrarily chosen, which is why we try different \nspecifications to explore the spatial heterogeneity by varying this distance (using 10 km, 20 km, \nthrough 50 km) as well as a spatial lag structure (using 0 to 10 km, 10 to 20 km, through 40 to \n50 km distance bins).4  \nSecond, we collapse the DHS mining data at the district level.5 The number of districts has \nchanged over time in Ghana, because districts with high population growth have been split into \nsmaller districts.'}]},
   {'mentioned_in': 'Because some mines are close to district boundaries, we additionally test whether there \nis an effect in neighboring districts. 3.1 Resource data \nThe Raw Materials Data are from InterraRMG (2013). The data set contains information on \npast or current industrial mines.',
    'datasets': [{'raw_name': 'Raw Materials Data',
      'harmonized_name': 'Raw Materials Data from InterraRMG',
      'acronym': 'None',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': 'InterraRMG',
      'data_type': 'Resource Data',
      'sent': '3.1 Resource data \nThe Raw Materials Data are from InterraRMG (2013).'}]},
   {'mentioned_in': 'All mines have information on annual production volumes, \nownership structure, and GPS coordinates on location. We complete this data with exact \ngeographic location data from MineAtlas (2013), where satellite imagery shows the actual mine \nboundaries, which allows us to identify and update the center point of each mine. The \nproduction data and ownership information are double-checked against the companies’ annual \nreports.',
    'datasets': [{'raw_name': 'MineAtlas data',
      'harmonized_name': 'MineAtlas geographic location data',
      'acronym': 'None',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': 'MineAtlas',
      'data_type': 'Geospatial Data',
      'sent': 'We complete this data with exact \ngeographic location data from MineAtlas (2013), where satellite imagery shows the actual mine \nboundaries, which allows us to identify and update the center point of each mine.'}]}]},
 {'page': 11,
  'dataset_used': True,
  'data_mentions': [{'mentioned_in': 'Road data is an alternative way of defining \ndistance from mines, but time series data on roads is not available. 3.2 Household data \nWe use microdata from the DHS, obtained from standardized surveys across years and \ncountries. We combine the respondents from all four DHS standard surveys in Ghana for which \nthere are geographic identifiers.',
    'datasets': [{'raw_name': 'DHS',
      'harmonized_name': 'Demographic and Health Surveys',
      'acronym': 'DHS',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Surveys & Census Data',
      'sent': '3.2 Household data \nWe use microdata from the DHS, obtained from standardized surveys across years and \ncountries.'}]},
   {'mentioned_in': 'See Appendix table 1 for definition of \noutcome variables. We complement the analysis with household data from the GLSS collected in the years—1998–\n99, 2004–05, and 2012–13. These data are a good complement to the DHS data, because they \n\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\xa0\n6 The first mines were opened in 1990, prior to the first household survey.',
    'datasets': [{'raw_name': 'GLSS',
      'harmonized_name': 'Ghana Living Standards Survey',
      'acronym': 'GLSS',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Surveys & Census Data',
      'sent': 'We complement the analysis with household data from the GLSS collected in the years—1998–\n99, 2004–05, and 2012–13.'}]}]},
 {'page': 12,
  'dataset_used': True,
  'data_mentions': [{'mentioned_in': 'In addition, they provide more detailed information on labor market \nparticipation, such as exact profession (where, for example, being a miner is a possible \noutcome), hours worked, and a wage indicator. The data estimate household expenditure and \nhousehold income. Wages, income, and expenditure can, however, be difficult to measure in \neconomies where nonmonetary compensation for labor and subsistence farming are common \npractices. 4 Empirical Strategies \n4.1 Individual-level difference-in-differences  \nTime-varying data on production and repeated survey data allow us to use a difference-in-\ndifferences approach.7 However, due to the spatial nature of our data and the fact that some \nmines are spatially clustered, we use a strategy developed by Benshaul-Tolonen (2018). The \ndifference-in-difference model compares the treatment group (close to mines) before and after \nthe mine opening, while removing the change that happens in the control group (far away from \nmines) over time under the assumption that such changes reflect underlying temporal variation \ncommon to both treatment and control areas.',
    'datasets': [{'raw_name': 'household expenditure and household income',
      'harmonized_name': None,
      'acronym': None,
      'context': 'primary',
      'specificity': 'descriptive_but_unnamed',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Economic Data',
      'sent': 'The data estimate household expenditure and \nhousehold income.'},
     {'raw_name': 'data on production',
      'harmonized_name': None,
      'acronym': None,
      'context': 'primary',
      'specificity': 'descriptive_but_unnamed',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Economic Data',
      'sent': '4 Empirical Strategies \n4.1 Individual-level difference-in-differences  \nTime-varying data on production and repeated survey data allow us to use a difference-in-\ndifferences approach.7 However, due to the spatial nature of our data and the fact that some \nmines are spatially clustered, we use a strategy developed by Benshaul-Tolonen (2018).'},
     {'raw_name': 'survey data',
      'harmonized_name': None,
      'acronym': None,
      'context': 'primary',
      'specificity': 'descriptive_but_unnamed',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Survey Data',
      'sent': '4 Empirical Strategies \n4.1 Individual-level difference-in-differences  \nTime-varying data on production and repeated survey data allow us to use a difference-in-\ndifferences approach.7 However, due to the spatial nature of our data and the fact that some \nmines are spatially clustered, we use a strategy developed by Benshaul-Tolonen (2018).'}]}]},
 {'page': 13,
  'dataset_used': True,
  'data_mentions': [{'mentioned_in': 'Moreover, cluster \nfixed effects are not possible because of clusters are not repeatedly sampled over time. However, since the estimation is at individual level, all standard errors are clustered at the DHS \ncluster level. The sample is restricted to individuals living within 100 km of a deposit location (mine), so \nmany parts of Northern Ghana where there are few gold mines are not included in the analysis.',
    'datasets': [{'raw_name': 'DHS',
      'harmonized_name': 'Demographic and Health Surveys',
      'acronym': 'DHS',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Surveys & Census Data',
      'sent': 'However, since the estimation is at individual level, all standard errors are clustered at the DHS \ncluster level.'}]}]},
 {'page': 15, 'dataset_used': False, 'data_mentions': []},
 {'page': 17,
  'dataset_used': True,
  'data_mentions': [{'mentioned_in': 'The baseline differences in observable characteristics – in particular, \nlower levels of economic development preceding the mine opening - indicate that a cross-\nsectional approach using only the post-period may not be sufficient to understand the impact of \ngold mining on socio-economic variables. Table 2 Summary statistics for women’s survey  \n \n(1) \n(2) \n \n(3)                 (4)  \n \nBefore mining \n  \n \nDuring Mining \n \n>20 km \n<20 km \n \n>20 km \n<20 km \n \nMean \nCoefficient  \nMean \nCoefficient \n \n \n \n \n \n \nWoman Characteristics \n \n \n \n \nAge \n28.79 \n0.836 \n \n28.95 \n-0.352 \nTotal children \n2.18 \n0.417* \n \n2.56 \n-0.035 \nWealth \n3.85 \n-0.619** \n \n3.33 \n-0.028 \nNonmigrant \n0.32 \n0.123** \n \n0.33 \n-0.028 \nUrban \n0.62 \n-0.300** \n \n0.49 \n-0.150** \nNo education \n0.17 \n-0.045 \n \n0.20 \n-0.042** \n<3 years education \n0.77 \n0.035 \n \n0.74 \n0.045**',
    'datasets': [{'raw_name': 'women’s survey',
      'harmonized_name': "Women's Survey Data",
      'acronym': None,
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Survey Data',
      'sent': 'Table 2 Summary statistics for women’s survey  \n \n(1) \n(2) \n \n(3)                 (4)  \n \nBefore mining \n  \n \nDuring Mining \n \n>20 km \n<20 km \n \n>20 km \n<20 km \n \nMean \nCoefficient  \nMean \nCoefficient \n \n \n \n \n \n \nWoman Characteristics \n \n \n \n \nAge \n28.79 \n0.836 \n \n28.95 \n-0.352 \nTotal children \n2.18 \n0.417* \n \n2.56 \n-0.035 \nWealth \n3.85 \n-0.619** \n \n3.33 \n-0.028 \nNonmigrant \n0.32 \n0.123** \n \n0.33 \n-0.028 \nUrban \n0.62 \n-0.300** \n \n0.49 \n-0.150** \nNo education \n0.17 \n-0.045 \n \n0.20 \n-0.042** \n<3 years education \n0.77 \n0.035 \n \n0.74 \n0.045**'}]}]},
 {'page': 18,
  'dataset_used': True,
  'data_mentions': [{'mentioned_in': 'To test for exogeneity, we run regressions using baseline individual-level data to explore \nchanges in observable characteristics among women (the main part of the sample). Table 3 \nshows that there are no significant effects of the mine opening on the age structure, migration \nhistory, marital status, fertility, or education, using the difference-in-difference specification \nwith a full set of controls. If anything, it seems that women in active mining communities are \nmarginally older, more likely to never have moved, and more likely to be or have been in a \ncohabiting relationship or married.',
    'datasets': [{'raw_name': 'DHS individual data',
      'harmonized_name': 'Demographic and Health Surveys (DHS)',
      'acronym': 'DHS',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Surveys & Census Data',
      'sent': 'Table 3 \nshows that there are no significant effects of the mine opening on the age structure, migration \nhistory, marital status, fertility, or education, using the difference-in-difference specification \nwith a full set of controls.'}]}]},
 {'page': 19,
  'dataset_used': True,
  'data_mentions': [{'mentioned_in': 'There is no change in the likelihood that she is not working. These 5 categories \nstem from the same occupational variable in the DHS data, and are mutually exclusive. The \nsurveyed individual is told to report their main occupation.',
    'datasets': [{'raw_name': 'DHS data',
      'harmonized_name': 'Demographic and Health Surveys (DHS)',
      'acronym': 'DHS',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Surveys & Census Data',
      'sent': 'These 5 categories \nstem from the same occupational variable in the DHS data, and are mutually exclusive.'}]}]},
 {'page': 21,
  'dataset_used': True,
  'data_mentions': [{'mentioned_in': 'Splitting the sample by gender, we note that this decrease is \nonly statistically significant for boys at an effect size of 6.6 percentage points. Table 5 OLS estimates of birth outcomes, infant survival, and child health in the DHS individual-\nlevel analysis \n \nPANEL A                                  size at birth                        infant mortality (<12months)                  antenatal visits \n \nsmall \naverage \nlarge \n  \nall \nboys \ngirls \n  \n# visits \nat least 1 \nactive*mine \n0.022 \n0.053 \n-0.075* \n \n-0.041* \n-0.066** \n-0.020 \n \n-0.151 \n-0.007 \n \n(0.028) \n(0.041) \n(0.041) \n \n(0.022) \n(0.030) \n(0.035) \n \n(0.331) \n(0.028) \nmine \n-0.010 \n0.071** \n-0.061** \n \n0.004 \n0.008 \n0.001 \n \n0.153 \n0.000 \n \n(0.019) \n(0.028) \n(0.030) \n \n(0.015) \n(0.020) \n(0.024) \n \n(0.241) \n(0.019) \nactive \n-0.010 \n0.054** \n-0.044 \n \n0.002 \n0.014 \n-0.012 \n \n0.012 \n0.002 \n \n(0.016) \n(0.026) \n(0.027) \n \n(0.014) \n(0.022) \n(0.018) \n \n(0.209) \n(0.012) \n \n \n \n \n \n \n \n \n \n \n \nObservations \n6,771 \n6,771 \n6,771 \n \n5,356 \n2,718 \n2,638 \n \n5,704 \n5,704 \nR-squared \n0.031 \n0.054 \n0.059 \n \n0.135 \n0.160 \n0.152 \n \n0.186 \n0.062 \nMean of dep var. 
0.136 \n0.359 \n0.505 \n \n0.073 \n0.08 \n0.066 \n \n5.79 \n0.941 \n \n \n \n \n \n \n \n \n \n \n \nPANEL B \nin the last 2 weeks, had:         \nfever          cough        diarrhea \n  \nanthropometrics (WHO) in sd \nht/a              wt/a            wt/ht \n  \nhas                             \nhealth card \nactive*mine \n-0.035 \n-0.061* \n0.042 \n \n-3.532 \n-5.208 \n-0.641 \n \n0.014 \n  \n \n(0.037) \n(0.033) \n(0.027) \n \n(11.472) \n(9.283) \n(8.948) \n \n(0.027) \n \nmine \n-0.002 \n-0.006 \n-0.038 \n \n-0.828 \n3.481 \n3.853 \n \n-0.006 \n \n \n(0.031) \n(0.028) \n(0.024) \n \n(10.385) \n(8.574) \n(7.468) \n \n(0.022) \n \nactive \n0.023 \n-0.003 \n-0.033** \n \n-1.904 \n5.265 \n9.433* \n \n0.009 \n \n \n(0.020) \n(0.020) \n(0.016) \n \n(5.942) \n(5.304) \n(5.183) \n \n(0.012) \n \n \n \n \n \n \n \n \n \n \n \n \nObservations \n6,246 \n6,257 \n6,262 \n \n5,627 \n5,627 \n5,727 \n \n6,378 \n \nR-squared \n0.024 \n0.043 \n0.024 \n \n0.136 \n0.080 \n0.036 \n \n0.084 \n \nMean of dep var.',
    'datasets': [{'raw_name': 'DHS individual-level analysis',
      'harmonized_name': 'Demographic and Health Surveys (DHS)',
      'acronym': 'DHS',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Health & Public Safety Data',
      'sent': 'Table 5 OLS estimates of birth outcomes, infant survival, and child health in the DHS individual-\nlevel analysis \n \nPANEL A                                  size at birth                        infant mortality (<12months)                  antenatal visits \n \nsmall \naverage \nlarge \n  \nall \nboys \ngirls \n  \n# visits \nat least 1 \nactive*mine \n0.022 \n0.053 \n-0.075* \n \n-0.041* \n-0.066** \n-0.020 \n \n-0.151 \n-0.007 \n \n(0.028) \n(0.041) \n(0.041) \n \n(0.022) \n(0.030) \n(0.035) \n \n(0.331) \n(0.028) \nmine \n-0.010 \n0.071** \n-0.061** \n \n0.004 \n0.008 \n0.001 \n \n0.153 \n0.000 \n \n(0.019) \n(0.028) \n(0.030) \n \n(0.015) \n(0.020) \n(0.024) \n \n(0.241) \n(0.019) \nactive \n-0.010 \n0.054** \n-0.044 \n \n0.002 \n0.014 \n-0.012 \n \n0.012 \n0.002 \n \n(0.016) \n(0.026) \n(0.027) \n \n(0.014) \n(0.022) \n(0.018) \n \n(0.209) \n(0.012) \n \n \n \n \n \n \n \n \n \n \n \nObservations \n6,771 \n6,771 \n6,771 \n \n5,356 \n2,718 \n2,638 \n \n5,704 \n5,704 \nR-squared \n0.031 \n0.054 \n0.059 \n \n0.135 \n0.160 \n0.152 \n \n0.186 \n0.062 \nMean of dep var.'}]}]},
 {'page': 27,
  'dataset_used': True,
  'data_mentions': [{'mentioned_in': '25\xa0\n\xa0\n \nNote: Figure 5 shows the main treatment coefficients (active*mine) using the baseline estimation strategy (with \nDHS individual-level data; see table 4 for more information) in the top panel, but with different cutoffs (10 km, \n20 km, 30 km, 40 km, and 50 km). *** p<0.01, **p<0.05, *p<0.1.',
    'datasets': [{'raw_name': 'DHS individual-level data',
      'harmonized_name': 'Demographic and Health Surveys (DHS)',
      'acronym': 'DHS',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Surveys & Census Data',
      'sent': '25\xa0\n\xa0\n \nNote: Figure 5 shows the main treatment coefficients (active*mine) using the baseline estimation strategy (with \nDHS individual-level data; see table 4 for more information) in the top panel, but with different cutoffs (10 km, \n20 km, 30 km, 40 km, and 50 km).'}]}]},
 {'page': 29,
  'dataset_used': True,
  'data_mentions': [{'mentioned_in': 'OLS = ordinary least squares. 6.4 Bottom 40% of the population \nTo understand the welfare effects of the bottom 40 percent of the population in the income \nscale, we split the sample according to the wealth score provided by DHS. Given the data \nstructure, which is repeated cross-section, we cannot follow a particular household that was \nidentified as belonging to the bottom 40 percent in the initial time period.',
    'datasets': [{'raw_name': 'DHS',
      'harmonized_name': 'Demographic and Health Surveys',
      'acronym': 'DHS',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Surveys & Census Data',
      'sent': '6.4 Bottom 40% of the population \nTo understand the welfare effects of the bottom 40 percent of the population in the income \nscale, we split the sample according to the wealth score provided by DHS.'}]}]},
 {'page': 32,
  'dataset_used': True,
  'data_mentions': [{'mentioned_in': '30\xa0\n\xa0\nTable 11 Using GLSS: Employment on extensive and intensive margin and wages \n  \n(1) \nworked \nlast  year \n(2) \nwork 7 \ndays \n(3) \nhours \nworked \nper week \n(4) \nagri- \nculture \n(5) \nservice \nand sales \n(6) \nminer \nPanel A: Women  \n  \n  \n  \n  \n  \n  \n1. baseline  \n \n \n \n \n \n \nactive*mine \n-0.067* \n-0.032 \n3.565 \n-0.075 \n0.074 \n0.025 \n \n(0.040) \n(0.038) \n(3.140) \n(0.064) \n(0.054) \n(0.016) \n2. drop 20-40 km \n \n \n \n \n \n \nactive*mine \n-0.062 \n-0.039 \n3.849 \n-0.076 \n0.094* \n0.026* \n \n(0.040) \n(0.039) \n(3.359) \n(0.064) \n(0.057) \n(0.015) \n3. drop 2 years before \n \n \n \n \n \n \nactive*mine \n-0.067* \n-0.031 \n3.565 \n-0.087 \n0.080 \n0.024 \n \n(0.040) \n(0.038) \n(3.140) \n(0.065) \n(0.055) \n(0.016) \n4. mine FE \n \n \n \n \n \n \nactive*mine \n-0.067 \n-0.012 \n8.560* \n-0.084 \n0.104 \n0.025* \n \n(0.051) \n(0.048) \n(5.125) \n(0.075) \n(0.065) \n(0.015) \n5. mine clustering \n \n \n \n \n \n \nactive*mine \n-0.067* \n-0.032 \n3.565 \n-0.075 \n0.074 \n0.025 \n \n(0.032) \n(0.036) \n(3.521) \n(0.081) \n(0.080) \n(0.022) \n \n \n \n \n \n \n \nMean dep var. 0.727 \n0.673 \n40.39 \n42.32 \n0.391 \n0.005 \nPanel B: Men \n  \n  \n  \n  \n  \n  \n1. baseline  \n-0.086** \n-0.055 \n3.705 \n-0.058 \n-0.032 \n0.125*** \nactive*mine \n(0.041) \n(0.039) \n(3.460) \n(0.066) \n(0.036) \n(0.043) \n \n \n \n \n \n \n \n2. drop 20-40 km \n \n \n \n \n \n \nactive*mine \n-0.094** \n-0.062 \n3.893 \n-0.064 \n-0.031 \n0.126*** \n \n(0.042) \n(0.040) \n(3.842) \n(0.066) \n(0.038) \n(0.042) \n3. drop 2 years before \n \n \n \n \n \n \nactive*mine \n-0.094** \n-0.062 \n3.708 \n-0.071 \n-0.026 \n0.125*** \n \n(0.041) \n(0.039) \n(3.459) \n(0.067) \n(0.036) \n(0.043) \n4. mine FE \n \n \n \n \n \n \nactive*mine \n-0.123** \n-0.094* \n8.233 \n-0.068 \n-0.049 \n0.113** \n \n(0.057) \n(0.051) \n(5.425) \n(0.075) \n(0.044) \n(0.045) \n5. 
mine clustering \n \n \n \n \n \n \nactive*mine \n-0.086*** \n-0.055** \n3.705 \n-0.058 \n-0.032 \n0.125** \n \n(0.025) \n(0.025) \n(2.898) \n(0.086) \n(0.032) \n(0.051) \n \n \n \n \n \n \n \nMean dep var \n0.715 \n0.705 \n45.71 \n0.491 \n0.259 \n0.028 \n \nNote: The table uses GLSS data for Ghana for the survey years 1998, 2005, 2012. The sample is restricted to \nwomen and men aged 15–49.',
    'datasets': [{'raw_name': 'GLSS data',
      'harmonized_name': 'Ghana Living Standards Survey (GLSS)',
      'acronym': 'GLSS',
      'context': 'primary',
      'specificity': 'properly_named',
      'relevance': 'directly_relevant',
      'producer': 'None',
      'data_type': 'Surveys & Census Data',
      'sent': '0.727 \n0.673 \n40.39 \n42.32 \n0.391 \n0.005 \nPanel B: Men \n  \n  \n  \n  \n  \n  \n1. baseline  \n-0.086** \n-0.055 \n3.705 \n-0.058 \n-0.032 \n0.125*** \nactive*mine \n(0.041) \n(0.039) \n(3.460) \n(0.066) \n(0.036) \n(0.043) \n \n \n \n \n \n \n \n2. drop 20-40 km \n \n \n \n \n \n \nactive*mine \n-0.094** \n-0.062 \n3.893 \n-0.064 \n-0.031 \n0.126*** \n \n(0.042) \n(0.040) \n(3.842) \n(0.066) \n(0.038) \n(0.042) \n3. drop 2 years before \n \n \n \n \n \n \nactive*mine \n-0.094** \n-0.062 \n3.708 \n-0.071 \n-0.026 \n0.125*** \n \n(0.041) \n(0.039) \n(3.459) \n(0.067) \n(0.036) \n(0.043) \n4. mine FE \n \n \n \n \n \n \nactive*mine \n-0.123** \n-0.094* \n8.233 \n-0.068 \n-0.049 \n0.113** \n \n(0.057) \n(0.051) \n(5.425) \n(0.075) \n(0.044) \n(0.045) \n5. mine clustering \n \n \n \n \n \n \nactive*mine \n-0.086*** \n-0.055** \n3.705 \n-0.058 \n-0.032 \n0.125** \n \n(0.025) \n(0.025) \n(2.898) \n(0.086) \n(0.032) \n(0.051) \n \n \n \n \n \n \n \nMean dep var \n0.715 \n0.705 \n45.71 \n0.491 \n0.259 \n0.028 \n \nNote: The table uses GLSS data for Ghana for the survey years 1998, 2005, 2012.'}]}]},
 {'page': 37,
  'dataset_used': True,
  'data_mentions': [{'mentioned_in': 'Natural resource extraction is often argued to have detrimental effects on countries, \nhowever, and the so-called natural resource curse may imply that resource wealth is harmful to \nsocial development and inclusive growth. We use rich geocoded data with information on \nhouseholds and mining production over time to evaluate the gold boom at the local and district \nlevels in difference-in-differences analyses. Men benefit from direct job creation within the mining sector, and women seem to benefit from \nindirectly generated jobs in the service sector (statistically significant within 10 km from a \nmine).',
    'datasets': [{'raw_name': 'geocoded data with information on households and mining production over time',
      'harmonized_name': None,
      'acronym': None,
      'context': 'primary',
      'specificity': 'descriptive_but_unnamed',
      'relevance': 'directly_relevant',
      'producer': None,
      'data_type': 'Geospatial & Economic Data',
      'sent': 'We use rich geocoded data with information on \nhouseholds and mining production over time to evaluate the gold boom at the local and district \nlevels in difference-in-differences analyses.'}]}]},
 {'page': 39, 'dataset_used': False, 'data_mentions': []}]

# Uncomment to run the LLM-as-a-judge validation on the extracted mentions
# process_judge_validation(input_data)

print("done!")

done!
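The workflow keeps only dataset mentions that the judge marked `valid: true`. The sketch below shows one way to apply that filter to a validated record of the shape printed in this notebook (`source` → `pages` → `data_mentions` → `datasets`). The helper names `normalize_nulls` and `keep_valid_mentions`, the `NULL_STRINGS` set, and the `sample` record are hypothetical and are not part of the notebook's pipeline. Two choices here are assumptions rather than the author's method: placeholder strings such as `"None"` are converted to real nulls, and an entry with no `valid` key is treated as not validated and dropped.

```python
import json
from typing import Any

# Assumption: these strings are placeholders for a missing value.
NULL_STRINGS = {"none", "null", "n/a", ""}


def normalize_nulls(entry: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `entry` with placeholder strings replaced by None."""
    cleaned = {}
    for key, value in entry.items():
        if isinstance(value, str) and value.strip().lower() in NULL_STRINGS:
            cleaned[key] = None
        else:
            cleaned[key] = value
    return cleaned


def keep_valid_mentions(validated: dict[str, Any]) -> dict[str, Any]:
    """Keep only datasets whose `valid` flag is exactly True.

    Entries without a `valid` key are dropped as well (assumption).
    """
    pages = []
    for page in validated.get("pages", []):
        mentions = []
        for mention in page.get("data_mentions", []):
            datasets = [
                normalize_nulls(d)
                for d in mention.get("datasets", [])
                if d.get("valid") is True
            ]
            if datasets:  # skip mentions left with no valid dataset
                mentions.append({**mention, "datasets": datasets})
        pages.append(
            {
                "page": page["page"],
                # Assumption: recompute the flag from what survived the filter.
                "dataset_used": bool(mentions),
                "data_mentions": mentions,
            }
        )
    return {"source": validated.get("source"), "pages": pages}


# Hypothetical record that mimics the structure of the judge output.
sample = {
    "source": "example-paper",
    "pages": [
        {
            "page": 1,
            "dataset_used": True,
            "data_mentions": [
                {
                    "mentioned_in": "Example passage.",
                    "datasets": [
                        {"raw_name": "GLSS data", "producer": "None", "valid": True},
                        {"raw_name": "survey data", "producer": None, "valid": False},
                        {"raw_name": "DHS", "producer": None},  # no `valid` key
                    ],
                }
            ],
        }
    ],
}

filtered = keep_valid_mentions(sample)
print(json.dumps(filtered, ensure_ascii=False, indent=2))
```

Dropping entries that lack a `valid` flag is the conservative option. Depending on the pipeline, it may be preferable to send such entries back through the judge rather than discard them.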

# Inspect the output of the validation step

with open(
    "output/llm_judge_validation/The-local-socioeconomic-effects-of-gold-mining-evidence-from-Ghana.json",
    "r",
    encoding="utf-8",
) as f:
    extracted_data = json.load(f)

print(json.dumps(extracted_data, ensure_ascii=False, indent=2))

{
  "source": "The-local-socioeconomic-effects-of-gold-mining-evidence-from-Ghana",
  "pages": [
    {
      "page": 4,
      "dataset_used": true,
      "data_mentions": [
        {
          "mentioned_in": "We also allow for spillovers across \ndistricts, in a district-level analysis. We use two complementary geocoded household data sets \nto analyze outcomes in Ghana: the Demographic and Health Survey (DHS) and the Ghana \nLiving Standard Survey (GLSS), which provide information on a wide range of welfare \noutcomes. The paper contributes to the growing literature on the local effects of mining.",
          "datasets": [
            {
              "raw_name": "Demographic and Health Survey (DHS)",
              "harmonized_name": "Demographic and Health Survey (DHS)",
              "acronym": "DHS",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": null,
              "data_type": "Surveys & Census Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "We use two complementary geocoded household data sets \nto analyze outcomes in Ghana: the Demographic and Health Survey (DHS) and the Ghana \nLiving Standard Survey (GLSS), which provide information on a wide range of welfare \noutcomes."
            },
            {
              "raw_name": "Ghana Living Standard Survey (GLSS)",
              "harmonized_name": "Ghana Living Standard Survey (GLSS)",
              "acronym": "GLSS",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": null,
              "data_type": "Surveys & Census Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "We use two complementary geocoded household data sets \nto analyze outcomes in Ghana: the Demographic and Health Survey (DHS) and the Ghana \nLiving Standard Survey (GLSS), which provide information on a wide range of welfare \noutcomes."
            }
          ]
        }
      ]
    },
    {
      "page": 5,
      "dataset_used": true,
      "data_mentions": [
        {
          "mentioned_in": "Mining is also associated with more economic \nactivity measured by nightlights (Benshaul-Tolonen, 2019; Mamo et al, 2019). Kotsadam and Tolonen (2016) use DHS data from Africa, and find that mine openings cause \nwomen to shift from agriculture to service production and that women become more likely to \nwork for cash and year-round as opposed to seasonally. Continuing this analysis, Benshaul-\nTolonen (2018) explores the links between mining and female empowerment in eight gold-\nproducing countries in East and West Africa, including Ghana.",
          "datasets": [
            {
              "raw_name": "DHS data",
              "harmonized_name": "Demographic and Health Surveys (DHS)",
              "acronym": "DHS",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": null,
              "data_type": "Health & Public Safety Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "Kotsadam and Tolonen (2016) use DHS data from Africa, and find that mine openings cause \nwomen to shift from agriculture to service production and that women become more likely to \nwork for cash and year-round as opposed to seasonally."
            }
          ]
        },
        {
          "mentioned_in": "We explore the effects of mining activity on employment, earnings, expenditure, and children\u0019s \nhealth outcomes in local communities and in districts with gold mining. We combine the DHS \nand GLSS with production data for 17 large-scale gold mines in Ghana. We find that a new \nlarge-scale gold mine changes economic outcomes, such as access to employment and cash \nearnings.",
          "datasets": [
            {
              "raw_name": "GLSS",
              "harmonized_name": "Ghana Living Standards Survey (GLSS)",
              "acronym": "GLSS",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": null,
              "data_type": "Economic & Trade Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "We combine the DHS \nand GLSS with production data for 17 large-scale gold mines in Ghana."
            }
          ]
        }
      ]
    },
    {
      "page": 7,
      "dataset_used": false,
      "data_mentions": []
    },
    {
      "page": 8,
      "dataset_used": true,
      "data_mentions": [
        {
          "mentioned_in": "12 currently active mines dominate the sector, and there are an additional five suspended mines \nthat have been in production in recent decades. Table 1 presents a full list of the mines, the year \nthey opened, and their status as of December 2012. Company name and country are for the \nmain shareowner in the mine.",
          "datasets": [
            {
              "raw_name": "Table 1 Gold Mines in Ghana",
              "harmonized_name": "Gold Mines in Ghana",
              "acronym": null,
              "context": "background",
              "specificity": "properly_named",
              "relevance": "indirectly_relevant",
              "producer": null,
              "data_type": "Mining Operations Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "Table 1 presents a full list of the mines, the year \nthey opened, and their status as of December 2012."
            }
          ]
        },
        {
          "mentioned_in": "Most are open-pit mines, although a few consist of a combination of open-pit \nand underground operations. Table 1 Gold Mines in Ghana \nName \nOpening \nyear \nClosing year \nCompany \nCountry \nAhafo \n2006 \nactive \nNewmont Mining Corp. \nUSA \nBibiani \n1998 \nactive \nNoble Mineral Resources \nAustralia \nBogoso Prestea \n1990 \nactive \nGolden Star Resources \nUSA \nChirano \n2005 \nactive \nKinross Gold \nCanada \nDamang \n1997 \nactive \nGold Fields Ghana Ltd. \nSouth Africa \nEdikan (Ayanfuri) \n1994 \nactive \nPerseus Mining \nAustralia \nIduapriem \n1992 \nactive \nAngloGold Ashanti \nSouth Africa \nJeni (Bonte) \n1998 \n2003 \nAkrokeri-Ashanti \nCanada \nKonongo \n1990 \nactive \nLionGold Corp. \nSingapore \nKwabeng \n1990 \n1993 \nAkrokeri-Ashanti \nCanada \nNzema \n2011 \nactive \nEndeavour \nCanada \nObotan \n1997 \n2001 \nPMI Gold \nCanada \nObuasi \n1990 \nactive \nAngloGold Ashanti \nSouth Africa \nPrestea Sankofa \n1990 \n2001 \nAnglogold Ashanti \nSouth Africa \nTarkwa \n1990 \nactive \nGold Fields Ghana Ltd. \nSouth Africa \nTeberebie \n1990 \n2005 \nAnglogold Ashanti \nSouth Africa \nWassa \n1999 \nactive \nGolden Star Resources \nUSA \nSource: InterraRMG 2013. Note: Active is production status as of December 2012, the last available data point.",
          "datasets": [
            {
              "raw_name": "InterraRMG 2013",
              "harmonized_name": "InterraRMG 2013",
              "acronym": null,
              "context": "background",
              "specificity": "properly_named",
              "relevance": "indirectly_relevant",
              "producer": "InterraRMG",
              "data_type": "Mining Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "Table 1 Gold Mines in Ghana \nName \nOpening \nyear \nClosing year \nCompany \nCountry \nAhafo \n2006 \nactive \nNewmont Mining Corp. \nUSA \nBibiani \n1998 \nactive \nNoble Mineral Resources \nAustralia \nBogoso Prestea \n1990 \nactive \nGolden Star Resources \nUSA \nChirano \n2005 \nactive \nKinross Gold \nCanada \nDamang \n1997 \nactive \nGold Fields Ghana Ltd. \nSouth Africa \nEdikan (Ayanfuri) \n1994 \nactive \nPerseus Mining \nAustralia \nIduapriem \n1992 \nactive \nAngloGold Ashanti \nSouth Africa \nJeni (Bonte) \n1998 \n2003 \nAkrokeri-Ashanti \nCanada \nKonongo \n1990 \nactive \nLionGold Corp. \nSingapore \nKwabeng \n1990 \n1993 \nAkrokeri-Ashanti \nCanada \nNzema \n2011 \nactive \nEndeavour \nCanada \nObotan \n1997 \n2001 \nPMI Gold \nCanada \nObuasi \n1990 \nactive \nAngloGold Ashanti \nSouth Africa \nPrestea Sankofa \n1990 \n2001 \nAnglogold Ashanti \nSouth Africa \nTarkwa \n1990 \nactive \nGold Fields Ghana Ltd. \nSouth Africa \nTeberebie \n1990 \n2005 \nAnglogold Ashanti \nSouth Africa \nWassa \n1999 \nactive \nGolden Star Resources \nUSA \nSource: InterraRMG 2013."
            }
          ]
        }
      ]
    },
    {
      "page": 9,
      "dataset_used": true,
      "data_mentions": [
        {
          "mentioned_in": "3 Data \nTo conduct this analysis, we combine different data sources using spatial analysis. The main \nmining data is a dataset from InterraRMG covering all large-scale mines in Ghana, explained \nin more detail in section 3.1. This dataset is linked to survey data from the DHS and GLSS, \nusing spatial information. Geographical coordinates of enumeration areas in GLSS are from \nGhana Statistical Services (GSS).2 Point coordinates (global positioning system [GPS]) for the \nsurveyed DHS clusters3 allow us to match all individuals to one or several mineral mines. We \ndo this in two ways.",
          "datasets": [
            {
              "raw_name": "InterraRMG dataset",
              "harmonized_name": "InterraRMG dataset",
              "acronym": null,
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": "InterraRMG",
              "data_type": "Mining Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "The main \nmining data is a dataset from InterraRMG covering all large-scale mines in Ghana, explained \nin more detail in section 3.1."
            },
            {
              "raw_name": "DHS",
              "harmonized_name": "Demographic and Health Surveys",
              "acronym": "DHS",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": "DHS Program",
              "data_type": "Survey Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "This dataset is linked to survey data from the DHS and GLSS, \nusing spatial information."
            },
            {
              "raw_name": "GLSS",
              "harmonized_name": "Ghana Living Standards Survey",
              "acronym": "GLSS",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": "Ghana Statistical Service",
              "data_type": "Survey Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "This dataset is linked to survey data from the DHS and GLSS, \nusing spatial information."
            },
            {
              "raw_name": "Ghana Statistical Services (GSS) geographical coordinates",
              "harmonized_name": "Ghana Statistical Services geographical coordinates",
              "acronym": null,
              "context": "primary",
              "specificity": "descriptive_but_unnamed",
              "relevance": "directly_relevant",
              "producer": "Ghana Statistical Service",
              "data_type": "Geospatial Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "Geographical coordinates of enumeration areas in GLSS are from \nGhana Statistical Services (GSS).2 Point coordinates (global positioning system [GPS]) for the \nsurveyed DHS clusters3 allow us to match all individuals to one or several mineral mines."
            }
          ]
        }
      ]
    },
    {
      "page": 10,
      "dataset_used": true,
      "data_mentions": [
        {
          "mentioned_in": "8\n0\n\n0\nwe use a cutoff distance of 20 km, we assume there is little economic footprint beyond that \ndistance. Of course, any such distance is arbitrarily chosen, which is why we try different \nspecifications to explore the spatial heterogeneity by varying this distance (using 10 km, 20 km, \nthrough 50 km) as well as a spatial lag structure (using 0 to 10 km, 10 to 20 km, through 40 to \n50 km distance bins).4  \nSecond, we collapse the DHS mining data at the district level.5 The number of districts has \nchanged over time in Ghana, because districts with high population growth have been split into \nsmaller districts. To avoid endogeneity concerns, we use the baseline number of districts that \nexisted at the start of our analysis period, which are 137.",
          "datasets": [
            {
              "raw_name": "DHS mining data",
              "harmonized_name": "Demographic and Health Surveys (DHS) mining data",
              "acronym": "DHS",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": "None",
              "data_type": "Resource Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "Of course, any such distance is arbitrarily chosen, which is why we try different \nspecifications to explore the spatial heterogeneity by varying this distance (using 10 km, 20 km, \nthrough 50 km) as well as a spatial lag structure (using 0 to 10 km, 10 to 20 km, through 40 to \n50 km distance bins).4  \nSecond, we collapse the DHS mining data at the district level.5 The number of districts has \nchanged over time in Ghana, because districts with high population growth have been split into \nsmaller districts."
            }
          ]
        },
        {
          "mentioned_in": "Because some mines are close to district boundaries, we additionally test whether there \nis an effect in neighboring districts. 3.1 Resource data \nThe Raw Materials Data are from InterraRMG (2013). The data set contains information on \npast or current industrial mines.",
          "datasets": [
            {
              "raw_name": "Raw Materials Data",
              "harmonized_name": "Raw Materials Data from InterraRMG",
              "acronym": "None",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": "InterraRMG",
              "data_type": "Resource Data",
              "year": "2013",
              "valid": true,
              "invalid_reason": null,
              "sent": "3.1 Resource data \nThe Raw Materials Data are from InterraRMG (2013)."
            }
          ]
        },
        {
          "mentioned_in": "All mines have information on annual production volumes, \nownership structure, and GPS coordinates on location. We complete this data with exact \ngeographic location data from MineAtlas (2013), where satellite imagery shows the actual mine \nboundaries, which allows us to identify and update the center point of each mine. The \nproduction data and ownership information are double-checked against the companies\u0019 annual \nreports.",
          "datasets": [
            {
              "raw_name": "MineAtlas data",
              "harmonized_name": "MineAtlas geographic location data",
              "acronym": "None",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": "MineAtlas",
              "data_type": "Geospatial Data",
              "year": "2013",
              "valid": true,
              "invalid_reason": null,
              "sent": "We complete this data with exact \ngeographic location data from MineAtlas (2013), where satellite imagery shows the actual mine \nboundaries, which allows us to identify and update the center point of each mine."
            }
          ]
        }
      ]
    },
    {
      "page": 11,
      "dataset_used": true,
      "data_mentions": [
        {
          "mentioned_in": "Road data is an alternative way of defining \ndistance from mines, but time series data on roads is not available. 3.2 Household data \nWe use microdata from the DHS, obtained from standardized surveys across years and \ncountries. We combine the respondents from all four DHS standard surveys in Ghana for which \nthere are geographic identifiers.",
          "datasets": [
            {
              "raw_name": "DHS",
              "harmonized_name": "Demographic and Health Surveys",
              "acronym": "DHS",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": null,
              "data_type": "Surveys & Census Data",
              "sent": "3.2 Household data \nWe use microdata from the DHS, obtained from standardized surveys across years and \ncountries."
            }
          ]
        },
        {
          "mentioned_in": "See Appendix table 1 for definition of \noutcome variables. We complement the analysis with household data from the GLSS collected in the years—1998–\n99, 2004–05, and 2012–13. These data are a good complement to the DHS data, because they \n                                                            \n6 The first mines were opened in 1990, prior to the first household survey.",
          "datasets": [
            {
              "raw_name": "GLSS",
              "harmonized_name": "Ghana Living Standards Survey",
              "acronym": "GLSS",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": null,
              "data_type": "Surveys & Census Data",
              "sent": "We complement the analysis with household data from the GLSS collected in the years—1998–\n99, 2004–05, and 2012–13."
            }
          ]
        }
      ]
    },
    {
      "page": 12,
      "dataset_used": true,
      "data_mentions": [
        {
          "mentioned_in": "In addition, they provide more detailed information on labor market \nparticipation, such as exact profession (where, for example, being a miner is a possible \noutcome), hours worked, and a wage indicator. The data estimate household expenditure and \nhousehold income. Wages, income, and expenditure can, however, be difficult to measure in \neconomies where nonmonetary compensation for labor and subsistence farming are common \npractices. 4 Empirical Strategies \n4.1 Individual-level difference-in-differences  \nTime-varying data on production and repeated survey data allow us to use a difference-in-\ndifferences approach.7 However, due to the spatial nature of our data and the fact that some \nmines are spatially clustered, we use a strategy developed by Benshaul-Tolonen (2018). The \ndifference-in-difference model compares the treatment group (close to mines) before and after \nthe mine opening, while removing the change that happens in the control group (far away from \nmines) over time under the assumption that such changes reflect underlying temporal variation \ncommon to both treatment and control areas.",
          "datasets": [
            {
              "raw_name": "household expenditure and household income",
              "harmonized_name": null,
              "acronym": null,
              "context": "primary",
              "specificity": "descriptive_but_unnamed",
              "relevance": "directly_relevant",
              "producer": null,
              "data_type": "Economic Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "The data estimate household expenditure and \nhousehold income."
            },
            {
              "raw_name": "data on production",
              "harmonized_name": null,
              "acronym": null,
              "context": "primary",
              "specificity": "descriptive_but_unnamed",
              "relevance": "directly_relevant",
              "producer": null,
              "data_type": "Economic Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "4 Empirical Strategies \n4.1 Individual-level difference-in-differences  \nTime-varying data on production and repeated survey data allow us to use a difference-in-\ndifferences approach.7 However, due to the spatial nature of our data and the fact that some \nmines are spatially clustered, we use a strategy developed by Benshaul-Tolonen (2018)."
            },
            {
              "raw_name": "survey data",
              "harmonized_name": null,
              "acronym": null,
              "context": "primary",
              "specificity": "descriptive_but_unnamed",
              "relevance": "directly_relevant",
              "producer": null,
              "data_type": "Survey Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "4 Empirical Strategies \n4.1 Individual-level difference-in-differences  \nTime-varying data on production and repeated survey data allow us to use a difference-in-\ndifferences approach.7 However, due to the spatial nature of our data and the fact that some \nmines are spatially clustered, we use a strategy developed by Benshaul-Tolonen (2018)."
            }
          ]
        }
      ]
    },
    {
      "page": 13,
      "dataset_used": true,
      "data_mentions": [
        {
          "mentioned_in": "Moreover, cluster \nfixed effects are not possible because of clusters are not repeatedly sampled over time. However, since the estimation is at individual level, all standard errors are clustered at the DHS \ncluster level. The sample is restricted to individuals living within 100 km of a deposit location (mine), so \nmany parts of Northern Ghana where there are few gold mines are not included in the analysis.",
          "datasets": [
            {
              "raw_name": "DHS",
              "harmonized_name": "Demographic and Health Surveys",
              "acronym": "DHS",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": "None",
              "data_type": "Surveys & Census Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "However, since the estimation is at individual level, all standard errors are clustered at the DHS \ncluster level."
            }
          ]
        }
      ]
    },
    {
      "page": 15,
      "dataset_used": false,
      "data_mentions": []
    },
    {
      "page": 17,
      "dataset_used": true,
      "data_mentions": [
        {
          "mentioned_in": "The baseline differences in observable characteristics – in particular, lower levels of economic development preceding the mine opening - indicate that a cross-sectional approach using only the post-period may not be sufficient to understand the impact of gold mining on socio-economic variables. Table 2 Summary statistics for women’s survey (1) (2) (3) (4) Before mining During Mining >20 km <20 km >20 km <20 km Mean Coefficient Mean Coefficient Woman Characteristics Age 28.79 0.836 28.95 -0.352 Total children 2.18 0.417* 2.56 -0.035 Wealth 3.85 -0.619** 3.33 -0.028 Nonmigrant 0.32 0.123** 0.33 -0.028 Urban 0.62 -0.300** 0.49 -0.150** No education 0.17 -0.045 0.20 -0.042** <3 years education 0.77 0.035 0.74 0.045**",
          "datasets": [
            {
              "raw_name": "women’s survey",
              "harmonized_name": "Women's Survey Data",
              "acronym": null,
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": null,
              "data_type": "Survey Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "Table 2 Summary statistics for women’s survey (1) (2) (3) (4) Before mining During Mining >20 km <20 km >20 km <20 km Mean Coefficient Mean Coefficient Woman Characteristics Age 28.79 0.836 28.95 -0.352 Total children 2.18 0.417* 2.56 -0.035 Wealth 3.85 -0.619** 3.33 -0.028 Nonmigrant 0.32 0.123** 0.33 -0.028 Urban 0.62 -0.300** 0.49 -0.150** No education 0.17 -0.045 0.20 -0.042** <3 years education 0.77 0.035 0.74 0.045**"
            }
          ]
        }
      ]
    },
    {
      "page": 18,
      "dataset_used": true,
      "data_mentions": [
        {
          "mentioned_in": "To test for exogeneity, we run regressions using baseline individual-level data to explore \nchanges in observable characteristics among women (the main part of the sample). Table 3 \nshows that there are no significant effects of the mine opening on the age structure, migration \nhistory, marital status, fertility, or education, using the difference-in-difference specification \nwith a full set of controls. If anything, it seems that women in active mining communities are \nmarginally older, more likely to never have moved, and more likely to be or have been in a \ncohabiting relationship or married.",
          "datasets": [
            {
              "raw_name": "DHS individual data",
              "harmonized_name": "Demographic and Health Surveys (DHS)",
              "acronym": "DHS",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": null,
              "data_type": "Surveys & Census Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "Table 3 \nshows that there are no significant effects of the mine opening on the age structure, migration \nhistory, marital status, fertility, or education, using the difference-in-difference specification \nwith a full set of controls."
            }
          ]
        }
      ]
    },
    {
      "page": 19,
      "dataset_used": true,
      "data_mentions": [
        {
          "mentioned_in": "There is no change in the likelihood that she is not working. These 5 categories \nstem from the same occupational variable in the DHS data, and are mutually exclusive. The \nsurveyed individual is told to report their main occupation.",
          "datasets": [
            {
              "raw_name": "DHS data",
              "harmonized_name": "Demographic and Health Surveys (DHS)",
              "acronym": "DHS",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": "None",
              "data_type": "Surveys & Census Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "These 5 categories \nstem from the same occupational variable in the DHS data, and are mutually exclusive."
            }
          ]
        }
      ]
    },
    {
      "page": 21,
      "dataset_used": true,
      "data_mentions": [
        {
          "mentioned_in": "Splitting the sample by gender, we note that this decrease is \nonly statistically significant for boys at an effect size of 6.6 percentage points. Table 5 OLS estimates of birth outcomes, infant survival, and child health in the DHS individual-\nlevel analysis \n \nPANEL A                                  size at birth                        infant mortality (<12months)                  antenatal visits \n \nsmall \naverage \nlarge \n  \nall \nboys \ngirls \n  \n# visits \nat least 1 \nactive*mine \n0.022 \n0.053 \n-0.075* \n \n-0.041* \n-0.066** \n-0.020 \n \n-0.151 \n-0.007 \n \n(0.028) \n(0.041) \n(0.041) \n \n(0.022) \n(0.030) \n(0.035) \n \n(0.331) \n(0.028) \nmine \n-0.010 \n0.071** \n-0.061** \n \n0.004 \n0.008 \n0.001 \n \n0.153 \n0.000 \n \n(0.019) \n(0.028) \n(0.030) \n \n(0.015) \n(0.020) \n(0.024) \n \n(0.241) \n(0.019) \nactive \n-0.010 \n0.054** \n-0.044 \n \n0.002 \n0.014 \n-0.012 \n \n0.012 \n0.002 \n \n(0.016) \n(0.026) \n(0.027) \n \n(0.014) \n(0.022) \n(0.018) \n \n(0.209) \n(0.012) \n \n \n \n \n \n \n \n \n \n \n \nObservations \n6,771 \n6,771 \n6,771 \n \n5,356 \n2,718 \n2,638 \n \n5,704 \n5,704 \nR-squared \n0.031 \n0.054 \n0.059 \n \n0.135 \n0.160 \n0.152 \n \n0.186 \n0.062 \nMean of dep var. 
0.136 \n0.359 \n0.505 \n \n0.073 \n0.08 \n0.066 \n \n5.79 \n0.941 \n \n \n \n \n \n \n \n \n \n \n \nPANEL B \nin the last 2 weeks, had:         \nfever          cough        diarrhea \n  \nanthropometrics (WHO) in sd \nht/a              wt/a            wt/ht \n  \nhas                             \nhealth card \nactive*mine \n-0.035 \n-0.061* \n0.042 \n \n-3.532 \n-5.208 \n-0.641 \n \n0.014 \n  \n \n(0.037) \n(0.033) \n(0.027) \n \n(11.472) \n(9.283) \n(8.948) \n \n(0.027) \nmine \n-0.002 \n-0.006 \n-0.038 \n \n-0.828 \n3.481 \n3.853 \n \n-0.006 \n \n \n(0.031) \n(0.028) \n(0.024) \n \n(10.385) \n(8.574) \n(7.468) \n \n(0.022) \n \nactive \n0.023 \n-0.003 \n-0.033** \n \n-1.904 \n5.265 \n9.433* \n \n0.009 \n \n \n(0.020) \n(0.020) \n(0.016) \n \n(5.942) \n(5.304) \n(5.183) \n \n(0.012) \n \n \n \n \n \n \n \n \n \n \n \n \nObservations \n6,246 \n6,257 \n6,262 \n \n5,627 \n5,627 \n5,727 \n \n6,378 \n \nR-squared \n0.024 \n0.043 \n0.024 \n \n0.136 \n0.080 \n0.036 \n \n0.084 \n \nMean of dep var.",
          "datasets": [
            {
              "raw_name": "DHS individual-level analysis",
              "harmonized_name": "Demographic and Health Surveys (DHS)",
              "acronym": "DHS",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": null,
              "data_type": "Health & Public Safety Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "Table 5 OLS estimates of birth outcomes, infant survival, and child health in the DHS individual-\nlevel analysis \n \nPANEL A                                  size at birth                        infant mortality (<12months)                  antenatal visits \n \nsmall \naverage \nlarge \n  \nall \nboys \ngirls \n  \n# visits \nat least 1 \nactive*mine \n0.022 \n0.053 \n-0.075* \n \n-0.041* \n-0.066** \n-0.020 \n \n-0.151 \n-0.007 \n \n(0.028) \n(0.041) \n(0.041) \n \n(0.022) \n(0.030) \n(0.035) \n \n(0.331) \n(0.028) \nmine \n-0.010 \n0.071** \n-0.061** \n \n0.004 \n0.008 \n0.001 \n \n0.153 \n0.000 \n \n(0.019) \n(0.028) \n(0.030) \n \n(0.015) \n(0.020) \n(0.024) \n \n(0.241) \n(0.019) \nactive \n-0.010 \n0.054** \n-0.044 \n \n0.002 \n0.014 \n-0.012 \n \n0.012 \n0.002 \n \n(0.016) \n(0.026) \n(0.027) \n \n(0.014) \n(0.022) \n(0.018) \n \n(0.209) \n(0.012) \n \n \n \n \n \n \n \n \n \n \n \nObservations \n6,771 \n6,771 \n6,771 \n \n5,356 \n2,718 \n2,638 \n \n5,704 \n5,704 \nR-squared \n0.031 \n0.054 \n0.059 \n \n0.135 \n0.160 \n0.152 \n \n0.186 \n0.062 \nMean of dep var. 
0.136 \n0.359 \n0.505 \n \n0.073 \n0.08 \n0.066 \n \n5.79 \n0.941 \n \n \n \n \n \n \n \n \n \n \n \nPANEL B \nin the last 2 weeks, had:         \nfever          cough        diarrhea \n  \nanthropometrics (WHO) in sd \nht/a              wt/a            wt/ht \n  \nhas                             \nhealth card \nactive*mine \n-0.035 \n-0.061* \n0.042 \n \n-3.532 \n-5.208 \n-0.641 \n \n0.014 \n  \n \n(0.037) \n(0.033) \n(0.027) \n \n(11.472) \n(9.283) \n(8.948) \n \n(0.027) \nmine \n-0.002 \n-0.006 \n-0.038 \n \n-0.828 \n3.481 \n3.853 \n \n-0.006 \n \n \n(0.031) \n(0.028) \n(0.024) \n \n(10.385) \n(8.574) \n(7.468) \n \n(0.022) \n \nactive \n0.023 \n-0.003 \n-0.033** \n \n-1.904 \n5.265 \n9.433* \n \n0.009 \n \n \n(0.020) \n(0.020) \n(0.016) \n \n(5.942) \n(5.304) \n(5.183) \n \n(0.012) \n \n \n \n \n \n \n \n \n \n \n \n \nObservations \n6,246 \n6,257 \n6,262 \n \n5,627 \n5,627 \n5,727 \n \n6,378 \n \nR-squared \n0.024 \n0.043 \n0.024 \n \n0.136 \n0.080 \n0.036 \n \n0.084 \n \nMean of dep var."
            }
          ]
        }
      ]
    },
    {
      "page": 27,
      "dataset_used": true,
      "data_mentions": [
        {
          "mentioned_in": "25\n\n\n\n \nNote: Figure 5 shows the main treatment coefficients (active*mine) using the baseline estimation strategy (with \nDHS individual-level data; see table 4 for more information) in the top panel, but with different cutoffs (10 km, \n20 km, 30 km, 40 km, and 50 km). *** p<0.01, **p<0.05, *p<0.1.",
          "datasets": [
            {
              "raw_name": "DHS individual-level data",
              "harmonized_name": "Demographic and Health Surveys (DHS)",
              "acronym": "DHS",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": "Demographic and Health Surveys",
              "data_type": "Surveys & Census Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "25\n\n\n\n \nNote: Figure 5 shows the main treatment coefficients (active*mine) using the baseline estimation strategy (with \nDHS individual-level data; see table 4 for more information) in the top panel, but with different cutoffs (10 km, \n20 km, 30 km, 40 km, and 50 km)."
            }
          ]
        }
      ]
    },
    {
      "page": 29,
      "dataset_used": true,
      "data_mentions": [
        {
          "mentioned_in": "OLS = ordinary least squares. 6.4 Bottom 40% of the population \nTo understand the welfare effects of the bottom 40 percent of the population in the income \nscale, we split the sample according to the wealth score provided by DHS. Given the data \nstructure, which is repeated cross-section, we cannot follow a particular household that was \nidentified as belonging to the bottom 40 percent in the initial time period.",
          "datasets": [
            {
              "raw_name": "DHS",
              "harmonized_name": "Demographic and Health Surveys",
              "acronym": "DHS",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": "None",
              "data_type": "Surveys & Census Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "6.4 Bottom 40% of the population \nTo understand the welfare effects of the bottom 40 percent of the population in the income \nscale, we split the sample according to the wealth score provided by DHS."
            }
          ]
        }
      ]
    },
    {
      "page": 32,
      "dataset_used": true,
      "data_mentions": [
        {
          "mentioned_in": "30\n0\n\n0\nTable 11 Using GLSS: Employment on extensive and intensive margin and wages \n  \n(1) \nworked \nlast  year \n(2) \nwork 7 \ndays \n(3) \nhours \nworked \nper week \n(4) \nagri- \nculture \n(5) \nservice \nand sales \n(6) \nminer \nPanel A: Women  \n  \n  \n  \n  \n  \n  \n1. baseline  \n \n \n \n \n \n \nactive*mine \n-0.067* \n-0.032 \n3.565 \n-0.075 \n0.074 \n0.025 \n \n(0.040) \n(0.038) \n(3.140) \n(0.064) \n(0.054) \n(0.016) \n2. drop 20-40 km \n \n \n \n \n \n \nactive*mine \n-0.062 \n-0.039 \n3.849 \n-0.076 \n0.094* \n0.026* \n \n(0.040) \n(0.039) \n(3.359) \n(0.064) \n(0.057) \n(0.015) \n3. drop 2 years before \n \n \n \n \n \n \nactive*mine \n-0.067* \n-0.031 \n3.565 \n-0.087 \n0.080 \n0.024 \n \n(0.040) \n(0.038) \n(3.140) \n(0.065) \n(0.055) \n(0.016) \n4. mine FE \n \n \n \n \n \n \nactive*mine \n-0.067 \n-0.012 \n8.560* \n-0.084 \n0.104 \n0.025* \n \n(0.051) \n(0.048) \n(5.125) \n(0.075) \n(0.065) \n(0.015) \n5. mine clustering \n \n \n \n \n \n \nactive*mine \n-0.067* \n-0.032 \n3.565 \n-0.075 \n0.074 \n0.025 \n \n(0.032) \n(0.036) \n(3.521) \n(0.081) \n(0.080) \n(0.022) \n \n \n \n \n \n \n \nMean dep var. 0.727 \n0.673 \n40.39 \n42.32 \n0.391 \n0.005 \nPanel B: Men \n  \n  \n  \n  \n  \n  \n1. baseline  \n-0.086** \n-0.055 \n3.705 \n-0.058 \n-0.032 \n0.125*** \nactive*mine \n(0.041) \n(0.039) \n(3.460) \n(0.066) \n(0.036) \n(0.043) \n \n \n \n \n \n \n \n2. drop 20-40 km \n \n \n \n \n \n \nactive*mine \n-0.094** \n-0.062 \n3.893 \n-0.064 \n-0.031 \n0.126*** \n \n(0.042) \n(0.040) \n(3.842) \n(0.066) \n(0.038) \n(0.042) \n3. drop 2 years before \n \n \n \n \n \n \nactive*mine \n-0.094** \n-0.062 \n3.708 \n-0.071 \n-0.026 \n0.125*** \n \n(0.041) \n(0.039) \n(3.459) \n(0.067) \n(0.036) \n(0.043) \n4. mine FE \n \n \n \n \n \n \nactive*mine \n-0.123** \n-0.094* \n8.233 \n-0.068 \n-0.049 \n0.113** \n \n(0.057) \n(0.051) \n(5.425) \n(0.075) \n(0.044) \n(0.045) \n5. 
mine clustering \n \n \n \n \n \n \nactive*mine \n-0.086*** \n-0.055** \n3.705 \n-0.058 \n-0.032 \n0.125** \n \n(0.025) \n(0.025) \n(2.898) \n(0.086) \n(0.032) \n(0.051) \n \n \n \n \n \n \n \nMean dep var \n0.715 \n0.705 \n45.71 \n0.491 \n0.259 \n0.028 \n \nNote: The table uses GLSS data for Ghana for the survey years 1998, 2005, 2012. The sample is restricted to \nwomen and men aged 15\u001349.",
          "datasets": [
            {
              "raw_name": "GLSS data",
              "harmonized_name": "Ghana Living Standards Survey (GLSS)",
              "acronym": "GLSS",
              "context": "primary",
              "specificity": "properly_named",
              "relevance": "directly_relevant",
              "producer": "None",
              "data_type": "Surveys & Census Data",
              "year": "1998, 2005, 2012",
              "valid": true,
              "invalid_reason": null,
              "sent": "0.727 \n0.673 \n40.39 \n42.32 \n0.391 \n0.005 \nPanel B: Men \n  \n  \n  \n  \n  \n  \n1. baseline  \n-0.086** \n-0.055 \n3.705 \n-0.058 \n-0.032 \n0.125*** \nactive*mine \n(0.041) \n(0.039) \n(3.460) \n(0.066) \n(0.036) \n(0.043) \n \n \n \n \n \n \n \n2. drop 20-40 km \n \n \n \n \n \n \nactive*mine \n-0.094** \n-0.062 \n3.893 \n-0.064 \n-0.031 \n0.126*** \n \n(0.042) \n(0.040) \n(3.842) \n(0.066) \n(0.038) \n(0.042) \n3. drop 2 years before \n \n \n \n \n \n \nactive*mine \n-0.094** \n-0.062 \n3.708 \n-0.071 \n-0.026 \n0.125*** \n \n(0.041) \n(0.039) \n(3.459) \n(0.067) \n(0.036) \n(0.043) \n4. mine FE \n \n \n \n \n \n \nactive*mine \n-0.123** \n-0.094* \n8.233 \n-0.068 \n-0.049 \n0.113** \n \n(0.057) \n(0.051) \n(5.425) \n(0.075) \n(0.044) \n(0.045) \n5. mine clustering \n \n \n \n \n \n \nactive*mine \n-0.086*** \n-0.055** \n3.705 \n-0.058 \n-0.032 \n0.125** \n \n(0.025) \n(0.025) \n(2.898) \n(0.086) \n(0.032) \n(0.051) \n \n \n \n \n \n \n \nMean dep var \n0.715 \n0.705 \n45.71 \n0.491 \n0.259 \n0.028 \n \nNote: The table uses GLSS data for Ghana for the survey years 1998, 2005, 2012."
            }
          ]
        }
      ]
    },
    {
      "page": 37,
      "dataset_used": true,
      "data_mentions": [
        {
          "mentioned_in": "Natural resource extraction is often argued to have detrimental effects on countries, \nhowever, and the so-called natural resource curse may imply that resource wealth is harmful to \nsocial development and inclusive growth. We use rich geocoded data with information on \nhouseholds and mining production over time to evaluate the gold boom at the local and district \nlevels in difference-in-differences analyses. Men benefit from direct job creation within the mining sector, and women seem to benefit from \nindirectly generated jobs in the service sector (statistically significant within 10 km from a \nmine).",
          "datasets": [
            {
              "raw_name": "geocoded data with information on households and mining production over time",
              "harmonized_name": null,
              "acronym": null,
              "context": "primary",
              "specificity": "descriptive_but_unnamed",
              "relevance": "directly_relevant",
              "producer": null,
              "data_type": "Geospatial & Economic Data",
              "year": null,
              "valid": true,
              "invalid_reason": null,
              "sent": "We use rich geocoded data with information on \nhouseholds and mining production over time to evaluate the gold boom at the local and district \nlevels in difference-in-differences analyses."
            }
          ]
        }
      ]
    },
    {
      "page": 39,
      "dataset_used": false,
      "data_mentions": []
    }
  ]
}

In this step, we used an LLM-as-a-judge model to validate the dataset mentions extracted by the weakly supervised labeling step.
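The workflow calls for keeping only the mentions the judge marked `valid: true`. A minimal sketch of that filter is shown here, assuming the judged output is available as a list of page dictionaries shaped like the JSON above. The function name `keep_valid_mentions` and the sample records are hypothetical, and treating entries without a `valid` key as unvalidated is an assumption rather than the notebook's documented behavior.

```python
from typing import Any, Dict, List


def keep_valid_mentions(pages: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return new page records that keep only datasets with valid == True."""
    filtered = []
    for page in pages:
        mentions = []
        for mention in page.get("data_mentions", []):
            # Assumption: a missing "valid" key means the entry was never
            # judged, so it is dropped along with explicitly invalid entries.
            valid = [d for d in mention.get("datasets", []) if d.get("valid") is True]
            if valid:
                mentions.append({**mention, "datasets": valid})
        # Recompute dataset_used so it agrees with what survived the filter.
        filtered.append({**page, "dataset_used": bool(mentions), "data_mentions": mentions})
    return filtered


# Hypothetical, abbreviated sample in the same shape as the judged output.
sample_pages = [
    {
        "page": 13,
        "dataset_used": True,
        "data_mentions": [
            {
                "mentioned_in": "... clustered at the DHS cluster level ...",
                "datasets": [
                    {"raw_name": "DHS", "valid": True, "invalid_reason": None},
                    # Invented invalid entry, for illustration only.
                    {"raw_name": "cluster fixed effects", "valid": False,
                     "invalid_reason": "Not a dataset."},
                ],
            }
        ],
    },
    {
        "page": 99,  # invented page whose only entry has no "valid" key
        "dataset_used": True,
        "data_mentions": [
            {"mentioned_in": "...", "datasets": [{"raw_name": "survey data"}]}
        ],
    },
]

result = keep_valid_mentions(sample_pages)
for page in result:
    names = [d["raw_name"] for m in page["data_mentions"] for d in m["datasets"]]
    print(page["page"], page["dataset_used"], names)
```

The function builds new dictionaries instead of editing the input in place, so the unfiltered judge output stays available for inspection or for a later validation stage.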

Next Step#

The Autonomous Reasoning Agent will process the output from this step to further validate and refine the extracted dataset mentions.