Revolutionize Your Analysis in Stata and R

AI Agent-Assisted Workflow with GitHub Copilot and Claude


Eduard Bukin ebukin@worldbank.org
Distributional Impact of Policies
Fiscal Policy and Growth Department

2026-02-05

Motivation

Chat and web-based AI tools are impressive!

ChatGPT | Copilot | Gemini | WB MAI

Use familiar technologies:

  • Web-Browser,
  • Chat,
  • Stata editor,
  • copy-paste

Is this the best way to use AI for data analysis?

In fact, there are many AI-powered Integrated Development Environments (IDEs) for coding and data science!

The goal

of this seminar is to introduce you to AI-assisted data analysis with Positron IDE and GitHub Copilot | Claude.

There are many IDEs ➔

Agenda

  • Introduce several AI concepts (vocabulary)

  • Share experience of using an AI-assisted workflow in Positron with Stata and R

  • Provide kick-off instructions and resources.

Key Concepts

What do we need to know about modern analysis with AI?

  1. AI Integrated IDE: Chat | Agent | Inline Completion
  2. Context awareness: How AI understands your project
  3. Model Context Protocol (MCP): Universal adapter for AI
  4. GitHub Copilot | Claude: LLM providers
  5. Efficient prompting: Getting the best results
  6. Caveats and limitations: What to watch out for

AI Integrated IDE

AI: Chat

Positron Assistant

  • Ask AI (Claude 4.5) through GitHub Copilot

  • Provides explanations, suggestions, and code snippets.

  • Integrates with project context and code.

  • Learn more:

    Assistant Chat

AI: Agent

Positron Assistant

  • Executes instructions.

  • Acts independently

    • Runs code
    • Fixes errors
    • Learns
    • Reasons
  • See more in the live demo!

AI: Inline Completion

Positron Inline Code Completion: Suggests code snippets as you type.

Context Awareness

Positron accesses project metadata. Thus AI ‘knows’:

  • File structure: Code, docs
  • Data: Variable names, types
  • History: Edits, commands
  • Environment: Packages
  • Intent: Current task
  • Results: Output, errors

Why does it matter?

  • Project-specific suggestions
  • Understands dependencies
  • Reduces hallucinations
  • Improves efficiency
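
To make "context" concrete, here is a minimal Python sketch of the kind of project metadata an assistant could be given: file names plus the column names of any CSV files. It is an illustration only, not how Positron collects context; the function name `summarize_project` and the file layout are assumptions made for this example.

```python
import csv
from pathlib import Path


def summarize_project(root: Path, max_files: int = 20) -> str:
    """Build a short plain-text summary of a project folder.

    Lists file names and, for CSV files, the column names from the header row.
    Only metadata is read, never the data rows themselves.
    """
    lines = [f"Project: {root.name}"]
    files = sorted(p for p in root.rglob("*") if p.is_file())[:max_files]
    for path in files:
        entry = f"- {path.relative_to(root).as_posix()}"
        if path.suffix.lower() == ".csv":
            with path.open(newline="", encoding="utf-8") as handle:
                header = next(csv.reader(handle), [])
            entry += f" (columns: {', '.join(header)})"
        lines.append(entry)
    return "\n".join(lines)


if __name__ == "__main__":
    # Summarize the current working directory (hypothetical usage).
    print(summarize_project(Path.cwd()))
```

A summary like this, pasted into a chat, is roughly what a web-based workflow requires you to do by hand; a context-aware IDE supplies comparable information automatically.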

Model Context Protocol (MCP)

MCP is a universal adapter for AI, introduced by Anthropic, that connects data flows between AI models, tools, and data sources.
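
MCP messages follow JSON-RPC 2.0, and a tool is invoked with the `tools/call` method. The sketch below shows the rough shape of that exchange in plain Python. It is a simplified illustration, not the official SDK and not the Stata MCP extension: the tool `describe_variable` and the metadata in `VARIABLE_TYPES` are hypothetical.

```python
import json

# Hypothetical dataset metadata, made up for this illustration.
VARIABLE_TYPES = {"hhid": "long", "income": "float"}


def describe_variable(arguments: dict) -> str:
    """Toy tool: report the storage type of a variable."""
    name = arguments["name"]
    return f"{name}: {VARIABLE_TYPES.get(name, 'unknown')}"


# Registry of tools the toy server exposes (hypothetical).
TOOLS = {"describe_variable": describe_variable}


def make_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request asking the server to run a tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def handle_request(request: dict) -> dict:
    """Toy server side: dispatch a tools/call request to a registered tool."""
    base = {"jsonrpc": "2.0", "id": request.get("id")}
    if request.get("method") != "tools/call":
        return {**base, "error": {"code": -32601, "message": "Method not found"}}
    params = request.get("params", {})
    tool = TOOLS.get(params.get("name"))
    if tool is None:
        return {**base, "error": {"code": -32602, "message": "Unknown tool"}}
    text = tool(params.get("arguments", {}))
    return {**base, "result": {"content": [{"type": "text", "text": text}]}}


if __name__ == "__main__":
    request = make_tool_call(1, "describe_variable", {"name": "income"})
    print(json.dumps(handle_request(request), indent=2))
```

The point of the "adapter" metaphor is visible here: the model only needs to know the tool's name and arguments, while the server decides how the work is actually done.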

GitHub Copilot | Anthropic Claude

GitHub Copilot

Choose your LLM:

  • Claude Sonnet/Haiku/Opus
  • OpenAI GPT-4/o1…

Efficient Prompting

  • Be specific:

    “Write a Stata do-file to …” / “Refactor this R function to …”

  • Provide context:

    “Goal: X; Dataset: Y variables; Constraints: Z (WB rules, packages, runtime)”

  • Define expected output:

    “Save as regression_results.xlsx, format as APA table” / “Create bar chart with 95% CIs”

  • Summarize + clarify first:

    “Restate and ask clarifying questions before implementing”, “Explain why …”, “Give alternatives with trade-offs…”

  • Iterate in small steps:

    “minimal changes”, “refine”

  • Set boundaries:

    “Don’t use … data”, “Don’t print secrets, ask if in doubt.”, “Don’t change files”.
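
The prompting habits above can be kept in a reusable template. The following Python sketch assembles a prompt from the same parts (task, goal, dataset, constraints, expected output, boundaries). The function `build_prompt` and all example values are hypothetical; adapt the wording to your own project.

```python
def build_prompt(task, goal, dataset, constraints, expected_output, boundaries):
    """Assemble a structured prompt from the parts listed on the slide."""
    parts = [
        f"Task: {task}",
        f"Goal: {goal}",
        f"Dataset: {dataset}",
        f"Constraints: {constraints}",
        f"Expected output: {expected_output}",
        "Boundaries:",
    ]
    parts.extend(f"- {rule}" for rule in boundaries)
    # Summarize + clarify first, then iterate in small steps.
    parts.append("Before implementing, restate the task and ask clarifying questions.")
    parts.append("Make minimal changes and explain why.")
    return "\n".join(parts)


if __name__ == "__main__":
    # All values below are placeholders for illustration.
    print(
        build_prompt(
            task="Write a Stata do-file that estimates a wage regression",
            goal="Compare results across survey years",
            dataset="Household survey with wage, age, and education variables",
            constraints="Use only built-in Stata commands",
            expected_output="Save as regression_results.xlsx",
            boundaries=["Don't print secrets, ask if in doubt.", "Don't change files."],
        )
    )
```

Saving a template like this alongside the project also helps with reproducibility, since the same structure is reused across sessions.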

Limitations and Remedies

  • Wrong-but-plausible outputs / hallucinations: code runs but logic is wrong

    Verify and validate: ask the model to explain and justify the solution

  • Context limits: not all files/data fit in context; projects can be too large.

    Be explicit: state assumptions, expected inputs/outputs, and references

  • Outdated knowledge: suggested APIs/packages/options may have changed

    Teach the model: provide references/links; ask it to learn

  • Over-reliance: erodes fundamentals; mistakes slip through unchallenged

    Keep learning: ask for step-by-step reasoning; request alternatives and trade-offs

  • Confidentiality / security / privacy

    Constrain context: exclude sensitive data; use .copilot-ignore; AI @ WB

  • Reproducibility: answers can vary across sessions/models/settings

    Customize agents: save prompts, use Git; create AI agents
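
One simple way to "save prompts" is an append-only log file kept under Git next to the analysis code. The Python sketch below writes one JSON record per prompt, with a timestamp, the model name, and a hash of the prompt text. The function `log_prompt`, the file name, and the model label are assumptions made for this example, not a Positron or Copilot feature.

```python
import hashlib
import json
from datetime import datetime, timezone
from pathlib import Path


def log_prompt(log_path: Path, model: str, prompt: str) -> dict:
    """Append one prompt record to a JSON Lines file and return the record."""
    record = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "model": model,
        "prompt": prompt,
        # The hash makes it easy to spot when a prompt was silently changed.
        "prompt_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
    }
    with log_path.open("a", encoding="utf-8") as handle:
        handle.write(json.dumps(record, ensure_ascii=False) + "\n")
    return record


if __name__ == "__main__":
    # Hypothetical file name and model label.
    log_prompt(Path("prompt_log.jsonl"), "example-model", "Refactor this R function to ...")
```

The log does not make model answers deterministic, but it records what was asked, of which model, and when, so a result can be traced and re-run.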

Summary

Why use IDEs, not a web-browser-based workflow?

  • Context-awareness
  • Streamlined workflow
  • Reduced friction

Why Positron?

  • Built for data science, not software development
  • Integrates with Stata, R, and Python seamlessly
  • Advanced AI features for data analysis

Where to Start?

Live Demo

From an old analysis in Stata to an upgraded Stata+R reproducibility package in under 10 minutes!

Tip

Ask AI for help: “How do I download a project from GitHub and open it in Positron IDE? The link is: …”

Live Demo: Positron IDE overview

Thank You! Questions?

Additional materials

Software Setup Overview

Note

Full details: Setup Instructions

  1. Install prerequisite software (via WB Software Center)
    • Stata 19+, R 4.5+, Python 3.13+, Quarto, Git
    • Install the Python uv package: pip install uv
  2. Install Positron IDE (system-level install): Request help from IT if needed
  3. Install key extensions in Positron
    • Stata MCP, Quarto
  4. Connect GitHub and configure Positron Assistant
  5. Start experimenting!
    • Open assistant: Ctrl+Shift+P > “Ask Positron Assistant”
    • Try: chat, agent mode, inline code completion
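
Before starting, it can help to confirm that the command-line tools from step 1 are on the PATH. The Python sketch below uses the standard-library `shutil.which` for that. The helper names `check_tools` and `missing_tools` are hypothetical, and the list of executables is an assumption: Stata is left out because its executable name depends on the edition and operating system.

```python
import shutil

# Executable names assumed for this sketch; adjust to your installation.
DEFAULT_TOOLS = ["git", "quarto", "Rscript", "python", "uv"]


def check_tools(names, which=shutil.which):
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: which(name) for name in names}


def missing_tools(result):
    """Return the names of tools that were not found."""
    return [name for name, path in result.items() if path is None]


if __name__ == "__main__":
    found = check_tools(DEFAULT_TOOLS)
    for name, path in found.items():
        print(f"{name}: {path or 'NOT FOUND'}")
    if missing_tools(found):
        print("Missing:", ", ".join(missing_tools(found)))
```

A tool reported as missing may still be installed but not added to the PATH; in that case, reopen the terminal or ask IT for help.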

Positron IDE: Self-learning

Positron: Modern AI-native IDE for data science

Positron: Assistant

Positron: Data explorer

Positron + Stata

  1. Make sure prerequisite software is installed (Stata, R, Positron)

  2. Install Python, and after that the uv package: pip install uv

  3. Install Stata MCP in Positron and configure the Stata path and edition

  4. Create a new Stata do-file, write some code, save it, and run it.