iegraph

Title

iegraph - Generates graphs based on regressions with treatment dummies common in impact evaluations.

Syntax

For a more detailed discussion of the intended usage and workflow of this command, please see the DIME Wiki.

iegraph varlist , [ basictitle(string) varlabels save(string) grayscale yzero barlabel mlabcolor(colorname) mlabposition(clockpos) mlabsize(size) barlabelformat noconfbars confbarsnone(varlist) confintval(numlist) norestore baroptions(string) ignoredummytest twoway_scatter_options ]

options Description
basictitle(string) Manually sets the title of the graph
varlabels Uses variable labels for legends instead of variable names
save(string) Sets the filename and the directory to which the graph will be saved/exported
grayscale Uses grayscales for the bars instead of colors
yzero Forces y-axis on the graph to start at 0
barlabel Adds a label on top of the bars with their respective values
mlabcolor(colorname) Set color of bar label
mlabposition(clockposstyle) Set position of bar label
mlabsize(size) Set font size of bar label
barlabelformat Customizes format of bar label. Must be used with barlabel
noconfbars Removes the confidence interval bars from graphs for all treatments
confbarsnone(varlist) Removes the confidence interval bars from only the variables listed
confintval(numlist) Sets the confidence interval for the confidence interval bars. Default is .95
norestore Allows you to debug your two way graph settings on the data set prepared by iegraph. To be used with r(cmd)
baroptions(string) Allows you to add formatting to the bars
ignoredummytest Ignores the tests that check whether the dummies fit one of the two models below

Any twoway graph scatter options that can be used with regular twoway graph scatter commands can also be used. If any of these options conflict with any of the built in options, then the user specified settings take precedence. See example 4 for details.

Description

iegraph is a command that creates bar graphs based on the coefficients of treatment dummies in regression results. The command is developed for reading stored results from two types of common impact evaluation regression models, but there are countless other cases where it can also be used. iegraph must be used immediately after running the regression, or after the regression results have been restored to Stata’s ereturn results.

Model 1: OLS with Treatment Dummies

The most typical impact evaluation regression has the outcome variable as the dependent variable and one dummy for each treatment arm, where control is the omitted category. These regressions can also include covariates, fixed effects etc., as long as the treatment status is defined by mutually exclusive dummy variables. See especially examples 1 and 2 below. This command works with any number of treatment arms but works best with two arms (treatment and control) up to five arms (four different treatments and control). More arms than that may result in a still correct but perhaps cluttered graph.
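The requirement that the treatment dummies are mutually exclusive, with control as the omitted category, can be checked before running the regression. The following is a minimal sketch on simulated data; the data set and all variable names (arm, tmt_1, tmt_2, income) are hypothetical and only serve as illustration.

```stata
* Simulated data for illustration only; all variable names are hypothetical
clear
set seed 20240101
set obs 300
generate arm   = mod(_n, 3)          // 0 = control, 1 and 2 = treatment arms
generate tmt_1 = (arm == 1)
generate tmt_2 = (arm == 2)
generate income = 100 + 10*tmt_1 + 20*tmt_2 + rnormal(0, 15)

* No observation may be in more than one treatment arm
assert tmt_1 + tmt_2 <= 1

* Some observations must be control (0 in every treatment dummy)
count if tmt_1 == 0 & tmt_2 == 0
assert r(N) > 0

* Requires ietoolkit to be installed (ssc install ietoolkit)
regress income tmt_1 tmt_2
iegraph tmt_1 tmt_2
```

If either assert fails, the dummies do not describe a Model 1 setup and should be corrected before using iegraph.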

Model 2: Difference-in-Differences

Another typical regression model in impact evaluations is the difference-in-differences (Diff-in-Diff) model with two treatment arms (treatment and control) and two time periods. If the Diff-in-Diff regression is specified with the outcome variable as the dependent variable and three dummy variables (time, treatment and time*treatment) as the independent variables, then this command can graph the results. Controls, fixed effects etc. may be added to the regression model. See especially example 3.
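The three dummies for Model 2 can be set up as in the sketch below. It uses simulated two-period panel data; the data set, the coefficient values and the variable name id are assumptions made only for illustration.

```stata
* Simulated two-period panel for illustration only
clear
set seed 20240102
set obs 400
generate id = ceil(_n / 2)              // 200 units, two rows per unit
bysort id: generate time = _n - 1       // 0 = first period, 1 = second period
generate treat = mod(id, 2)             // treatment status is fixed within unit
generate timeXtreat = time * treat      // interaction term of time and treat

generate chld_wght = 12 + 0.5*time + 0.2*treat + 1.0*timeXtreat + rnormal(0, 1)

* Requires ietoolkit to be installed (ssc install ietoolkit)
regress chld_wght time treat timeXtreat
iegraph time treat timeXtreat
```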

Graph Output

The graph generated by this command is created using the following values. The control bar is the mean of the outcome variable for the control group. It is not the constant from the regression as those are not identical if, for example, fixed effects and covariates were used. For each treatment group the bar is the sum of the value of the control bar and the beta coefficient in the regression of the corresponding treatment dummy. The confidence intervals are calculated from the variance in the beta coefficients in the regression.
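The values described above can be reproduced by hand to check a graph. The sketch below uses simulated data with hypothetical variable names. It assumes that the control mean is taken over the estimation sample and that the confidence interval uses a t critical value with the residual degrees of freedom; the exact internal calculation in iegraph may differ.

```stata
* Simulated data for illustration only; all variable names are hypothetical
clear
set seed 20240103
set obs 300
generate tmt_1 = (mod(_n, 3) == 1)
generate tmt_2 = (mod(_n, 3) == 2)
generate income = 100 + 10*tmt_1 + 20*tmt_2 + rnormal(0, 15)

regress income tmt_1 tmt_2

* Control bar: mean of the outcome for control observations
* (assumption: restricted to the estimation sample)
summarize income if tmt_1 == 0 & tmt_2 == 0 & e(sample)
local ctl_mean = r(mean)

* Treatment bar: control mean plus the coefficient of the treatment dummy
local bar_1 = `ctl_mean' + _b[tmt_1]

* 95% confidence interval around the treatment bar
* (assumption: t critical value with residual degrees of freedom)
local crit = invttail(e(df_r), 0.025)
local lo_1 = `bar_1' - `crit' * _se[tmt_1]
local hi_1 = `bar_1' + `crit' * _se[tmt_1]

display "Control bar:     " %9.2f `ctl_mean'
display "Treatment 1 bar: " %9.2f `bar_1'
display "95% CI:          " %9.2f `lo_1' " to " %9.2f `hi_1'
```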

The graph also includes the N for each treatment arm in the regression and uses that value as labels on the x-axis. Stars are added to this value if the corresponding coefficient is statistically different from zero in the regression.

Options

basictitle(string) manually sets the title of the graph. To apply formatting like title size, position, etc., use Stata’s built in title() option instead.

varlabels sets the legends to the variable labels for the variables instead of the variable names.

save(string) sets the filename and the directory to which the graph will be saved or exported.

grayscale uses grayscale for the bars instead of colors. The control bar will be black and the treatment bars will run in equal shade differences from light grey to dark grey.

yzero manually sets the y-axis of the graph to start at zero instead of the Stata default. In many cases, neither the default settings nor this option will make the axes look perfect, but you may use Stata’s built in axis options to set the axes to fit your data. The command will ignore the yzero option in cases where the graph cannot be forced to start at zero, i.e. where the values in the graph extend both above and below zero. A warning will be displayed telling the user that the option has been ignored. Despite the warning, the graph will be produced correctly.

barlabel adds a label on top of the bars with their respective values. Equivalent to specifying option blabel(bar) in a bar graph (see help graph_bar).

mlabcolor(colorname) sets color of bar label. May only be used with the barlabel option. See help colorname for valid values.

mlabposition(clockpos) sets position of bar label. May only be used with the barlabel option. See help clockposstyle for valid values.

mlabsize(size) sets size of bar label. May only be used with the barlabel option. See help textsizestyle for valid values.

barlabelformat customizes the bar label format. May only be used with the barlabel option. Allowed formats are of the form %#.#f or %#.#e. Default is %9.1f. See help format for valid formats.
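The bar label options above are meant to be combined with barlabel. The sketch below shows one possible combination on simulated data with hypothetical variable names. Passing the format as an argument, as in barlabelformat(%9.2f), is an assumption based on the description of the option, since the syntax line lists barlabelformat without an argument.

```stata
* Simulated data for illustration only; all variable names are hypothetical
clear
set seed 20240104
set obs 300
generate tmt_1 = (mod(_n, 3) == 1)
generate tmt_2 = (mod(_n, 3) == 2)
generate income = 100 + 10*tmt_1 + 20*tmt_2 + rnormal(0, 15)

regress income tmt_1 tmt_2

* Requires ietoolkit to be installed (ssc install ietoolkit)
* barlabelformat(%9.2f) is an assumed argument form, see the note above
iegraph tmt_1 tmt_2 , barlabel mlabcolor(black) mlabposition(12) ///
    mlabsize(small) barlabelformat(%9.2f)
```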

noconfbars removes the confidence interval bars from graphs for all treatments. The default value for the confidence interval bars is 95%.

confbarsnone(varlist) removes confidence interval bars from only the varlist listed. The remaining variables in the graphs which have not been specified in option confbarsnone will still have the confidence interval bars.

confintval(numlist) sets the confidence interval for the confidence interval bars. Default is .95. Values between 0 and 1 are allowed.

norestore returns the data set that iegraph prepares to create the graph. This option is meant to be used in combination with the returned result in r(cmd). r(cmd) gives you the line of code iegraph prepares to create the graph, and norestore gives you access to the data that code is meant to be used on. This approach helps you debug how to apply Stata’s many built in graph options to an iegraph graph. Note that this option deletes any unsaved changes made to your data in memory.
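A possible debugging workflow with norestore and r(cmd) is sketched below. It uses simulated data with hypothetical variable names and saves the original data to a temporary file first, because norestore replaces the data in memory. Run it as a do-file so that the temporary file and the local macro stay defined across lines.

```stata
* Simulated data for illustration only; all variable names are hypothetical
clear
set seed 20240105
set obs 300
generate tmt_1 = (mod(_n, 3) == 1)
generate tmt_2 = (mod(_n, 3) == 2)
generate income = 100 + 10*tmt_1 + 20*tmt_2 + rnormal(0, 15)

regress income tmt_1 tmt_2

* Keep a copy of the original data, since norestore replaces the data in memory
tempfile original
save `original'

* Requires ietoolkit to be installed (ssc install ietoolkit)
iegraph tmt_1 tmt_2 , norestore

* Copy r(cmd) before running any other r-class command, then inspect it
local graphcmd `"`r(cmd)'"'
display `"`graphcmd'"'

* The data in memory is now the data set prepared by iegraph
describe

* Return to the original data when done
use `original', clear
```

The displayed line can be copied, edited with additional graph options and run against the prepared data until the graph looks as intended.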

baroptions(string) allows you to add formatting options that are applied to each bar and not to the graph itself. Examples of such options are twoway_bar options and axis_options. It is not possible to use this option to add formatting to individual bars. Everything added in this option is added to all bars. Formatting added in this option takes precedence over any default formatting or formatting set in any other option.

ignoredummytest ignores the tests that check whether the dummies fit one of the two models this command is intended for. The two models are described in detail above. There might be models we have not thought of for which this command is helpful as well. Use this option to lift the restrictions of those two models. But be careful: this command has not been tested for models other than the two described.

Stored results

r(cmd) The line of code iegraph prepares to create the graph. See option norestore.

Examples

Example 1

regress outcomevar treatment_dummy
iegraph treatment_dummy , basictitle("Treatment Effect on Outcome")

In the example above, there are only two treatment arms (treatment and control). treatment_dummy has a 1 for all treatment observations and a 0 for all control observations. The graph will have one bar for control and it shows the mean for outcomevar for all observations in control. The second bar in the graph will be the sum of that mean and the coefficient for treatment_dummy in the regression. The graph will also have the title: Treatment Effect on Outcome.

Example 2

regress income tmt_1 tmt_2 age education, cluster(district)
iegraph tmt_1 tmt_2, noconfbars yzero basictitle("Treatment effect on income")

In the example above, the treatment effect on income is estimated. There are three treatment arms: control, treatment 1 (tmt_1) and treatment 2 (tmt_2). It is important that no observation has the value 1 in both tmt_1 and tmt_2 (i.e. no observation is in more than one treatment) and some observations must have the value 0 in both tmt_1 and tmt_2 (i.e. control observations). The variables age and education are covariates (control variables) and are not included in iegraph. The option noconfbars omits the confidence interval bars, and yzero sets the y-axis to start at 0.

Example 3

regress chld_wght time treat timeXtreat
iegraph time treat timeXtreat , basictitle("Treatment effect on Child Weight (Diff-in-Diff)")

In the example above, the data set is a panel data set with two time periods and the regression estimates the treatment effect on child weight using a Difference-in-Differences model. The dummy variable time indicates if it is time period 0 or 1. The dummy variable treat indicates if the observation is treatment or control. timeXtreat is the interaction term of time and treat. This is the standard way to set up a Difference-in-Differences regression model.

Example 4

regress harvest T1 T2 T3
iegraph T1 T2 T3 , basictitle("Treatment effect on harvest") xlabel(,angle(45)) yzero ylabel(minmax) save("Graph1.gph")

The example above shows how to save a graph to disk. It also shows that most two-way graph options can be used. In this example, the iegraph option yzero conflicts with the two-way option ylabel(minmax). In such a case the user specified option takes precedence over iegraph options like yzero.

Feedback, bug reports and contributions

Please send bug reports, suggestions and requests for clarification to dimeanalytics@worldbank.org, writing “ietoolkit iegraph” in the subject line.

You can also see the code, comment on the code, see the version history of the code, and submit additions or edits to the code through the GitHub repository for ietoolkit.

Author

All commands in ietoolkit are developed by DIME Analytics at DIME, The World Bank’s department for Development Impact Evaluations.

Main authors: Kristoffer Bjarkefur, Luiza Cardoso De Andrade, DIME Analytics, The World Bank Group