iekdensity

Title

iekdensity - This command plots univariate kernel density estimates by treatment assignment.

Syntax

iekdensity yvar [if] [in] [weight] , by(treatmentvar) [ stat(string) statstyle(string) effect control(numlist) effectformat(%fmt) absorb(varname) regressionoptions(string) kdensityoptions(string) color(string) twoway_options ]

Where yvar is a numeric continuous outcome variable, whose distribution is to be plotted by treatment assignment.

options                       Description
by(treatmentvar)              Treatment (dummy or factor) variable.

Content options:

options                       Description
stat(string)                  Add vertical lines for each treatment group with statistic specified.
statstyle(string)             Specify graphic style of statistic lines.
effect                        Add note with treatment effect, containing point estimate, standard error, and p-value.
control(numlist)              Specify value of variable for control group.
effectformat(%fmt)            Specify format of point estimate and standard error of the treatment effect.

Estimation options:

options                       Description
absorb(varname)               Specify fixed effects variable, if any.
regressionoptions(string)     Specify regression options.
kdensityoptions(string)       Specify kernel estimation options.

Graphic options:

options                       Description
color(string)                 Specify colors for each group.
twoway_options                Specify graph options.

Description

iekdensity plots the distribution of a variable by treatment group. It can also add further information to the graph, such as descriptive statistics and treatment effect(s).

Options

Required options:

by(treatmentvar) indicates which variable should be used to identify the treatment assignment. This can be a dummy variable (0/1) or, when there are multiple treatments, a factor variable.

Content options:

stat(string) specifies a descriptive statistic to be plotted over the kernel density graph, as a vertical line for each treatment group. Accepted statistics are: mean, p1, p5, p50, p75, p90, p95, p99, min and max.

statstyle(string) specifies the graphic style to be used for the statistic lines. Namely, you can use the lpattern(linepatternstyle) and lwidth(linewidthstyle) options. Line colors are controlled by option color(string).

effect adds a note with treatment effect, containing point estimate, standard error, and p-value, to the graph.

control(numlist) indicates which value the variable in by(treatmentvar) takes for the control group. This is usually equal to 0 when the treatment is binary, but may vary when dealing with multiple treatment arms.

effectformat(%fmt) specifies the format in which the treatment effect point estimate and standard error are displayed in the graph note.

Estimation options:

absorb(varname) indicates the fixed effects variable (for example, the experimental strata when the treatment was stratified) to be included in the estimation. This variable must be numerical.

regressionoptions(string) indicates other options to be used in the treatment effect estimation, for example suppressing the constant term (noconstant) or clustering standard errors (cluster(varname)). All options accepted by regress (or areg when option absorb() is specified) are accepted.

kdensityoptions(string) specifies kernel estimation options, such as the kernel function and the half-width of the kernel. The default kernel function is kernel(epanechnikov). Many options accepted by kdensity are allowed: kernel(kernel), bwidth(#), n(#), and all the cline_options (see help cline_options) for univariate kernel density estimation.

Graphic options:

color(string) indicates the colors to be used for each treatment arm. The colors should be listed in the order of the values in by(treatmentvar). For instance, if the treatment is binary, you can set the line colors with color(color1 color2). See colorstyle (help colorstyle).

twoway_options indicates other options to be applied to the graph, such as additional text and lines, or changes to axes, titles, and legend (see help twoway_options).

Examples

All the examples below can be run on Stata's built-in automobile data set, by first running this code:

* Open the built-in data set
sysuse auto
* Randomly assign a treatment dummy
gen treatment = (runiform() < .5)

Example 1

iekdensity price , by(treatment)

This is the most basic way to run this command. It outputs a graph with the distribution of the variable of interest (price in this case) by treatment assignment.
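For readers who want to see what such a graph amounts to, the lines below are a rough manual approximation using Stata's twoway kdensity. This is a sketch under assumptions, not the command's internal code: the seed value and the legend labels are arbitrary choices made for illustration.

```stata
* Sketch only: an approximate manual equivalent, not iekdensity's own code
sysuse auto, clear
* Arbitrary seed so that the random assignment is reproducible
set seed 20240101
gen treatment = (runiform() < .5)

* Overlay one kernel density per treatment group
twoway (kdensity price if treatment == 0)            ///
       (kdensity price if treatment == 1),           ///
       legend(order(1 "Control" 2 "Treatment"))      ///
       xtitle("Price") ytitle("Density")
```

The /// line continuations work in a do-file; when typing in the Command window, enter the twoway command on a single line.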

Example 2

iekdensity price , by(treatment) stat(p50)

This is an easy way to add descriptive information to the graph. This will output the same graph as above with the addition of two vertical lines for the medians of the control and treatment groups.
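The positions of the two vertical lines can be cross-checked by computing the group medians directly. The following is a minimal sketch using tabstat; the seed is an arbitrary value added only for reproducibility.

```stata
* Sketch: compute the medians that stat(p50) is expected to mark
sysuse auto, clear
* Arbitrary seed so that the random assignment is reproducible
set seed 20240101
gen treatment = (runiform() < .5)

* Median of price within each treatment group
tabstat price, by(treatment) statistics(p50)
```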

Example 2.1

iekdensity price , by(treatment) stat(p50) statstyle(lpattern(dash) lwidth(2))

This changes the style of the median vertical lines.

Example 2.2

iekdensity price , by(treatment) stat(p50) statstyle(lpattern(dash) lwidth(2)) color(eltblue edkblue)

This sets the colors of the control and treatment lines to different shades of blue.

Example 2.3

iekdensity price , by(treatment) stat(p50) statstyle(lpattern(dash) lwidth(2)) title(auto distribution) subtitle(By Treatment Assignment) graphregion(color(white)) plotregion(color(white))

This changes some of the graphical options.

Example 3

iekdensity price , by(treatment) stat(p50) effect

This adds a note to the graph, displaying the treatment effect in terms of point estimate, standard error and statistical significance.

Example 3.1

iekdensity price , by(treatment) stat(p50) effect effectformat(%9.0fc)

This changes the format of the treatment effect in the note. The point estimate and the standard error are now displayed without decimal places.

Example 4

iekdensity price , by(treatment) effect absorb(foreign)

The treatment effect is now derived from a regression that includes fixed effects for the variable foreign.

Example 4.1

iekdensity price , by(treatment) effect absorb(foreign) regressionoptions(cluster(foreign))

The treatment effect is now derived from a regression that includes fixed effects for the variable foreign and clusters standard errors at the foreign level.
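As a rough guide to what the note reports, the regressions below sketch the kind of estimation described under regressionoptions(string), with regress used without fixed effects and areg used with them. The exact specification run by iekdensity is an assumption here (in particular the i.treatment factor notation), and the seed is arbitrary.

```stata
* Sketch only: assumed regressions behind the treatment effect note
sysuse auto, clear
* Arbitrary seed so that the random assignment is reproducible
set seed 20240101
gen treatment = (runiform() < .5)

* No fixed effects (as in Example 3)
regress price i.treatment

* Fixed effects for foreign (as in Example 4)
areg price i.treatment, absorb(foreign)

* Fixed effects and clustered standard errors (as in Example 4.1)
areg price i.treatment, absorb(foreign) vce(cluster foreign)

* Point estimate and standard error of the treatment coefficient
display _b[1.treatment]
display _se[1.treatment]
```

With only two values of foreign, the clustered standard errors rest on two clusters and are shown purely to mirror the syntax of Example 4.1.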

Example 5

iekdensity price , by(treatment) kdensityoptions(epan2 bwidth(5))

The kernel density is estimated using the alternative Epanechnikov kernel function, and the half-width of the kernel is set to 5.

Acknowledgements

We would like to acknowledge the help in testing and proofreading we received in relation to this command and help file from (in alphabetical order): Luiza Andrade

Feedback, bug reports and contributions

Please send bug reports, suggestions, and requests for clarification to dimeanalytics@worldbank.org, writing “ietoolkit iekdensity” in the subject line.

You can also view the code, comment on it, see its version history, and submit additions or edits through the GitHub repository for ietoolkit.

Authors

All commands in ietoolkit are developed by DIME Analytics at DIME, The World Bank’s department for Development Impact Evaluations.

Main author: Matteo Ruzzante, DIME, The World Bank Group.