lint
Title
lint – Detects and corrects bad coding practices in Stata do-files.
Syntax
lint "input_file" [using "output_file"] [, options]
The lint command operates in two modes:
- Detection mode identifies bad coding practices in Stata do-files and reports them.
- Correction mode applies corrections to a Stata do-file based on the issues detected.
In detection mode, the command displays suggested corrections and potential issues in Stata’s Results window.
Correction mode is activated when an output_file is specified with using; the command then writes a new file with the applied corrections to output_file.
Note that not all issues flagged in detection mode can be automatically corrected.
To use this command, you need Stata version 16 or higher, Python, and the Pandas Python package installed. For instructions on installing Python and integrating it with Stata, see this guide. For installing Python packages, refer to this guide.
options | Description |
---|---|
verbose | Shows a report of all bad practices and issues flagged by the command. |
nosummary | Suppresses the summary table with counts of bad practices and potential issues. |
excel(filename) | Saves the verbose output in an Excel file. |
indent(integer) | Number of whitespaces used when checking indentation (default: 4). |
linemax(integer) | Maximum number of characters in a line (default: 80). |
Options specific to the correction mode
options | Description |
---|---|
automatic | Suppresses the prompt asking users which correction to apply. |
space(integer) | Number of spaces used when replacing hard tabs with spaces for indentation (default: the value of indent(), or 4 if indent() is not specified). |
replace | Allows the command to overwrite any existing output file. |
force | Allows the input_file to be the same as output_file. Not recommended, see below. |
Description
This command is a linting tool for Stata code that helps standardize code formatting and identify bad practices.
For further discussion of linting tools, see https://en.wikipedia.org/wiki/Lint_(software).
The linting rules used in this command are based on the DIME Analytics Stata Style Guide.
All style guides are inherently subjective, and differences in preferences exist.
An exact list of the rules used by this command can be found in this article on the repkit web documentation.
See the list of rules and the DIME Analytics Stata Style Guide for a discussion on the motivations for these rules.
Options
verbose displays a detailed report of all bad practices and issues flagged by the command in the Results window. By default, only a summary table with counts for each linting rule is shown.
nosummary suppresses the summary table of flagged occurrences.
excel(filename) exports the verbose output to an Excel file at the specified location.
indent(integer) sets the number of whitespaces used when checking indentation. Default: 4.
linemax(integer) sets the maximum number of characters allowed in a single line. Default: 80.
Options specific to the correction feature
automatic suppresses the interactive prompt before applying corrections. By default, the command asks for confirmation before applying identified corrections.
space(integer) sets the number of spaces used when replacing hard tabs with spaces for indentation. Default: the value of indent(), or 4 if indent() is not specified.
replace allows overwriting an existing output file.
force allows the output file name to be the same as the input file, overwriting the original do-file. This is not recommended; see details in the section below.
Recommended workflow for correction mode
Correction mode applies fewer rules than detection mode checks. You may find that lint "input_file" flags more issues than can be automatically corrected with lint "input_file" using "output_file".
A recommended workflow is to first use detection mode to identify bad practices and, if there are only a few, correct them manually. This minimizes the risk of unintended changes. If many issues are detected, use correction mode to address as many as possible, then review and manually fix any remaining issues.
Avoid using the force option to overwrite the original input file. Instead, keep the original file as a backup to safeguard against unintended changes. Always verify that the corrected do-file produces the expected results before replacing the original file.
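The workflow above could be sketched as the following sequence. It uses only options documented on this page and the example file from the Examples section; the output name test/bad_corrected.do is just an illustrative choice, and this is one possible ordering rather than a prescribed procedure.

```stata
* Step 1: detect issues and review them line by line
lint "test/bad.do", verbose

* Step 2: if there are too many issues to fix by hand, write the
* automatic corrections to a NEW file (the original is left untouched)
lint "test/bad.do" using "test/bad_corrected.do", automatic replace

* Step 3: run detection on the corrected file to see which issues
* remain and must be fixed manually
lint "test/bad_corrected.do", verbose
```

After step 3, run the corrected do-file and compare its results with those of the original before replacing the original.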
Examples
The following examples illustrate basic uses of lint. The example file bad.do referred to below can be downloaded here.
Additional examples with more detailed explanations can be found here.
Detecting bad coding practices
- The basic usage is to point to a do-file that requires revision as follows:
lint "test/bad.do"
- Show bad coding practices line by line:
lint "test/bad.do", verbose
- Suppress the summary of bad practices:
lint "test/bad.do", nosummary
- Specify the number of whitespaces used for detecting indentation practices (default: 4):
lint "test/bad.do", indent(2)
- Specify the maximum number of characters allowed in a line when checking line length (default: 80):
lint "test/bad.do", linemax(100)
- Export the results of the line-by-line analysis to Excel:
lint "test/bad.do", excel("test_dir/detect_output.xlsx")
- You can also use this command to test all the do-files in a folder:
lint "test/"
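The detection options can presumably be combined in a single call, as is usual for Stata commands, although this page only shows them one at a time. A minimal sketch under that assumption, reusing the file paths from the examples above:

```stata
* Line-by-line report using a 2-space indentation rule and a
* 100-character line limit, with the same report saved to Excel
* (assumes these options can be combined in one call)
lint "test/bad.do", verbose indent(2) linemax(100) excel("test_dir/detect_output.xlsx")
```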
Correcting bad coding practices
The basic usage of the correction feature requires specifying the input do-file and the output do-file that will contain the corrections. If you do not include any options, the linter will ask you to confirm each type of correction before applying it:
- Basic correction use (the linter will ask what to correct):
lint "test/bad.do" using "test/bad_corrected.do"
- Correction while defining the number of spaces to replace hard tabs with:
lint "test/bad.do" using "test/bad_corrected.do", space(2)
- Automatic use (Stata will correct the file automatically):
lint "test/bad.do" using "test/bad_corrected.do", automatic
- Use the same name for the output file (note that this overwrites the input file and is not recommended):
lint "test/bad.do" using "test/bad.do", automatic force
- Replace the output file if it already exists:
lint "test/bad.do" using "test/bad_corrected.do", automatic replace
Feedback, bug reports and contributions
Read more about these commands on this repo where this package is developed. Please provide any feedback by opening an issue. PRs with suggestions for improvements are also greatly appreciated.