July 19, 2022
If you’re a public health graduate student, you know that time is precious. You deserve to spend your time on the things that matter to you most, but it can be hard to find the time to learn new coding skills.
Learning a few cheat codes in Stata will help you code faster and more efficiently!
What’s a time-saving cheat code?
Cheat codes are short pieces of code that you can use over and over again. They are quick to type and easy to remember, so they save you time.
Here are my top 10 cheat codes for Stata to get you coding faster and more efficiently! Save these in a Stata do-file for endless future use.
Top 10 Cheat Codes in Stata
1. Rename
Rename is a great time-saving command. This code converts all your variable names to lower-case so that you don’t need to hit Shift or Caps Lock when you type variable names in your code. Stata variable names are case-sensitive, so this saves a lot of typos too.
rename *, lower
2. Duplicates
How do you identify duplicates in your data? Stata has an easy solution with the duplicates command. Next time you get a dataset, use this command to get a quick report of duplicates across all variables.
duplicates report
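Here is a short sketch of how the same command family can narrow things down. It assumes Stata’s built-in auto example dataset, and the variable name dup is just a placeholder I made up for illustration.

```stata
* Sketch only: uses the auto dataset that ships with Stata
sysuse auto, clear

* Report duplicates on selected variables instead of all variables
duplicates report foreign rep78

* Flag duplicates: dup counts how many OTHER observations share the same values
duplicates tag foreign rep78, generate(dup)
tabulate dup

* Drop observations that are exact copies across every variable
duplicates drop
```

Tagging before dropping lets you look at the suspect rows first, which is usually safer than deleting them right away.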
3. Collapse
Need a quick table of means, sums, medians, or percentiles? Use the collapse command! A table of the mean age, years of school, and income by state using Census data looks like this:
collapse age school income, by(state)
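Keep in mind that collapse replaces the dataset in memory with the summary table. A minimal sketch of one way to get other statistics and still keep your original data, assuming the built-in auto dataset (the new variable names are placeholders):

```stata
* Sketch only: uses the auto dataset that ships with Stata
sysuse auto, clear

* preserve takes a snapshot so the full data can be brought back afterwards
preserve

* Ask for specific statistics by name; newname=oldname sets the output names
collapse (mean) mean_price=price (median) med_mpg=mpg (count) n=price, by(foreign)
list

* restore returns the original, uncollapsed dataset
restore
```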
4. Tab1
Need to look at the frequencies for many variables at once? Save time by using the tab1 command. You’ll get the frequency tables for multiple variables without many lines of code! Here’s what it would look like to get frequency tables of age, sex, gender, education, income, and county.
tab1 age sex gender education income county
5. Set more off
In Stata, you might feel like you’re always clicking the blue --more-- prompt. Did you know you can turn it off and never see it again?
Use this command so Stata shows all your results without pausing:
set more off, permanently
6. Foreach
Loops in Stata are really helpful for looping over multiple variables or values quickly. Have you ever found yourself typing “tab var1” “tab var2” “tab var3”… etc in order to get your frequencies for each variable? Loops help you save time!
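Foreach loops over a list of items. The list can be variables or, in this case, the commands for three types of regression. The opening brace goes on the same line as foreach, the commands go on their own lines, and the closing brace goes on a line by itself. Inside the loop, refer to the current item with a backtick before the name and a straight apostrophe after it.
foreach x in reg logit probit {
    `x' outcome riskfactor
}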
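To replace the repeated “tab var1”, “tab var2” typing, a loop over variables might look like the sketch below. It assumes the built-in auto dataset, and the loop names v and cmd are arbitrary.

```stata
* Sketch only: uses the auto dataset that ships with Stata
sysuse auto, clear

* Loop over variables: "of varlist" tells Stata the items are variable names
foreach v of varlist foreign rep78 {
    tabulate `v'
}

* Loop over commands: fit the same binary-outcome model two ways
foreach cmd in logit probit {
    `cmd' foreign mpg weight
}
```

For plain frequency tables, tab1 from #4 does the same job in one line. The loop pays off once you want to run several commands per variable.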
7. Forvalues
Forvalues is another type of loop command. It loops over a range of numbers, in this case 1 to 10 in steps of 1. As with foreach, the braces need their own line breaks.
forvalues i = 1(1)10 {
    display `i'
}
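Counting is more useful when the number goes into a command. A small sketch, assuming the built-in auto dataset; the rep_is prefix is a made-up name for illustration:

```stata
* Sketch only: uses the auto dataset that ships with Stata
sysuse auto, clear

* 1/5 is shorthand for 1(1)5
* Build one 0/1 indicator per category of rep78, leaving missing as missing
forvalues i = 1/5 {
    generate byte rep_is`i' = (rep78 == `i') if !missing(rep78)
}

* A different step size: 0, 5, 10, 15, 20
forvalues i = 0(5)20 {
    display `i'
}
```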
8. Recode
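Recode is a great time-saving command for recoding existing values in a variable to new values. The code below recodes the value 2 to 1 and all other nonmissing values to 0, and labels the new values yes and no. Missing values stay missing. Because the rules attach value labels, the result goes into a new variable through the generate() option; the name var_new is just a placeholder. Note that Stata commands are case-sensitive, so recode must be typed in lower-case.
recode var (2=1 yes) (nonmiss=0 no), generate(var_new)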
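Recode also accepts ranges of values, and a quick cross-tab is an easy way to check the result. A sketch assuming the built-in auto dataset; rep3 and the labels are placeholders:

```stata
* Sketch only: uses the auto dataset that ships with Stata
sysuse auto, clear

* Group a 5-point rating into 3 labeled categories in a new variable
recode rep78 (1/2 = 1 low) (3 = 2 mid) (4/5 = 3 high), generate(rep3)

* Check old against new values, including missing
tabulate rep78 rep3, missing
```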
9. (Gen)erate
Generate can be used for a wide range of tasks. But my favorite use is converting a string variable to a numeric one. In this example code, Stata generates a new variable, agenew, that is the numeric version of age, which is currently stored as a string. Values that can’t be read as numbers become missing.
gen agenew = real(age)
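Here is a tiny self-contained sketch of that conversion, next to destring, which is another built-in way to do the same job. The three age values are made up for illustration.

```stata
* Sketch only: a made-up three-row dataset with age stored as a string
clear
input str3 age
"25"
"31"
"n/a"
end

* real() returns missing when the text is not a number
generate agenew = real(age)

* destring does the same; force turns non-numeric text into missing
destring age, generate(age2) force

list
```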
10. Generate + Replace
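A common pattern is generate followed by replace. In this example code, Stata generates a new numeric variable, newvar1, that starts out missing and is then set to 1 wherever the string variable var1 equals “Yes”. Because var1 is a string, it can’t be compared with the numeric missing value (.). Observations where var1 is anything else, including empty, simply stay missing.
gen newvar1 = .
replace newvar1 = 1 if var1 == "Yes"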
Save Time With Coding
Stata is a great coding tool for getting answers to your data questions quickly. Use these time-saving cheat codes to get those answers more efficiently!
What’s your favorite cheat code in Stata? Let us know in the comments below!