Stata - Graphing
As an example, suppose that you have a variable called height and a variable called gender. If you type graph bar height you will get a single bar showing the mean of height across all observations in your dataset (graph bar plots the mean by default). You can also type graph bar height, over(gender) to get side-by-side bars showing the mean height for men and for women.
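To try this without your own data, here is a minimal sketch using the auto dataset that ships with Stata. The variables mpg and foreign stand in for height and gender; they are substitutes chosen for illustration, not part of the example above.

```stata
* Load the example dataset installed with Stata
sysuse auto, clear

* One bar: the mean of mpg across all cars
graph bar mpg

* Side-by-side bars: mean mpg for domestic and for foreign cars
graph bar (mean) mpg, over(foreign)
```

Writing (mean) explicitly is optional, but it makes clear which statistic the bars show.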
Open a plot in a Stata graph window, then right-click it (Control-click if you are using a one-button mouse on a Macintosh). This brings up a contextual menu from which you can print the plot, save it in a variety of formats, or copy it to the clipboard. If you choose the copy option, open a word processor such as Microsoft Word and choose Paste from the Edit menu. This pastes the graph into your document.
Note: Graphs copied and pasted into Microsoft Word for Windows do not show up properly when the Word document is opened on a Macintosh. To work around this problem, save the graph from Stata as a .tif file first and then insert that file into your Word document: click the Insert menu, select Picture, and then select From File. The graph will then appear correctly whether the Word document is opened on a Macintosh or a Windows machine.
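Saving can also be done from the command line with graph export, which writes the graph currently displayed to a file in the format implied by the file extension. The sketch below assumes the auto dataset and a hypothetical file name, mpg_by_origin.tif; which formats are available can depend on your platform and Stata version.

```stata
* Draw a graph so that there is something to export
sysuse auto, clear
graph bar mpg, over(foreign)

* Write the displayed graph to a TIFF file in the working directory;
* replace overwrites the file if it already exists
graph export "mpg_by_origin.tif", replace
```

The resulting file can then be inserted into a Word document as described above.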
The syntax is very similar between box plots and bar plots, i.e., graph box VAR1, over(VAR2) with VAR1 and VAR2 suitably defined. The “over(VAR2)” part can be dropped in which case a boxplot of VAR1 for all observations will be produced. For more information, see “How do I make a bar plot?” above.
Use the scatter command:
scatter YVAR XVAR
which will make a scatter plot with YVAR on the y-axis and XVAR on the x-axis.
Use the option “title” with scatter, as in: scatter VAR1 VAR2, title("TITLE GOES HERE")
Use the following command:
twoway (scatter YVAR XVAR) (lfit YVAR XVAR)
Note that the variables must be given in this order in both clauses: YVAR first, then XVAR.
The solution is to use the graphing option “msymbol” (“m” stands for marker) in conjunction with two or more || clauses. For example, the following command will make a scatter plot of the two variables height and years with squares for men and circles for women:
scatter height years if gender=="m", msymbol(square) || scatter height years if gender=="f", msymbol(circle)
You can string together any number of clauses, one for each category you want to distinguish. To see the available symbols, use the command “palette symbolpalette”. Note that Stata also accepts abbreviations for symbol names, such as “S” for square, so “msymbol(S)” is the same as “msymbol(square)”.
Use the option “normal” when making the histogram. For example:
histogram VARNAME, normal
will add a normal density to a histogram of VARNAME. The estimated mean and variance of the density are based on sample moments.
You need to set the option “labsize” in your bar plot command. For example,
graph bar VARNAME1, over(VARNAME2,label(labsize(small)))
Here “small” is one of Stata's named text sizes. Other sizes are available, e.g., medsmall and large. To see the full list, type: graph query textsizestyle
To set the y axis scale, use the “yscale()” option at the end of your plot command. For example:
hist partners, by(sororityfrat) yscale(range(0 .4))
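This produces one histogram for each value of sororityfrat, with a y-axis that covers at least 0 to 0.4. Note that range() can only widen an axis; it never narrows the axis below the extent of the data.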
Use the “title()” option at the end of your plot command. For example: hist partners if partners < 30, title("Partners Under 30")
Use the “twoway function” command. Here is an example:
twoway function y=2 * x + 3, range(0 4)
This will plot the function f(x) = 2x + 3 from x = 0 to x = 4. If your function is defined piecewise, plot each piece over its own range and join the pieces with ||. For example,
twoway function y=2 * x + 3, range(0 4) || function y=3*x-9, range(4 6)
Use the width() option of histogram, where N is the desired width of each bin in the units of VARNAME. For example: histogram VARNAME, width(N)
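As a concrete sketch of setting the bin width, the commands below use the auto dataset that ships with Stata; the variable mpg and the values 5 and 10 are arbitrary choices for illustration.

```stata
sysuse auto, clear

* Bins that are each 5 miles per gallon wide
histogram mpg, width(5)

* Alternatively, ask for a number of bins instead of a width;
* bin() and width() are alternatives and cannot be combined
histogram mpg, bin(10)
```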
You need to use the “legend” option to do this. For example, suppose that you want to make a scatter plot of y against x in which the marker depends on a third variable, gender, and the legend names each group. You could use the following command:
scatter y x if gender=="f", msymbol(circle) || scatter y x if gender=="m", msymbol(square) legend(label(1 "Female") label(2 "Male"))
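Here label(1 "Female") sets the legend text for the first plot in the command and label(2 "Male") sets it for the second. This is easily generalized to multiple groups, different colors, and so forth.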
To plot a regression line (y on x) for one set of points (gender == "F") and another for a different set of points (gender == "M"), you can use the following command:
graph twoway (scatter y x if gender=="F") (lfit y x if gender=="F") (scatter y x if gender=="M") (lfit y x if gender=="M")
This command is easily generalized for multiple groups.
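The following sketch combines per-group fitted lines with distinct markers and a legend. It assumes the auto dataset that ships with Stata, with mpg, weight, and foreign standing in for y, x, and gender. The /// line continuations work in a do-file; in the Command window, type the command on one line.

```stata
sysuse auto, clear

* Plots are numbered in the order they appear: 1 and 3 are the scatters,
* 2 and 4 are the fitted lines
twoway (scatter mpg weight if foreign==0, msymbol(square)) ///
       (lfit    mpg weight if foreign==0)                  ///
       (scatter mpg weight if foreign==1, msymbol(circle)) ///
       (lfit    mpg weight if foreign==1),                 ///
       legend(order(1 "Domestic" 3 "Foreign"))
```

Using legend(order()) keeps only the two scatter entries in the legend, which avoids separate keys for the fitted lines.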
Suppose that you want to make a histogram for the variable yvar but, instead of individual observations on yvar, you have only the number of times yvar falls in each of k ranges. In your dataset you have a variable called yvarclass (which ranges from 1 to k) and a variable called ycount (which holds the count for each class). Then the command
twoway bar ycount yvarclass
will make the appropriate bar plot, which by construction here will be a histogram.
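A self-contained sketch of this approach follows. The five classes and their counts are invented purely for illustration, and barwidth(1) is an optional addition that makes adjacent bars touch, as in a histogram.

```stata
* Enter made-up counts for k = 5 classes
clear
input yvarclass ycount
1 4
2 9
3 15
4 7
5 2
end

* Bar heights are the counts; each bar is one class wide
twoway bar ycount yvarclass, barwidth(1)
```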