How to Read a Coefficient Plot R
Basic usage
The basic procedure is to compute one or more sets of estimates (e.g. regression models) and then apply coefplot to these estimation sets to draw a plot displaying the point estimates and their confidence intervals. Estimation commands store their results in the so-called e() returns (type ereturn list after running an estimation command to see a list of what has been stored). By default, coefplot retrieves the point estimates from (the first equation in) vector e(b) and computes confidence intervals from the variance estimates found in matrix e(V). See the Estimates and Confidence intervals examples for information on how to change these defaults. Furthermore, coefplot can also read results from matrices that are not stored as part of an estimation set; see Plotting results from matrices below.
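To see what coefplot picks up by default, it can help to inspect the stored results directly. The following is a minimal sketch using the auto example dataset that ships with Stata; the choice of model is only for illustration.

```stata
* Fit a model, then look at the results coefplot reads by default
sysuse auto, clear
regress price mpg trunk length turn

ereturn list          // everything the estimation command stored in e()
matrix list e(b)      // point estimates (coefficient vector)
matrix list e(V)      // variance-covariance matrix behind the default CIs
```

The entries shown by matrix list e(b) are the values that appear as markers in the default plot.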
The syntax to produce a plot of the coefficients of a single model is
coefplot [name] [, options]
where name is the name of a stored model (see help estimates store), or . or empty string for the active model. For example, to plot the point estimates and 95% confidence intervals for the most recent model, type:
. sysuse auto, clear
(1978 automobile data)
. regress price mpg trunk length turn
      Source |       SS           df       MS      Number of obs   =        74
-------------+----------------------------------   F(4, 69)        =      5.79
       Model |   159570047         4  39892511.8   Prob > F        =    0.0004
    Residual |   475495349        69  6891236.94   R-squared       =    0.2513
-------------+----------------------------------   Adj R-squared   =    0.2079
       Total |   635065396        73  8699525.97   Root MSE        =    2625.1
------------------------------------------------------------------------------
       price | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         mpg |  -186.8417   88.17601    -2.12   0.038     -362.748   -10.93533
       trunk |  -12.72642   104.8785    -0.12   0.904    -221.9534    196.5005
      length |   54.55294   35.56248     1.53   0.130    -16.39227    125.4981
        turn |  -200.3248   140.0166    -1.43   0.157    -479.6502    79.00066
       _cons |   8009.893   6205.538     1.29   0.201    -4369.817     20389.6
------------------------------------------------------------------------------
. coefplot, drop(_cons) xline(0)
Option drop(_cons) has been added to exclude the constant of the model; option xline(0) has been added to draw a reference line at zero so one can better see which coefficients are significantly different from zero.
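The same plot can also be drawn from a stored model by passing its name. A short sketch, assuming the regression above is still the active model; the name m1 is arbitrary and chosen only for this illustration.

```stata
* Store the active model under a name (m1 is a hypothetical name)
estimates store m1

* Plot the stored model by name ...
coefplot m1, drop(_cons) xline(0)

* ... or refer to the active model explicitly with "."
coefplot ., drop(_cons) xline(0)
```

Both commands should produce the same graph here, because storing the estimates does not change which model is active.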
By default, coefplot uses a horizontal layout in which the names of the coefficients are placed on the Y-axis and the estimates and their confidence intervals are plotted along the X-axis. Specify option vertical to use a vertical layout:
. coefplot, vertical drop(_cons) yline(0)
Note that, because the axes were flipped, we now have to use yline(0) instead of xline(0).
Keep and drop
By default, coefplot displays all coefficients from the first equation of a model. Alternatively, options keep() and drop() can be used to specify the elements to be displayed. For example, above, option drop(_cons) was used to exclude the constant. Furthermore, coefplot automatically excludes coefficients that are flagged as "omitted" or as "base levels". To include such coefficients in the plot, specify options omitted and baselevels.
For example, if you want to display all equations from a multinomial logit model (including the equation for the base outcome for which all coefficients are zero by definition), type:
. sysuse auto, clear
(1978 automobile data)
. gen mpp = mpg/8
. mlogit rep78 mpp i.foreign if rep>=3
(output omitted)
. coefplot, nolabel drop(_cons) keep(*:) omitted baselevels
Option keep(*:) selects all equations for display, not just the first. For detailed information on the syntax, see the description of the keep() option in the help file.
Here is a further example that illustrates how keep() can be used to select different coefficients depending on equation:
. coefplot, nolabel keep(3:*.foreign 4:mpp 5:mpp _cons) omitted baselevels
Plotting multiple models
Models as separate series
The syntax to include multiple models as separate series in the same graph is
coefplot (name [, plotopts]) (name [, plotopts]) ... [, globalopts]
where plotopts are options that apply to a single series. These options specify the information to be collected, affect the rendition of the series, and provide a label for the series in the legend. globalopts are options that apply to the overall graph, such as titles or axis labels, but may also contain any options allowed as plot options to provide defaults for the single series. A basic example is as follows:
. sysuse auto, clear
(1978 automobile data)
. regress price mpg trunk length turn if foreign==0
(output omitted)
. estimates store D
. regress price mpg trunk length turn if foreign==1
(output omitted)
. estimates store F
. coefplot D F, drop(_cons) xline(0)
To specify separate options for an individual model, enclose the model and its options in parentheses. For example, to add a label for each plot in the legend, to use alternative plot styles, and to change the marker symbol, you could type:
. coefplot (D, label(Domestic Cars) pstyle(p3)) ///
>          (F, label(Foreign Cars) pstyle(p4)) ///
>          , drop(_cons) xline(0) msymbol(S)
Option msymbol() is specified as a global option so that the same symbol is used in both series. To use different symbols, include an individual msymbol() option for each model.
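As a sketch of the per-model variant, the command below moves msymbol() into the parentheses of each series. It assumes D and F are the models stored in the example above; the symbol choices (S for squares, T for triangles) are arbitrary.

```stata
* D and F are the models stored in the example above
coefplot (D, label(Domestic Cars) pstyle(p3) msymbol(S)) ///
         (F, label(Foreign Cars) pstyle(p4) msymbol(T)) ///
         , drop(_cons) xline(0)
```

The /// line continuation works in do-files; when typing interactively, enter the command on a single line.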
Alternatively, you can also use p1(), p2(), etc. to specify options for the different series:
. coefplot D F, drop(_cons) xline(0) msymbol(S) ///
>     p1(label(Domestic Cars) pstyle(p3)) ///
>     p2(label(Foreign Cars) pstyle(p4))
Changing the offsets
coefplot offsets the plot positions of the coefficients so that the confidence spikes do not overlap. To deactivate the automatic offsets, you can specify global option nooffsets. Alternatively, custom offsets may be specified by the offset() option (if offset() is specified for at least one model, automatic offsets are disabled). The spacing between coefficients is one unit, so usually offsets between -0.5 and 0.5 make sense. For example, if you want to use smaller offsets than the default, you could type:
. sysuse auto, clear
(1978 automobile data)
. regress price mpg trunk length turn if foreign==0
(output omitted)
. estimates store D
. regress price mpg trunk length turn if foreign==1
(output omitted)
. estimates store F
. coefplot (D, offset(0.05)) (F, offset(-0.05)), drop(_cons) xline(0)
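For comparison, here is a minimal sketch of switching the automatic offsets off altogether. It assumes D and F are the models stored above.

```stata
* D and F are the models stored above.
* With nooffsets, both series are drawn at the same position,
* so markers and confidence spikes may overlap.
coefplot D F, drop(_cons) xline(0) nooffsets
```

This is mostly useful when the series are distinguished in some other way, for example by different plot types.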
Using multiple axes
If the dependent variables of the models you want to include in the graph have different scales, it can be useful to employ the axis() plot option to assign specific axes to the models. For example, to include a regression on price and a regression on weight in the same graph, type:
. sysuse auto, clear
(1978 automobile data)
. regress price mpg trunk length turn
(output omitted)
. estimates store Price
. regress weight mpg trunk length turn
(output omitted)
. estimates store Weight
. coefplot Price (Weight, axis(2)), drop(_cons) xtitle(Price) xtitle(Weight, axis(2))
Appending models
The syntax to merge multiple models into the same series is
coefplot (namelist [, plotopts]) ...
or, more precisely,
coefplot (namelist [, modelopts] \ namelist [, modelopts] \ ... [, plotopts]) ...
where modelopts are options that apply to a single model. For example, if you want to draw a graph comparing bivariate and multivariate effects, you could type:
. sysuse auto, clear
(1978 automobile data)
. regress price mpg trunk length turn
(output omitted)
. estimates store multivariate
. foreach var in mpg trunk length turn {
  2.     quietly regress price `var'
  3.     estimates store `var'
  4. }
. coefplot (mpg trunk length turn, label(bivariate)) ///
>          (multivariate) ///
>          , drop(_cons) xline(0)
When merging multiple models you may need to apply some renaming of coefficients, because coefficients that have the same name will be printed on top of each other. This can be achieved by applying the rename() option to the individual models. An alternative approach is presented in Model names as coefficient names.
Models as subgraphs
The syntax to create subgraphs is
coefplot plotlist [, subgropts] || plotlist [, subgropts] || ... [, globalopts]
where plotlist is a list of models as above and subgropts are options that apply to a single subgraph. An example with one model per subgraph is as follows:
. sysuse auto, clear
(1978 automobile data)
. regress price mpg trunk length turn if foreign==0
(output omitted)
. estimates store D
. regress price mpg trunk length turn if foreign==1
(output omitted)
. estimates store F
. coefplot D, bylabel(Domestic Cars) ///
>       || F, bylabel(Foreign Cars) ///
>       ||, drop(_cons) xline(0)
Multiple models per subgraph
An example with multiple models per subgraph is:
. sysuse auto, clear
(1978 automobile data)
. regress price mpg trunk length turn if foreign==0
(output omitted)
. estimates store D
. regress price mpg trunk length turn if foreign==1
(output omitted)
. estimates store F
. regress weight mpg trunk length turn if foreign==0
(output omitted)
. estimates store D_weight
. regress weight mpg trunk length turn if foreign==1
(output omitted)
. estimates store F_weight
. coefplot (D, label(Domestic)) (F, label(Foreign)), bylabel(Price) ///
>       || (D_weight) (F_weight) , bylabel(Weight) ///
>       ||, drop(_cons) xline(0) byopts(xrescale)
Option byopts(xrescale) has been added so that the two subgraphs can have different scales. Furthermore, the plot labels for the legend were set within the first subgraph. They could also have been specified within the second subgraph, as plot styles are recycled with each new subgraph and plot options are collected across subgraphs (unless norecycle is specified; see below).
If the subgraphs do not contain the same number of models, it may be necessary to insert "empty" models to achieve the correct alignment. This can be achieved by typing _skip:
. coefplot (D, label(Domestic)) (F, label(Foreign)), bylabel(Price) ///
>       || _skip (F_weight) , bylabel(Weight) ///
>       ||, drop(_cons) xline(0) byopts(xrescale)
Different plot styles per subgraph
As evident in the last example, coefplot recycles plot styles within each subgraph. If you want each subgraph to use its own set of styles, use the norecycle option:
. sysuse auto, clear
(1978 automobile data)
. forvalues i = 2/5 {
  2.     quietly regress price mpg trunk length turn if rep78==`i'
  3.     estimates store rep`i'
  4. }
. coefplot (rep2, label(rep78=2)) (rep3, label(rep78=3)), bylabel(Low record) ///
>       || (rep4, label(rep78=4)) (rep5, label(rep78=5)), bylabel(High record) ///
>       ||, drop(_cons) xline(0) norecycle legend(colfirst)
How subgraphs are combined
Use option byopts(byopts) to determine how subgraphs are combined. See help by_option for available byopts. For example, to use a compact style and stack the subgraphs in one column, you could type:
. sysuse auto, clear
(1978 automobile data)
. regress price mpg trunk length turn if foreign==0
(output omitted)
. estimates store D
. regress price mpg trunk length turn if foreign==1
(output omitted)
. estimates store F
. coefplot D, bylabel(Domestic Cars) || F, bylabel(Foreign Cars) ///
>     ||, drop(_cons) xline(0) byopts(compact cols(1))
Note that Stata renders the titles of the subgraphs as "subtitles". Hence, you can use the subtitle() option to change their styling:
. coefplot D, bylabel(Domestic Cars) || F, bylabel(Foreign Cars) ///
>     ||, drop(_cons) xline(0) byopts(compact cols(1)) ///
>     subtitle(, size(vlarge) margin(medium) justification(left) ///
>     color(white) bcolor(black) bmargin(top_bottom))
Subgraphs by coefficients
Sometimes it makes sense to arrange coefficients in separate subgraphs with individual scales, as the size of coefficients may vary considerably. For example, when comparing results by subgroups or estimation techniques, the focus usually lies on differences across models and less on differences within models, so that it appears natural to use individual subgraphs for the different coefficients.
Creating subgraphs by coefficients requires lengthy commands, as a separate piece of subgraph syntax has to be put together for each coefficient. To avoid this extra typing you can use the bycoefs option. Technically, bycoefs flips coefficients and subgraphs, that is, the coefficients are treated as "subgraphs" and what was specified as subgraphs is treated as "coefficients". This seems difficult to understand, but should become clear in the following example:
. sysuse auto, clear
(1978 automobile data)
. forv i = 3/5 {
  2.     quietly regress price mpg headroom weight turn if rep78==`i'
  3.     estimate store rep78_`i'
  4. }
. coefplot rep78_3 || rep78_4 || rep78_5, drop(_cons) xline(0) ///
>     bycoefs byopts(xrescale)
As some people prefer vertical mode for such a graph, you might want to specify the vertical option:
. coefplot rep78_3 || rep78_4 || rep78_5, drop(_cons) yline(0) ///
>     bycoefs byopts(yrescale) vertical
Here is an example that adds another dimension. Displayed are the means of some variables by repair record and car type:
. sysuse auto, clear
(1978 automobile data)
. forv s = 0/1 {
  2.     forv i = 3/5 {
  3.         quietly mean price mpg headroom weight if foreign==`s' & rep78==`i'
  4.         estimate store m`s'_`i'
  5.     }
  6. }
. coefplot m0_3 m1_3, bylabel(rep78=3) ///
>       || m0_4 m1_4, bylabel(rep78=4) ///
>       || m0_5 m1_5, bylabel(rep78=5) ///
>       || , bycoefs byopts(xrescale) ///
>       plotlabels("Domestic cars" "Foreign cars")
Using wildcards in model names
Instead of providing distinct model names to coefplot, you can also specify a name pattern containing * (any string) and ? (any nonzero character) wildcards. coefplot will then plot the results from all matching models.
If a name pattern is specified within parentheses, the results from the matching models will be combined into the same plot. An example is as follows:
. sysuse auto, clear
(1978 automobile data)
. foreach var of varlist mpg trunk length turn {
  2.     quietly regress price `var' if foreign==0
  3.     estimates store d_`var'
  4.     quietly regress price `var' if foreign==1
  5.     estimates store f_`var'
  6. }
. estimates dir
----------------------------------------------------------------
             |           Dependent  Number of
        Name | Command    variable     param.  Title
-------------+--------------------------------------------------
       d_mpg | regress       price          2  Linear regression
       f_mpg | regress       price          2  Linear regression
     d_trunk | regress       price          2  Linear regression
     f_trunk | regress       price          2  Linear regression
    d_length | regress       price          2  Linear regression
    f_length | regress       price          2  Linear regression
      d_turn | regress       price          2  Linear regression
      f_turn | regress       price          2  Linear regression
----------------------------------------------------------------
. coefplot (d*, label(domestic)) (f*, label(foreign)) ///
>     , drop(_cons) xline(0) title("Bivariate effects on price by car type")
This is equivalent to the following command using explicit names:
. coefplot (d_mpg d_trunk d_length d_turn, label(domestic)) ///
>          (f_mpg f_trunk f_length f_turn, label(foreign)) ///
>     , drop(_cons) xline(0) title("Bivariate effects on price by car type")
If multiple patterns are specified, options attached to a specific pattern will be applied to all matching models. Example:
. coefplot (d*, asequation(Domestic) \ f*, asequation(Foreign) \ , pstyle(p4)) ///
>     , drop(_cons) xline(0) title("Bivariate effects on price by car type")
This is equivalent to the following command using explicit names:
. coefplot (d_mpg d_trunk d_length d_turn, asequation(Domestic) \ ///
>           f_mpg f_trunk f_length f_turn, asequation(Foreign) \ ///
>           , pstyle(p4)) ///
>     , drop(_cons) xline(0) title("Bivariate effects on price by car type")
If a name pattern is specified without parentheses, the matching models will be treated as separate series:
. sysuse nlsw88, clear
(NLSW, 1988 extract)
. generate lnwage = ln(wage)
. forvalues i=0/1 {
  2.     forvalues j=1/2 {
  3.         quietly regress lnwage grade ttl_exp tenure if south==`i' & race==`j'
  4.         estimates store est`i'_`j'
  5.     }
  6. }
. coefplot est0* || est1*, drop(_cons) xline(0) ///
>     plotlabels("White" "Black") bylabels("North" "South")
This is equivalent to the following command using explicit names:
. coefplot est0_1 est0_2 || est1_1 est1_2, drop(_cons) xline(0) ///
>     plotlabels("White" "Black") bylabels("North" "South")
When using a name pattern that is expanded into multiple series, you need to use p1(), p2(), etc. to provide separate options for the different series:
. sysuse auto, clear
(1978 automobile data)
. forvalues i=3/5 {
  2.     quietly regress price mpg trunk if rep78==`i'
  3.     estimates store rep_`i'
  4. }
. coefplot rep*, drop(_cons) xline(0) ///
>     p1(pstyle(p3) label("Rep=3")) ///
>     p2(pstyle(p4) label("Rep=4")) ///
>     p3(pstyle(p5) label("Rep=5"))
How coefficients are matched
The default for coefplot is to use the first (nonzero) equation from each model and match coefficients across models by their names (ignoring equation names). For example, regress returns one (unnamed) equation containing the regression coefficients whereas tobit returns two equations, an equation named after the dependent variable containing the regression coefficients and equation / containing the variance of the error term (this is true for Stata 16 or newer; in Stata 15 or lower, tobit uses different equation names). Hence, the default for coefplot is to match the regression coefficients from the two models and ignore equation / from the Tobit model:
. webuse laborsub, clear
. regress whrs kl6 k618 wa we
      Source |       SS           df       MS      Number of obs   =       250
-------------+----------------------------------   F(4, 245)       =      5.27
       Model |  16526046.1         4  4131511.52   Prob > F        =    0.0004
    Residual |   192218058       245    784563.5   R-squared       =    0.0792
-------------+----------------------------------   Adj R-squared   =    0.0641
       Total |   208744104       249  838329.733   Root MSE        =    885.76
------------------------------------------------------------------------------
        whrs | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         kl6 |  -462.1233   124.6768    -3.71   0.000    -707.6985   -216.5481
        k618 |    -91.141   45.85001    -1.99   0.048    -181.4515   -.8305151
          wa |   -13.1577   8.334958    -1.58   0.116    -29.57502    3.259612
          we |   53.26156   26.09369     2.04   0.042     1.864986    104.6581
       _cons |   940.0593   530.7197     1.77   0.078     -105.296    1985.415
------------------------------------------------------------------------------
. estimate store regress
. tobit whrs kl6 k618 wa we, ll(0)
Refining starting values:
Grid node 0:   log likelihood = -1402.6764
Fitting full model:
Iteration 0:   log likelihood = -1402.6764
Iteration 1:   log likelihood = -1371.1868
Iteration 2:   log likelihood = -1367.1258
Iteration 3:   log likelihood = -1367.0904
Iteration 4:   log likelihood = -1367.0903
Tobit regression                                    Number of obs     =    250
                                                           Uncensored =    150
Limits: Lower = 0                                       Left-censored =    100
        Upper = +inf                                   Right-censored =      0
                                                    LR chi2(4)        =  23.03
                                                    Prob > chi2       = 0.0001
Log likelihood = -1367.0903                         Pseudo R2         = 0.0084
------------------------------------------------------------------------------
        whrs | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
-------------+----------------------------------------------------------------
         kl6 |  -827.7655   214.7521    -3.85   0.000    -1250.753   -404.7781
        k618 |  -140.0191   74.22719    -1.89   0.060     -286.221    6.182766
          wa |  -24.97918   13.25715    -1.88   0.061    -51.09118     1.13281
          we |   103.6896   41.82629     2.48   0.014     21.30625    186.0729
       _cons |   589.0002   841.5952     0.70   0.485    -1068.651    2246.652
-------------+----------------------------------------------------------------
 var(e.whrs)|    1715859   216775.7                       1337864     2200650
------------------------------------------------------------------------------
. mat list e(b)
e(b)[1,6]
          whrs:       whrs:       whrs:       whrs:       whrs:          /:
           kl6        k618          wa          we       _cons  var(e.whrs)
y1  -827.76553  -140.01914  -24.979183   103.68958   589.00023   1715858.7
. estimate store tobit
. coefplot regress tobit, xline(0)
Even though the collected results from regress and tobit have different equation names (_ and whrs, respectively), coefplot matches their coefficients, that is, the equation names are ignored. This is the default if only one equation per model is collected. If you want to take equation names into account nonetheless, you can specify the eqstrict option:
. coefplot regress tobit, xline(0) eqstrict
Although eqstrict causes equation names to be relevant, the second equation from the tobit model is still ignored. To include all equations, specify keep(*:):
. coefplot regress tobit, xline(0) keep(*:) ///
>     transform(var(e.whrs) = sqrt(@)) rename(var(e.whrs) = Sigma)
(Option transform() has been applied to take the square root of the error variance, so that the coefficients are on a similar scale. Option rename() has been specified to replace var(e.whrs) by Sigma.)
Furthermore, to match the coefficients from regress with the first equation from tobit and also print the second equation from tobit, you can use asequation() to set the equation name of regress to whrs:
. coefplot (regress, asequation(whrs)) ///
>          (tobit, keep(*:) transform(var(e.whrs) = sqrt(@)) ///
>                  rename(var(e.whrs) = Sigma)) ///
>          , xline(0)
Alternatively, you could also use eqrename(_ = whrs) to rename equation _ to whrs or eqrename(whrs = _) to rename equation whrs to _.
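A sketch of the first eqrename() variant might look as follows. It assumes regress and tobit are the models stored above and that eqrename() is given as a model option inside the parentheses of the regress series; check help coefplot for the exact placement rules.

```stata
* regress and tobit are the models stored above.
* Assumption: eqrename() is passed as a model option for regress,
* renaming its unnamed equation (_) to whrs so that it lines up
* with the first equation of the tobit model.
coefplot (regress, eqrename(_ = whrs)) ///
         (tobit, keep(*:) transform(var(e.whrs) = sqrt(@)) ///
                 rename(var(e.whrs) = Sigma)) ///
         , xline(0)
```

If this works as intended, the graph should match the asequation(whrs) version.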
The following example further illustrates how you can get rid of the equation labels:
. coefplot (regress, asequation(whrs)) ///
>          (tobit, keep(*:) transform(var(e.whrs) = sqrt(@)) ///
>                  rename(var(e.whrs) = Sigma)) ///
>          , xline(0) noeqlabels
As illustrated above, coefplot matches coefficients based on their names. Use the rename() option if you want to match coefficients that have different names in the input models. Here is an example that illustrates the effect of measurement error in regression models:
. drop _all
. matrix C = ( 1, .5, 0 \ .5, 1, .3 \ 0, .3, 1 )
. drawnorm x1 x2 x3, n(10000) corr(C)
(obs 10,000)
. generate y = 1 + x1 + x2 + x3 + 5 * invnorm(uniform())
. regress y x1 x2 x3
(output omitted)
. estimates store m1
. generate x1err = x1 + 2 * invnorm(uniform())
. regress y x1err x2 x3
(output omitted)
. estimates store m2
. coefplot (m1, label(Without error)) ///
>          (m2, label(With error)) ///
>          , xline(1) rename(x1err = x1)
We can see how measurement error in x1 distorts all slope coefficients in the model, even for variable x3 that is uncorrelated with x1 (due to the indirect correlation through x2).
Ordering and sorting
Automatic order of coefficients
In general, coefficients are plotted in the same order (from top to bottom) as they appear in the input models. However, coefficients appearing only in later models are placed after coefficients from earlier models (with the exception of _cons, which is always placed last). Have a look at the following example:
. sysuse auto, clear
(1978 automobile data)
. label variable mpg "1. mpg"
. label variable trunk "{bf:2. trunk}"
. label variable length "{bf:3. length}"
. label variable turn "4. turn"
. regress price mpg length
(output omitted)
. estimate store m1
. regress price mpg trunk turn
(output omitted)
. estimate store m2
. regress price mpg trunk length turn
(output omitted)
. estimate store m3
. coefplot m1 || m2 || m3, xline(0) drop(_cons) byopts(row(1))
Even though in the full model (m3) trunk comes before length, the order of the two coefficients is reversed in the plot. This is because length but not trunk is part of the first model. That is, because trunk only appears in the later models, it is placed after length, which already appears in the first model.
To establish an order as in the full model, you can use the orderby() option:
. coefplot m1 || m2 || m3, xline(0) drop(_cons) byopts(row(1)) orderby(3:)
Typing orderby(3:) instructs coefplot to use the model displayed in the third subgraph to determine the order of the coefficients. (Typing orderby(3) would refer to the third model in the first subgraph, which doesn't exist in the example above; orderby(3:3) would refer to the third model in the third subgraph.)
Explicit order of coefficients
Alternatively, you can specify an explicit order of coefficients using the order() option:
. coefplot m1 || m2 || m3, xline(0) drop(_cons) byopts(row(1)) ///
>     order(mpg trunk length)
Within order(), you can use the * (any string) and ? (any nonzero character) wildcards. Furthermore, you can type . to insert gaps. Example:
. label variable mpg
. label variable trunk
. label variable length
. label variable turn
. coefplot m1 || m2 || m3, xline(0) drop(_cons) byopts(row(1)) ///
>     order(. mpg . t* . length .)
In case of multiple equations, the specified club of coefficients applies to each equation:
. sysuse auto, clear
(1978 automobile data)
. gen mpp = mpg/8
. mlogit rep78 mpp if rep>=3
(output omitted)
. estimates store m1
. mlogit rep78 mpp i.foreign if rep>=3
(output omitted)
. estimates store m2
. coefplot m1 || m2, xline(0) nolabel keep(*:) order(_cons 1.foreign mpp)
Alternatively, to change the order of equations without changing the order of coefficients, type:
. coefplot m1 || m2, xline(0) nolabel keep(*:) order(5: 4:)
You can also specify a separate order for each equation or even take equations apart, as in the following example:
. coefplot m1 || m2, xline(0) nolabel keep(*:) ///
>     order(4:1.foreign 5:1.foreign 4:_cons mpp)
Sort coefficients by size
Coefficients can be ordered by size using the sort() option. Here is an example that displays average wages by industry, from lowest to highest:
. sysuse nlsw88, clear
(NLSW, 1988 extract)
. drop if inlist(industry,2)
(4 observations deleted)
. regress wage ibn.industry, nocons noheader
-----------------------------------------------------------------------------------------
                   wage | Coefficient  Std. err.      t    P>|t|     [95% conf. interval]
------------------------+----------------------------------------------------------------
               industry |
 Ag/Forestry/Fisheries  |   5.621121   1.348538     4.17   0.000     2.976592     8.26565
          Construction  |   7.564934   1.032496     7.33   0.000     5.540173    9.589695
         Manufacturing  |   7.501578   .2902381    25.85   0.000     6.932411    8.070745
Transport/Comm/Utility  |   11.44335   .5860926    19.52   0.000       10.294     12.5927
Wholesale/Retail trade  |   6.125897   .3046951    20.11   0.000     5.528379    6.723414
Finance/Ins/Real est..  |   9.843174   .4012702    24.53   0.000     9.056269    10.63008
   Business/Repair svc  |    7.51579   .5995678    12.54   0.000     6.340017    8.691564
     Personal services  |   4.401093    .564549     7.80   0.000     3.293993    5.508193
 Entertainment/Rec svc  |   6.724409   1.348538     4.99   0.000      4.07988    9.368938
 Professional services  |   7.871186   .1936975    40.64   0.000     7.491338    8.251033
 Public administration  |   9.148407   .4191131    21.83   0.000     8.326512    9.970302
-----------------------------------------------------------------------------------------
. coefplot, sort
To sort from highest to lowest, specify the descending suboption:
. coefplot, sort(, descending)
sort() has a by() suboption to select the statistic by which the coefficients are ordered. For example, to sort by estimation precision (standard errors) you could type:
. coefplot, sort(, by(se))
Note how the confidence intervals widen from top to bottom.
If a graph contains multiple series, it usually makes sense to select a specific series for sorting the coefficients (the default is to take all available estimates into account; this is equivalent to sorting coefficients based on their minimums across series). An example is as follows:
. sysuse nlsw88, clear
(NLSW, 1988 extract)
. drop if inlist(industry,2)
(4 observations deleted)
. regress wage ibn.industry, nocons noheader
(output omitted)
. estimates store overall
. regress wage ibn.industry if union==0, nocons
(output omitted)
. estimates store nonunion
. regress wage ibn.industry if union==1, nocons
(output omitted)
. estimates store union
. coefplot overall, nokey ///
>       || nonunion union, bylabel(by union status) ///
>       || , norecycle byopts(legend(position(5))) sort(1, descending)
In the example, the first series (overall means) is used for sorting. To sort by the third series (wages of the unionized; green markers in the right panel), you could type:
. coefplot overall, nokey ///
>       || nonunion union, bylabel(by union status) ///
>       || , norecycle byopts(legend(position(5))) sort(3, descending)
Because in this example the norecycle option has been specified, the series are uniquely identified across the subgraphs. If norecycle is omitted, series are repeated by subgraph. Hence, to sort by a series in a specific subgraph in this case you need to provide both the subgraph number and the series number, as in the following example:
. sysuse nlsw88, clear
(NLSW, 1988 extract)
. drop if inlist(industry,1,2,3,10)
(67 observations deleted)
. regress wage ibn.industry if union==0 & south==0, nocons
(output omitted)
. estimates store nonunionnorth
. regress wage ibn.industry if union==1 & south==0, nocons
(output omitted)
. estimates store unionnorth
. regress wage ibn.industry if union==0 & south==1, nocons
(output omitted)
. estimates store nonunionsouth
. regress wage ibn.industry if union==1 & south==1, nocons
(output omitted)
. estimates store unionsouth
. coefplot nonunionnorth unionnorth, bylabel(North) ///
>       || nonunionsouth unionsouth, bylabel(South) ///
>       || , plotlabels("nonunion" "union") sort(2:1, descending)
sort(2:1) instructs coefplot to sort the coefficients according to the first series in the second subgraph (wages of the nonunionized in the south; blue series in the right panel).
Plotting results from matrices
By default, coefplot reads results from the e() returns of stored estimation sets. To read results from a matrix, type matrix(name) instead of name when referring to the results. coefplot will then collect the point estimates from the first row of matrix name instead of from e(b) of estimation set name. If plotting results from matrices, you also have to specify how to obtain the confidence intervals, using the v(), se(), or ci() option. For example, to plot medians and their confidence intervals as computed by centile you could type:
. sysuse auto, clear
(1978 automobile data)
. matrix median = J(1,3,.)
. matrix colnames median = mpg trunk turn
. matrix CI = J(2,3,.)
. matrix colnames CI = mpg trunk turn
. matrix rownames CI = ll95 ul95
. local i 0
. foreach v of var mpg trunk turn {
  2.     local ++ i
  3.     quietly centile `v'
  4.     matrix median[1,`i'] = r(c_1)
  5.     matrix CI[1,`i'] = r(lb_1) \ r(ub_1)
  6. }
. matrix list median
median[1,3]
      mpg  trunk   turn
r1     20     14     40
. matrix list CI
CI[2,3]
            mpg      trunk       turn
ll95         19         12  37.078729
ul95         22         16         42
. coefplot matrix(median), ci(CI)
By default coefplot reads the point estimates from the first row of the specified matrix, and the CIs from the first ii rows of the matrix specified in ci()
. Depending on the layout of your results matrices, you will demand to specify the rows and columns to read from (encounter the next case, or the remark on Plotting results from matrices in the help file).
Results from estimation commands and from matrices can be combined in the same graph. Here is an example that displays means and medians of price by repair record:
. sysuse auto, clear
(1978 automobile data)
. matrix R = J(5, 3, .)
. matrix coln R = median ll95 ul95
. matrix rown R = 1 2 3 4 5
. forv i = 1/5 {
  2.     quietly centile price if rep78==`i'
  3.     matrix R[`i',1] = r(c_1), r(lb_1), r(ub_1)
  4. }
. matrix list R
R[5,3]
      median       ll95       ul95
1     4564.5       4195       4934
2       4638   3898.525    8993.35
3       4741  4484.8407  5714.9172
4     5751.5  4753.4403  7055.1933
5       5397  3930.5673  6988.0509
. mean price, over(rep78)
Mean estimation                             Number of obs = 69
---------------------------------------------------------------
              |       Mean   Std. err.     [95% conf. interval]
--------------+------------------------------------------------
c.price@rep78 |
            1 |     4564.5      369.5      3827.174    5301.826
            2 |   5967.625   1265.494      3442.372    8492.878
            3 |   6429.233   643.5995       5144.95    7713.516
            4 |     6071.5   402.9585      5267.409    6875.591
            5 |       5913   788.6821      4339.209    7486.791
---------------------------------------------------------------
. coefplot (., label(mean) rename(^.*([0-9])\..+$ = \1, regex)) ///
>          (matrix(R[,1]), ci((2 3)) label(median))             ///
>          , ytitle(Repair Record 1978) xtitle(Price)
(In this example, option rename() can be omitted in Stata 15 or lower, or if version is set to 15 or lower.)
Changing the plot type (recast)
By default, coefplot uses marker symbols for point estimates and spikes for confidence intervals. To change the plot types, use the recast() and ciopts(recast()) options. For example, to use bars for point estimates and capped spikes for confidence intervals, you could type:
. sysuse auto, clear
(1978 automobile data)
. regress price mpg trunk length if foreign==0
(output omitted)
. estimates store D
. regress price mpg trunk length if foreign==1
(output omitted)
. estimates store F
. coefplot (D, label(Domestic Cars)) (F, label(Foreign Cars)) ///
>     , drop(_cons) xline(0) recast(bar) ciopts(recast(rcap)) citop barwidth(0.3)
Option citop has been added so that the confidence spikes are plotted in front of the bars.
The recast() option can also be used to select other plot types such as connected-line plots or area plots. Furthermore, different plot types can be combined in a single graph by specifying a separate recast() option for each series. Here is an example in which proportions are displayed as bars (without confidence intervals, using the noci option) and means are displayed as a connected-line plot with capped spikes for confidence intervals:
. sysuse auto, clear
(1978 automobile data)
. proportion rep78
(output omitted)
. estimates store prop
. mean price, over(rep78)
(output omitted)
. estimates store mean
. coefplot (prop, recast(bar) noci barwidth(0.5) color(*.6))               ///
>          (mean, recast(connected) ciopts(recast(rcap)) axis(2))          ///
>     , vertical nooffsets plotlabels("Proportion" "Price")                ///
>     xtitle("Repair record") ytitle("Proportion") ytitle("Price", axis(2)) ///
>     rename(^.*([0-9])\..+$ = \1, regex)
The example also illustrates some other options. Option vertical has been specified to flip the axes; option nooffsets omits offsetting the plot positions so that bars and markers are both centered above the categories; option axis() has been applied to the second series so that proportions and means have different axes. Furthermore, option plotlabels() provides an alternative way to specify legend labels for the series (instead of specifying separate label() options).
Option rename() is applied because mean and proportion label the coefficients differently. In Stata 15 or lower, or if version is set to 15 or lower, the option can be omitted.
Using a continuous axis (at)
The coefficients provided to coefplot may represent estimates along a continuous dimension. Examples are predictive margins or marginal effects computed over values of a continuous variable. In such a case, use the at() option to provide the plot positions to coefplot. Here is an example where predictive margins of foreign are computed by level of mpg, once from a bivariate model and once from a multivariate model:
. sysuse auto, clear
(1978 automobile data)
. logit foreign mpg
(output omitted)
. margins, at(mpg=(10(2)40)) post
(output omitted)
. estimates store bivariate
. logit foreign mpg turn price
(output omitted)
. margins, at(mpg=(10(2)40)) post
(output omitted)
. estimates store multivariate
. coefplot bivariate multivariate, ytitle(Pr(foreign=1)) xtitle(Miles per Gallon) ///
>     at recast(line) lwidth(*2) ciopts(recast(rline) lpattern(dash))
The example also illustrates how to change the plot types such that the estimates and their confidence intervals are displayed as lines. Furthermore, note that automatic offsetting of plot positions is deactivated by default if the at() option is specified.
Types and placement of options
coefplot has four levels of options:
- modelopts are options that apply to a single model (or matrix). They specify the information to be collected from the model.
- plotopts are options that apply to a single plot (i.e., a single series of points displayed in the same style), possibly containing results from multiple models. They affect the rendition of markers and confidence intervals and provide a label for the plot.
- subgropts are options that apply to a single subgraph, possibly containing multiple plots.
- globalopts are options that apply to the overall graph. This also includes option byopts() to determine how subgraphs are combined.
The levels are nested in the sense that upper level options include all lower level options. That is, globalopts includes subgropts, plotopts, and modelopts; subgropts includes plotopts and modelopts; plotopts includes modelopts. However, upper level options may not be specified at a lower level.
If lower level options are specified at an upper level, they serve as defaults for all included lower level elements. For example, if you want to draw 99% and 95% confidence intervals for all included models, specify levels(99 95) as a global option:
coefplot model1 model2 model3, levels(99 95)
Options specified with an individual element override the defaults set by upper level options. For example, if you want to draw 99% and 95% confidence intervals for model 1 and model 2 and 90% confidence intervals for model 3, you could type:
coefplot model1 model2 (model3, level(90)), levels(99 95)
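A runnable version of this pattern might look as follows. This is only a sketch: the three regressions and the names m1, m2, and m3 are illustrative stand-ins for model1 to model3, fitted on the auto data used elsewhere on this page:

```stata
sysuse auto, clear

* Three nested models stored under illustrative names
quietly regress price mpg trunk
estimates store m1
quietly regress price mpg trunk length
estimates store m2
quietly regress price mpg trunk length turn
estimates store m3

* levels(99 95) at the global level is the default for every model;
* levels(90) given with m3 overrides that default for m3 only
coefplot m1 m2 (m3, levels(90)), levels(99 95) drop(_cons) xline(0)
```

In the resulting graph, m1 and m2 should show two nested intervals per coefficient, while m3 shows a single 90% interval.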
There are some fine distinctions about the placement of options and how they are interpreted. For example, if you type
coefplot m1, opts1 || m2, opts2 opts3
then opts2 and opts3 are interpreted as global options. If you want to apply opts2 only to m2 then type
coefplot m1, opts1 || m2, opts2 ||, opts3
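As a concrete sketch of this placement rule, the following uses two illustrative models (stored as D and F) with a subgraph-specific bylabel() for each and the shared options after a final "||,":

```stata
sysuse auto, clear

quietly regress price mpg trunk length if foreign==0
estimates store D
quietly regress price mpg trunk length if foreign==1
estimates store F

* bylabel() is attached to each subgraph; the options after the
* closing "||," are global and apply to the whole graph
coefplot D, bylabel(Domestic Cars) || F, bylabel(Foreign Cars) ///
    ||, drop(_cons) xline(0)
```

Without the closing "||,", the options following F would be read as global options, which is harmless for drop() and xline() but matters for options meant for one subgraph only.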
Similarly, if you type
coefplot (m1, opts1 \ m2, opts2)
then opts2
will be applied to both models. To apply opts2
only to m2
type
coefplot (m1, opts1 \ m2, opts2 \)
or, if you also want to include opts3
to be applied to both models, type
coefplot (m1, opts1 \ m2, opts2 \, opts3)
or
coefplot (m1, opts1 \ m2, opts2 \), opts3
In case of multiple subgraphs there is some ambiguity about where to specify the plot options (unless global option norecycle is specified). You can provide plot options within any of the subgraphs, as plot options are collected across subgraphs. However, in case of conflict, the plot options from the later subgraphs usually take precedence over earlier plot options. In addition, you can also use global options p1(), p2(), etc. to provide options for specific plots. In case of conflict, options specified within a plot take precedence over options provided via p1(), p2(), etc.
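For illustration, p1() and p2() could be used like this. It is a sketch under the assumption that two models are stored under the illustrative names D and F; the marker symbols are arbitrary choices:

```stata
sysuse auto, clear

quietly regress price mpg trunk length if foreign==0
estimates store D
quietly regress price mpg trunk length if foreign==1
estimates store F

* p1() and p2() pass plot options to the first and second plot
coefplot D F, drop(_cons) xline(0)          ///
    p1(label(Domestic Cars) msymbol(D))     ///
    p2(label(Foreign Cars) msymbol(S))
```

The same labels could instead be given within each plot, as in (D, label(Domestic Cars)); according to the precedence rule above, such within-plot options would win over p1() and p2() in case of conflict.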
Source: http://repec.sowi.unibe.ch/stata/coefplot/getting-started.html