Help for package plotmo

Version:

3.6.4

Title:

Plot a Model's Residuals, Response, and Partial Dependence Plots

Maintainer:

Stephen Milborrow <milbo@sonic.net>

Depends:

R (≥ 3.4.0), Formula (≥ 1.2-3), plotrix

Description:

Plot model surfaces for a wide variety of models using partial dependence plots and other techniques. Also plot model residuals and other information on the model.

Suggests:

C50 (≥ 0.1.0-24), earth (≥ 5.1.2), gbm (≥ 2.1.1), glmnet (≥ 2.0.5), glmnetUtils (≥ 1.0.3), MASS (≥ 7.3-51), mlr (≥ 2.12.1), neuralnet (≥ 1.33), partykit (≥ 1.2-2), pre (≥ 0.5.0), rpart (≥ 4.1-15), rpart.plot (≥ 3.0.8)

License:

GPL-3

URL:

http://www.milbo.users.sonic.net

NeedsCompilation:

Packaged:

2024-08-31 00:01:13 UTC; milbo

Author:

Stephen Milborrow [aut, cre]

Repository:

CRAN

Date/Publication:

2024-08-31 03:10:02 UTC

Plot a gbm model

Description

Plot a gbm model showing the training and other error curves.

Usage

plot_gbm(object=stop("no 'object' argument"),
    smooth = c(0, 0, 0, 1),
    col = c(1, 2, 3, 4), ylim = "auto",
    legend.x = NULL, legend.y = NULL, legend.cex = .8,
    grid.col = NA,
    n.trees = NA, col.n.trees ="darkgray",
    ...)

Arguments

object

The gbm model.

smooth

Four-element vector specifying if smoothing should be applied to the train, test, CV, and OOB curves respectively. When smoothing is specified, a smoothed curve is plotted and the minimum is calculated from the smoothed curve.
The default is c(0, 0, 0, 1) meaning apply smoothing only to the OOB curve (same as gbm.perf).
Note that smooth=1 (which gets recycled to c(1,1,1,1)) will smooth all the curves.

col

Four-element vector specifying the colors for the train, test, CV, and OOB curves respectively.
The default is c(1, 2, 3, 4).
Use a color of 0 to remove the corresponding curve, e.g. col=c(1,2,3,0) to not display the OOB curve.
If col=0 (which gets recycled to c(0,0,0,0)) nothing will be plotted, but plot_gbm will return the number-of-trees at the minima as usual (as described in the Value section below).

ylim

The default ylim="auto" shows more detail around the minima.
Use ylim=NULL for the full vertical range of the curves.
Else specify ylim as usual.

legend.x

The x position of the legend. The default positions the legend automatically.
Use legend.x=NA for no legend.
See the x and y arguments of xy.coords for other options, for example legend.x="topright".

legend.y

The y position of the legend.

legend.cex

The legend cex (the default is 0.8).

grid.col

Default NA. Color of the optional grid, for example grid.col=1.

n.trees

For use by plotres.
The x position of the gray vertical line indicating the n.trees passed by plotres to predict.gbm to calculate the residuals. Plotres defaults to all trees.

col.n.trees

For use by plotres.
Color of the vertical line showing the n.trees argument. Default is "darkgray".

...

Dot arguments are passed internally to plot.default.

Value

This function returns a four-element vector specifying the number of trees at the train, test, CV, and OOB minima respectively.

The minima are calculated after smoothing as specified by this function's smooth argument. By default, only the OOB curve is smoothed. The smoothing algorithm for the OOB curve differs slightly from gbm.perf, so can give a slightly different number of trees.

Note

The OOB curve

The OOB curve is artificially rescaled to force it into the plot. See Chapter 7 in the plotres vignette.

Interaction with plotres

When invoking this function via plotres, prefix any argument of plotres with w1. to tell plotres to pass the argument to this function. For example give w1.ylim=c(0,10) to plotres (plain ylim=c(0,10) in this context gets passed to the residual plots).

Acknowledgments

This function is derived from code in the gbm package authored by Greg Ridgeway and others.

Examples

if (require(gbm)) {
    n <- 100                            # toy model for quick demo
    x1 <- 3 * runif(n)
    x2 <- 3 * runif(n)
    x3 <- sample(1:4, n, replace=TRUE)
    y <- x1 + x2 + x3 + rnorm(n, 0, .3)
    data <- data.frame(y=y, x1=x1, x2=x2, x3=x3)
    mod <- gbm(y~., data=data, distribution="gaussian",
               n.trees=300, shrinkage=.1, interaction.depth=3,
               train.fraction=.8, verbose=FALSE)

    plot_gbm(mod)

    # plotres(mod)                      # plot residuals

    # plotmo(mod)                       # plot regression surfaces
}

Plot a glmnet model

Description

Plot the coefficient paths of a glmnet model.

An enhanced version of plot.glmnet.

Usage

plot_glmnet(x = stop("no 'x' argument"),
           xvar = c("rlambda", "lambda", "norm", "dev"),
           label = 10, nresponse = NA, grid.col = NA, s = NA, ...)

Arguments

x

The glmnet model.

xvar

What gets plotted along the x axis. One of:
"rlambda" (default) decreasing log lambda (lambda is the glmnet penalty)
"lambda" log lambda
"norm" L1-norm of the coefficients
"dev" percent deviance explained

The default xvar differs from plot.glmnet to allow s to be plotted when this function is invoked by plotres.

label

Default 10. Number of variable names displayed on the right of the plot. One of:
FALSE display no variables
TRUE display all variables
integer (default) number of variables to display (default is 10)

nresponse

Which response to plot for multiple response models.

grid.col

Default NA. Color of the optional grid, for example grid.col="lightgray".

s

For use by plotres. The x position of the gray vertical line indicating the lambda s passed by plotres to predict.glmnet to calculate the residuals. Plotres defaults to s=0.

...

Dot arguments are passed internally to matplot.

Use col to change the color of curves; for example col=1:4. The six default colors are intended to be distinguishable yet harmonious (to my eye at least), with adjacent colors as different as easily possible.

Note

Limitations

For multiple response models use the nresponse argument to specify which response should be plotted. (Currently each response must be plotted one by one.)

The type.coef argument of plot.glmnet is currently not supported.

Currently xvar="norm" is not supported for multiple response models (you will get an error message).

Interaction with plotres

When invoking this function via plotres, prefix any argument of plotres with w1. to tell plotres to pass the argument to this function. For example give w1.col=1:4 to plotres (plain col=1:4 in this context gets passed to the residual plots).

Acknowledgments

This function is based on plot.glmnet in the glmnet package authored by Jerome Friedman, Trevor Hastie, and Rob Tibshirani.

This function incorporates the function spread.labs from the orphaned package TeachingDemos written by Greg Snow.

Examples

if (require(glmnet)) {
    x <- matrix(rnorm(100 * 10), 100, 10)   # n=100 p=10
    y <- x[,1] + x[,2] + 2 * rnorm(100)     # y depends only on x[,1] and x[,2]
    mod <- glmnet(x, y)

    plot_glmnet(mod)

    # plotres(mod)                          # plot the residuals
}

Plot a model's response over a range of predictor values (the model surface)

Description

Plot model surfaces for a wide variety of models.

This function plots the model's response when varying one or two predictors while holding the other predictors constant (a poor man's partial-dependence plot).

It can also generate partial-dependence plots (by specifying pmethod="partdep").

Please see the plotmo vignette.

Usage

plotmo(object=stop("no 'object' argument"),
    type=NULL, nresponse=NA, pmethod="plotmo",
    pt.col=0, jitter=.5, smooth.col=0, level=0,
    func=NULL, inverse.func=NULL, nrug=0, grid.col=0,
    type2="persp",
    degree1=TRUE, all1=FALSE, degree2=TRUE, all2=FALSE,
    do.par=TRUE, clip=TRUE, ylim=NULL, caption=NULL, trace=0,
    grid.func=NULL, grid.levels=NULL, extend=0,
    ngrid1=50, ngrid2=20, ndiscrete=5, npoints=3000,
    center=FALSE, xflip=FALSE, yflip=FALSE, swapxy=FALSE, int.only.ok=TRUE,
    ...)

Arguments

object

The model object.

type

Type parameter passed to predict. For allowed values see the predict method for your object (such as predict.earth). By default, plotmo tries to automatically select a suitable value for the model in question (usually "response") but this will not always be correct. Use trace=1 to see the type argument passed to predict.

nresponse

Which column to use when predict returns multiple columns. This can be a column index, or a column name if the predict method for the model returns column names. The column name may be abbreviated (partial matching is used).

pmethod

Plotting method. One of:

"plotmo" (default) Classic plotmo plots i.e. the background variables are fixed at their medians (or first level for factors).

"partdep" Partial dependence plots, i.e. at each point the effect of the background variables is averaged.

"apartdep" Approximate partial dependence plots. Faster than "partdep" especially for big datasets. Like "partdep" but the background variables are averaged over a subset of ngrid1 cases (default 50), rather than all cases in the training data. The subset is created by selecting rows at equally spaced intervals from the training data after sorting the data on the response values (ties are randomly broken). The same background subset of ngrid1 cases is used for both degree1 and degree2 plots.
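The difference between the three methods can be sketched by hand in base R. The following is only an illustration, assuming an lm model on the built-in trees data; the helper average_over and the curve_* names are hypothetical, the subset size of 10 is chosen for brevity, and none of this is plotmo's internal code.

```r
# Illustrative sketch only (not plotmo's implementation).
mod  <- lm(Volume ~ Girth + Height, data = trees)
grid <- seq(min(trees$Girth), max(trees$Girth), length.out = 50)  # like ngrid1=50

# "plotmo" style: the background variable (Height) is pinned at its median
curve_plotmo <- predict(mod,
    newdata = data.frame(Girth = grid, Height = median(trees$Height)))

# "partdep" style: at each grid value, average the predictions over all
# background rows, with Girth overwritten by that grid value
average_over <- function(background) {
    sapply(grid, function(g) {
        background$Girth <- g
        mean(predict(mod, newdata = background))
    })
}
curve_partdep <- average_over(trees)

# "apartdep" style: the same, but averaged over a subset of rows taken at
# equally spaced positions after sorting on the response
ord       <- order(trees$Volume)
bg_subset <- trees[ord[round(seq(1, nrow(trees), length.out = 10))], ]
curve_apartdep <- average_over(bg_subset)

# For this additive linear model the curves differ only by a vertical shift
range(curve_partdep - curve_plotmo)
```

For models with interactions or nonlinear terms the curves can differ in shape, not just in level, which is when the choice of pmethod matters most.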

pt.col

The color of response points (or response sites in degree2 plots). This refers to the response y in the data used to build the model. Note that the displayed points are jittered by default (see the jitter argument).
Default is 0, display no response points.
This can be a vector, like all such arguments – for example pt.col = as.numeric(survived)+2 to color points by their survival class.
You can modify the plotted points with pt.pch, pt.cex, etc. (these get passed via plotmo's “...” argument). For example, pt.cex = weights to size points by their weight. To label the points, set pt.pch to a character vector.

jitter

Applies only if pt.col is specified.
The default is jitter=.5, automatically apply some jitter to the points. Points are jittered horizontally and vertically.
Use jitter=0 to disable this automatic jittering. Otherwise try something like jitter=1; the optimum value is data dependent.

smooth.col

Color of smooth line through the response points. (The points themselves will not be plotted unless pt.col is specified.) Default is 0, no smooth line.
Example:

    mod <- lm(Volume~Height, data=trees)
    plotmo(mod, pt.col=1, smooth.col=2)

You can adjust the amount of smoothing with smooth.f. This gets passed as f to lowess. The default is .5. Lower values make the line more wiggly.

level

Draw estimated confidence or prediction interval bands at the given level, if the predict method for the model supports them.
Default is 0, bands not plotted. Else a fraction, for example level=.95. See “Prediction intervals” in the plotmo vignette. Example:

    mod <- lm(log(Volume)~log(Girth), data=trees)
    plotmo(mod, level=.95)

You can modify the color of the bands with level.shade and level.shade2.

func

Superimpose func(x) on the plot. Example:

    mod <- lm(Volume~Girth, data=trees)
    estimated.volume <- function(x) .17 * x$Girth^2
    plotmo(mod, pt.col=2, func=estimated.volume)

The func is called for each plot with a single argument which is a dataframe with columns in the same order as the predictors in the formula or x used to build the model. Use trace=2 to see the column names and first few rows of this dataframe.

inverse.func

A function applied to the response before plotting. Useful to transform a transformed response back to the original scale. Example:

    mod <- lm(log(Volume)~., data=trees)
    plotmo(mod, inverse.func=exp)    # exp() is inverse of log()

nrug

Number of ticks in the rug along the bottom of the plot.
Default is 0, no rug.
Use nrug=TRUE for all the points.
Else specify the number of quantiles e.g. use nrug=10 for ticks at the 0, 10, 20, ..., 100 percentiles.
Modify the rug ticks with rug.col, rug.lwd, etc.
The special value nrug="density" means plot the density of the points along the bottom. Modify the density plot with density.adjust (default is .5), density.col, density.lty, etc.

grid.col

Default is 0, no grid. Else add a background grid of the specified color to the degree1 plots. The special value grid.col=TRUE is treated as "lightgray".

type2

Degree2 plot type. One of "persp" (default), "image", or "contour". You can pass arguments to these functions if necessary by using persp., image., or contour. as a prefix. Examples:

    plotmo(mod, persp.ticktype="detailed", persp.nticks=3)
    plotmo(mod, type2="image")
    plotmo(mod, type2="image", image.col=heat.colors(12))
    plotmo(mod, type2="contour", contour.col=2, contour.labcex=.4)

degree1

An index vector specifying which subset of degree1 (main effect) plots to include (after selecting the relevant predictors as described in “Which variables are plotted?” in the plotmo vignette).
Default is TRUE, meaning all (the TRUE gets recycled). To plot only the third plot use degree1=3. For no degree1 plots use degree1=0.

Note that degree1 indexes plots on the page, not columns of x. Probably the easiest way to use this argument (and degree2) is to first use the default (and possibly all1=TRUE) to plot all figures. This shows how the figures are numbered. Then replot using degree1 to select the figures you want, for example degree1=c(1,3,4).

Can also be a character vector specifying which variables to plot. Examples:
degree1="wind"
degree1=c("wind", "vis").

Variable names are matched with grep. Thus "wind" will match all variables with "wind" anywhere in their name. Use "^wind$" to match only the variable named "wind".
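The matching rule can be tried out directly with grep in base R. The predictor names below are hypothetical and serve only to show the difference between an unanchored and an anchored pattern.

```r
# Hypothetical predictor names, for illustration only
prednames <- c("wind", "windspeed", "vis", "humidity")

loose <- grep("wind", prednames, value = TRUE)     # "wind" "windspeed"
exact <- grep("^wind$", prednames, value = TRUE)   # "wind"

print(loose)
print(exact)
```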

all1

Default is FALSE. Use TRUE to plot all predictors, not just those usually selected by plotmo.
The all1 argument increases the number of plots; the degree1 argument reduces the number of plots.

degree2

An index vector specifying which subset of degree2 (interaction) plots to include.
Default is TRUE meaning all (after selecting the relevant interaction terms as described in “Which variables are plotted?” in the plotmo vignette).

Can also be a character vector specifying which variables to plot (grep is used for matching). Examples:
degree2="wind" plots all degree2 plots for the wind variable.
degree2=c("wind", "vis") plots just the wind:vis plot.

all2

Default is FALSE. Use TRUE to plot all pairs of predictors, not just those usually selected by plotmo.

do.par

One of NULL, FALSE, TRUE, or 2, as follows:

do.par=NULL. Same as do.par=FALSE if the number of plots is one; else the same as TRUE.

do.par=FALSE. Use the current par settings. You can pass additional graphics parameters in the “...” argument.

do.par=TRUE (default). Start a new page and call par as appropriate to display multiple plots on the same page. This automatically sets parameters like mfrow and mar. You can pass additional graphics parameters in the “...” argument.

do.par=2. Like do.par=TRUE but don't restore the par settings to their original state when plotmo exits, so you can add something to the plot.

clip

The default is clip=TRUE, meaning ignore very outlying predictions when determining the automatic ylim. This keeps ylim fairly compact while still covering all or nearly all the data, even if there are a few crazy predicted values. See “The ylim and clip arguments” in the plotmo vignette.
Use clip=FALSE for no clipping.

ylim

Three possibilities:
ylim=NULL (default). Automatically determine a ylim to use across all graphs.
ylim=NA. Each graph has its own ylim.
ylim=c(ymin,ymax). Use the specified limits across all graphs.

caption

Overall caption. By default create the caption automatically. Use caption="" for no caption. (Use main to set the titles of individual plots; main can be a vector.)

trace

Default is 0.
trace=1 (or TRUE) for a summary trace (shows how predict is invoked for the current object).
trace=2 for detailed tracing.
trace=-1 inhibits the messages usually issued by plotmo, like the plotmo grid:, calculating partdep, and nothing to plot messages. Error and warning messages will be printed as usual.

grid.func

Function applied to columns of the x matrix to pin the values of variables not on the axis of the current plot (the “background” variables).
The default is a function which for numeric variables returns the median and for logical and factor variables returns the value occurring most often in the training data.
Examples:

    plotmo(mod, grid.func=mean)
    grid.func <- function(x, ...) quantile(x)[2] # 25% quantile
    plotmo(mod, grid.func=grid.func)

This argument is not related to the grid.col argument.
This argument can be overridden for specific variables—see grid.levels below.
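As a rough guide to what the default does, the documented behavior (median for numeric variables, most common value for logicals and factors) could be written as follows. The function name default_grid_value is hypothetical and this is a sketch of the description above, not plotmo's actual code.

```r
# Sketch of the documented default behaviour (hypothetical helper)
default_grid_value <- function(x) {
    if (is.numeric(x))
        return(median(x, na.rm = TRUE))
    # logicals and factors: the value occurring most often
    counts <- table(x)
    most_common <- names(counts)[which.max(counts)]
    if (is.logical(x)) {
        as.logical(most_common)
    } else if (is.factor(x)) {
        factor(most_common, levels = levels(x))
    } else {
        most_common
    }
}

default_grid_value(c(1, 2, 10))                 # numeric: the median, 2
default_grid_value(factor(c("a", "b", "b")))    # factor: most common level, b
default_grid_value(c(TRUE, FALSE, TRUE))        # logical: most common value, TRUE
```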

grid.levels

Default is NULL. Else a list of variables and their fixed value to be used when the variable is not on the axis. Supersedes grid.func for variables in the list. Names and values can be abbreviated, partial matching is used. Example:

    plotmo(mod, grid.levels=list(sex="m", age=21))

extend

Amount to extend the horizontal axis in each plot. The default is 0, do not extend (i.e. use the range of the variable in the training data). Else something like extend=.5, which will extend both the lower and upper xlim of each plot by 50%.
This argument is useful if you want to see how the model performs on data that is beyond the training data; for example, you want to see how a time-series model performs on future data.
This argument is currently implemented only for degree1 plots. Factors and discrete variables (see the ndiscrete argument) are not extended.

ngrid1

Number of equally spaced x values in each degree1 plot. Default is 50. Also used as the number of background cases for pmethod="apartdep".

ngrid2

Grid size for degree2 plots (ngrid2 x ngrid2 points are plotted). Default is 20.
The default will sometimes be too small for contour and image plots.
With large ngrid2 values, persp plots look better with persp.border=NA.

npoints

Number of response points to be plotted (a sample of npoints points is plotted). Applies only if pt.col is specified.
The default is 3000 (not all, to avoid overplotting on large models). Use npoints=TRUE or -1 for all points.

ndiscrete

Default 5 (a somewhat arbitrary value). Variables with no more than ndiscrete unique values are plotted as quantized in plots (a staircase rather than a curve).
Factors are always considered discrete. Variables with non-integer values are always considered non-discrete.
Use ndiscrete=0 if you want to plot the response for a variable with just a few integer values as a line or a curve, rather than a staircase.
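The rule described above can be sketched as a small predicate. The helper is_discrete is hypothetical and only mirrors the wording of this section; plotmo's internal test may differ in detail.

```r
# Hypothetical helper sketching the documented rule (not plotmo's code)
is_discrete <- function(x, ndiscrete = 5) {
    if (is.factor(x))
        return(TRUE)                        # factors are always discrete
    if (any(x != floor(x), na.rm = TRUE))
        return(FALSE)                       # non-integer values: never discrete
    length(unique(x)) <= ndiscrete          # few distinct integer values
}

is_discrete(c(1, 2, 3, 2, 1))       # TRUE: three unique integer values
is_discrete(c(1, 2.5, 3))           # FALSE: has a non-integer value
is_discrete(1:10)                   # FALSE: more than ndiscrete unique values
is_discrete(1:3, ndiscrete = 0)     # FALSE: ndiscrete=0 turns off the staircase
```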

int.only.ok

Plot the model even if it is an intercept-only model (no predictors are used in the model). Do this by plotting a single degree1 plot for the first predictor.
The default is TRUE. Use int.only.ok=FALSE to instead issue an error message for intercept-only models.

center

Center the plotted response. Default is FALSE.

xflip

Default FALSE. Use TRUE to flip the direction of the x axis. This argument (and yflip and swapxy) is useful when comparing to a plot from another source and you want the axes to be the same. (Note that xflip and yflip cannot be used on the persp plots, a limitation of the persp function.)

yflip

Default FALSE. Use TRUE to flip the direction of the y axis of the degree2 graphs.

swapxy

Default FALSE. Use TRUE to swap the x and y axes on the degree2 graphs.

...

Dot arguments are passed to the predict and plot functions. Dot argument names, whether prefixed or not, should be specified in full and not abbreviated.

“Prefixed” arguments are passed directly to the associated function. For example the prefixed argument persp.col="pink" passes col="pink" to persp(), overriding the global col setting. To send an argument to predict whose name may alias with plotmo's arguments, use predict. as a prefix. Example:

    plotmo(mod, s=1)           # error:  arg matches multiple formal args
    plotmo(mod, predict.s=1)   # ok now: s=1 will be passed to predict()

The prefixes recognized by plotmo are:


`predict.`	passed to the `predict` method for the model
`degree1.`	modifies degree1 plots e.g. `degree1.col=3, degree1.lwd=2`
`persp.`	arguments passed to `persp`
`contour.`	arguments passed to `contour`
`image.`	arguments passed to `image`
`pt.`	see the `pt.col` argument (arguments passed to `points` and `text`)
`smooth.`	see the `smooth.col` argument (arguments passed to `lines` and `lowess`)
`level.`	see the `level` argument (`level.shade`, `level.shade2`, and arguments for `polygon`)
`func.`	see the `func` argument (arguments passed to `lines`)
`rug.`	see the `nrug` argument (`rug.jitter`, and arguments passed to `rug`)
`density.`	see the `nrug` argument (`density.adjust`, and arguments passed to `lines`)
`grid.`	see the `grid.col` argument (arguments passed to `grid`)
`caption.`	see the `caption` argument (arguments passed to `mtext`)
`par.`	arguments passed to `par` (only necessary if a `par` argument name clashes with a `plotmo` argument)
`prednames.`	Use `prednames.abbreviate=FALSE` for full predictor names in graph axes.

The cex argument is relative, so specifying cex=1 is the same as not specifying cex.

For backwards compatibility, some dot arguments are supported but not explicitly documented. For example, the old argument col.response is no longer in plotmo's formal argument list, but is still accepted and treated like the new argument pt.col.

Note

In general this function won't work on models that don't save the call and data with the model in a standard way. For further discussion please see “Accessing the model data” in the plotmo vignette. Package authors may want to look at Guidelines for S3 Regression Models.

By default, plotmo tries to use sensible model-dependent defaults when calling predict. Use trace=1 to see the arguments passed to predict. You can change the defaults by using plotmo's type argument, and by using dot arguments prefixed with predict. (see the description of “...” above).

Examples

if (require(rpart)) {
    data(kyphosis)
    rpart.model <- rpart(Kyphosis~., data=kyphosis)
    # pass type="prob" to plotmo's internal calls to predict.rpart, and
    # select the column named "present" from the matrix returned by predict.rpart
    plotmo(rpart.model, type="prob", nresponse="present")
}
if (require(earth)) {
    data(ozone1)
    earth.model <- earth(O3 ~ ., data=ozone1, degree=2)
    plotmo(earth.model)
    # plotmo(earth.model, pmethod="partdep") # partial dependence plots
}

Ignore

Description

Miscellaneous functions exported for internal use by earth and other packages. You can ignore these.

Usage

# for earth
plotmo_fitted(object, trace, nresponse, type, ...)
plotmo_cum(rinfo, info, nfigs=1, add=FALSE,
           cum.col1, grid.col, jitter=0, cum.grid="percentages", ...)
plotmo_nresponse(y, object, nresponse, trace, fname, type="response")
plotmo_rinfo(object, type=NULL, residtype=type, nresponse=1,
    standardize=FALSE, delever=FALSE, trace=0,
    leverage.msg="returned as NA", expected.levs=NULL, labels.id=NULL, ...)
plotmo_predict(object, newdata, nresponse,
    type, expected.levs, trace, inverse.func=NULL, ...)
plotmo_prolog(object, object.name, trace, ...)
plotmo_resplevs(object, plotmo_fitted, yfull, trace)
plotmo_rsq(object, newdata, trace=0, nresponse=NA, type=NULL, ...)
plotmo_standardizescale(object)
plotmo_type(object, trace, fname="plotmo", type, ...)
plotmo_y(object, nresponse=NULL, trace=0, expected.len=NULL,
    resp.levs=NULL, convert.glm.response=!is.null(nresponse))
## Default S3 method:
plotmo.pairs(object, x, nresponse, trace, all2, ...)
## Default S3 method:
plotmo.singles(object, x, nresponse, trace, all1, ...)
## Default S3 method:
plotmo.y(object, trace, naked, expected.len, ...)
# plotmo methods
plotmo.convert.na.nresponse(object, nresponse, yhat, type="response", ...)
plotmo.pairs(object, x, nresponse, trace, all2, ...)
plotmo.pint(object, newdata, type, level, trace, ...)
plotmo.predict(object, newdata, type, ..., TRACE)
plotmo.prolog(object, object.name, trace, ...)
plotmo.residtype(object, ..., TRACE)
plotmo.singles(object, x, nresponse, trace, all1, ...)
plotmo.type(object, ..., TRACE)
plotmo.x(object, trace, ...)
plotmo.y(object, trace, naked, expected.len, nresponse=1, ...)

Arguments

...

add

all1

all2

convert.glm.response

cum.col1

cum.grid

delever

expected.len

expected.levs

fname

grid.col

info

inverse.func

jitter

labels.id

level

leverage.msg

naked

newdata

nfigs

nresponse

object.name

object

plotmo_fitted

residtype

resp.levs

rinfo

standardize

TRACE

trace

type

x

yfull

yhat

y

Plot the residuals of a regression model

Description

Plot the residuals of a regression model.

Please see the plotres vignette.

Usage

plotres(object = stop("no 'object' argument"),
    which = 1:4, info = FALSE, versus = 1,
    standardize = FALSE, delever = FALSE, level = 0,
    id.n = 3, labels.id = NULL, smooth.col = 2,
    grid.col = 0, jitter = 0,
    do.par = NULL, caption = NULL, trace = 0,
    npoints = 3000, center = TRUE,
    type = NULL, nresponse = NA,
    object.name = quote.deparse(substitute(object)), ...)

Arguments

object

The model object.

which

Which plots to draw. Default is 1:4.

1 Model plot. What gets plotted here depends on the model class. For example, for earth models this is a model selection plot. Nothing will be displayed for some models. For details, please see the plotres vignette.

2 Cumulative distribution of abs residuals

3 Residuals vs fitted

4 QQ plot

5 Abs residuals vs fitted

6 Sqrt abs residuals vs fitted

7 Abs residuals vs log fitted

8 Cube root of the squared residuals vs log fitted

9 Log abs residuals vs log fitted

info

Default is FALSE. Use TRUE to print extra information as follows:

i) Display the distribution of the residuals along the bottom of the plot.

ii) Display the training R-Squared.

iii) Display the Spearman Rank Correlation of the absolute residuals with the fitted values. Actually, correlation is measured against the absolute values of whatever is on the horizontal axis — by default this is the fitted response, but may be something else if the versus argument is used.

iv) In the Cumulative Distribution plot (which=2), display additional information on the quantiles.

v) Only for which=5 or 9. Regress the absolute residuals against the fitted values and display the regression slope. Robust linear regression is used via rlm in the MASS package.

vi) Add various annotations to the other plots.
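The statistics in items iii) and v) can be approximated by hand. The sketch below assumes an lm model on the built-in trees data and the default versus=1 (fitted values on the horizontal axis); the variable names are illustrative and the numbers need not match plotres exactly.

```r
mod <- lm(Volume ~ Girth + Height, data = trees)
d <- data.frame(absres = abs(residuals(mod)), fit = fitted(mod))

# iii) Spearman rank correlation of the absolute residuals with the
#      absolute values of whatever is on the horizontal axis
rho <- cor(d$absres, abs(d$fit), method = "spearman")
print(rho)

# v) slope of a robust regression of the absolute residuals on the fitted
#    values (uses rlm from MASS, if that package is installed)
if (requireNamespace("MASS", quietly = TRUE)) {
    slope <- coef(MASS::rlm(absres ~ fit, data = d))[["fit"]]
    print(slope)
}
```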

versus

What do we plot the residuals against? One of:

1 Default. Plot the residuals versus the fitted values (or the log values when which=7 to 9).

2 Residuals versus observation number, after observations have been sorted on the fitted value. Same as versus=1, except that the residuals are spaced uniformly along the horizontal axis.

3 Residuals versus the response.

4 Residuals versus the hat leverages.

"b:" Residuals versus the basis functions. Currently only supported for earth, mda::mars, and gam::gam models. An optional regex can follow the "b:" to specify a subset of the terms, e.g. versus="b:wind" will plot terms with "wind" in their name.

Else a character vector specifying which predictors to plot against.
Example 1: versus="" plots against all predictors (since the regex versus="" matches anything).
Example 2: versus=c("wind", "vis") plots predictors with wind or vis in their name.
Example 3: versus=c("wind|vis") equivalent to the above.
Note: These are regexs. Thus versus="wind" will match all variables that have "wind" in their names. Use "^wind$" to match only the variable named "wind".

standardize

Default is FALSE. Use TRUE to standardize the residuals. Only supported for some models; an error message will be issued otherwise.
Each residual is divided by se_i * sqrt(1 - h_ii), where se_i is the standard error of prediction and h_ii is the leverage (the diagonal entry of the hat matrix). When the variance model holds, the standardized residuals are homoscedastic with unity variance.
The leverages are obtained using hatvalues. (For earth models the leverages are for the linear regression of the response on the basis matrix bx.) A standardized residual with a leverage of 1 is plotted as a star on the axis.
This argument applies to all plots where the residuals are used (including the cumulative distribution and QQ plots, and to annotations displayed by the info argument).

delever

Default is FALSE. Use TRUE to “de-lever” the residuals. Only supported for some models; an error message will be issued otherwise.
Each residual is divided by sqrt(1 - h_ii). See the standardize argument for details.
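The standardize and delever scalings can be reproduced by hand for a simple case. The sketch below assumes an lm model, for which se_i is taken to be the (constant) residual standard error; that assumption is specific to this example and other model classes estimate se_i differently.

```r
mod <- lm(Volume ~ Girth + Height, data = trees)
res <- residuals(mod)
h   <- hatvalues(mod)             # leverages h_ii
se  <- summary(mod)$sigma         # assumption: constant se_i for an lm

delevered    <- res / sqrt(1 - h)           # the delever scaling
standardized <- res / (se * sqrt(1 - h))    # the standardize scaling

# For an lm this agrees with stats::rstandard
print(all.equal(standardized, rstandard(mod)))
```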

level

Draw estimated confidence or prediction interval bands at the given level, if the model supports them.
Default is 0, bands not plotted. Else a fraction, for example level=0.90. Example:

    mod <- lm(log(Volume)~log(Girth), data=trees)
    plotres(mod, level=.90)

You can modify the color of the bands with level.shade and level.shade2.
See also “Prediction intervals” in the plotmo vignette (but note that plotmo needs prediction intervals on new data, whereas plotres requires only that the model supports prediction intervals on the training data).

id.n

The largest id.n residuals will be labeled in the plot. Default is 3. The special values TRUE and -1 mean all.
If id.n is negative (but not -1) the id.n most positive and most negative residuals will be labeled in the plot.
A current implementation restriction is that id.n is ignored when there are more than ten thousand cases.

labels.id

Residual labels. Only used if id.n > 0. Default is the case names, or the case numbers if the cases are unnamed.

smooth.col

Color of the smooth line through the residual points. Default is 2, red. Use smooth.col=0 for no smooth line.
You can adjust the amount of smoothing with smooth.f. This gets passed as f to lowess. The default is 2/3. Lower values make the line more wiggly.
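To get a feel for the effect of smooth.f, the same kind of smooth can be drawn directly with lowess in base R. This is a sketch on an lm model of the built-in trees data; the value f=.3 is an arbitrary choice for comparison.

```r
mod <- lm(Volume ~ Girth + Height, data = trees)
fit <- fitted(mod)
res <- residuals(mod)

smooth_default <- lowess(fit, res, f = 2/3)   # same f as the plotres default
smooth_wiggly  <- lowess(fit, res, f = .3)    # lower f: follows the points more closely

plot(fit, res, xlab = "Fitted", ylab = "Residuals")
lines(smooth_default, col = 2)
lines(smooth_wiggly, col = 4, lty = 2)
```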

grid.col

Default is 0, no grid. Else add a background grid of the specified color to the plots. The special value grid.col=TRUE is treated as "lightgray".

jitter

Default is 0, no jitter. Passed as factor to jitter to jitter the plotted points horizontally and vertically. Useful for discrete variables and responses, where the residual points tend to be overlaid.

do.par

One of NULL, FALSE, TRUE, or 2, as follows:

do.par=NULL (default). Same as do.par=FALSE if the number of plots is one; else the same as TRUE.

do.par=FALSE. Use the current par settings. You can pass additional graphics parameters in the “...” argument.

do.par=TRUE. Start a new page and call par as appropriate to display multiple plots on the same page. This automatically sets parameters like mfrow and mar. You can pass additional graphics parameters in the “...” argument.

do.par=2. Like do.par=TRUE but don't restore the par settings to their original state when plotres exits, so you can add something to the plot.

caption

Overall caption. By default create the caption automatically. Use caption="" for no caption. (Use main to set the title of an individual plot.)

trace

Default is 0.
trace=1 (or TRUE) for a summary trace (shows how predict and friends are invoked for the model).
trace=2 for detailed tracing.

npoints

Number of points to be plotted. A sample of npoints is taken; the sample includes the biggest twenty or so residuals.
The default is 3000 (not all, to avoid overplotting on large models). Use npoints=TRUE or -1 for all points.

center

Default is TRUE, meaning center the horizontal axis in the residuals plot, so asymmetry in the residual distribution is more obvious.

type

Type parameter passed first to residuals and if that fails to predict. For allowed values see the residuals and predict methods for your object (such as residuals.rpart or predict.earth). By default, plotres tries to automatically select a suitable value for the model in question (usually "response"), but this will not always be correct. Use trace=1 to see the type argument passed to residuals and predict.

nresponse

Which column to use when residuals or predict returns multiple columns. This can be a column index or column name (which may be abbreviated, partial matching is used).

object.name

The name of the object for error and trace messages. Used internally by plot.earth.

...

Dot arguments are passed to the plot functions. Dot argument names, whether prefixed or not, should be specified in full and not abbreviated.

“Prefixed” arguments are passed directly to the associated function. For example the prefixed argument pt.col="pink" passes col="pink" to points(), overriding the global col setting. The prefixes recognized by plotres are:

`residuals.`	passed to `residuals`
`predict.`	passed to `predict` (`predict` is called if the call to `residuals` fails)
`w1.`	sent to the model-dependent plot for `which=1` e.g. `w1.col=2`
`pt.`	modify the displayed points e.g. `pt.col=as.numeric(survived)+2` or `pt.cex=.8`.
`smooth.`	modify the smooth line e.g. `smooth.col=0` or `smooth.f=.5`.
`level.`	modify the interval bands, e.g. `level.shade="gray"` or `level.shade2="lightblue"`
`legend.`	modify the displayed `legend` e.g. `legend.cex=.9`
`cum.`	modify the Cumulative Distribution plot (arguments for `plot.stepfun`)
`qq.`	modify the QQ plot, e.g. `qq.pch=1`
`qqline.`	modify the `qqline` in the QQ plot, e.g. `qqline.col=0`
`label.`	modify the point labels, e.g. `label.cex=.9` or `label.font=2`
`cook.`	modify the Cook's Distance annotations. This affects only the leverage plot (`versus=4`) for `lm` models with `standardize=TRUE`, e.g. `cook.levels=c(.5, .8, 1)` or `cook.col=2`.
`caption.`	modify the overall caption (see the `caption` argument) e.g. `caption.col=2`.
`par.`	arguments for `par` (only necessary if a `par` argument name clashes with a `plotres` argument)

The cex argument is relative, so specifying cex=1 is the same as not specifying cex.

For backwards compatibility, some dot arguments are supported but not explicitly documented.

Value

If the which=1 plot was plotted, the return value of that plot (model dependent).

Else if the which=3 plot was plotted, return list(x,y) where x and y are the coordinates of the points in that plot (but without jittering even if the jitter argument was used).

Else return NULL.
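A short usage sketch of the return value follows. It assumes the plotmo package is installed and uses an lm model, for which only the residuals versus fitted plot is requested, so the list of coordinates described above is what comes back.

```r
if (requireNamespace("plotmo", quietly = TRUE)) {
    mod <- lm(Volume ~ ., data = trees)
    xy <- plotmo::plotres(mod, which = 3)   # residuals vs fitted plot only
    str(xy)                                 # the coordinates of the plotted points
}
```

The returned coordinates can be used to add annotations of your own to the plot, for example in combination with do.par=2.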

Note

This function is designed primarily for displaying standard (response minus fitted) residuals for models with a single continuous response, although it will work for a few other models.

In general this function won't work on models that don't save the call and data with the model in a standard way. It uses the same underlying mechanism to access the model data as plotmo. For further discussion please see “Accessing the model data” in the plotmo vignette. Package authors may want to look at Guidelines for S3 Regression Models.

Examples

# we use lm in this example, but plotres is more useful for models
# that don't have a function like plot.lm for plotting residuals

lm.model <- lm(Volume~., data=trees)

plotres(lm.model)
