Version: | 1.1.5 |
Title: | Functions for Base Types and Core R and 'Tidyverse' Features |
Description: | A toolbox for working with base types, core R features like the condition system, and core 'Tidyverse' features like tidy evaluation. |
License: | MIT + file LICENSE |
ByteCompile: | true |
Biarch: | true |
Depends: | R (≥ 3.5.0) |
Imports: | utils |
Suggests: | cli (≥ 3.1.0), covr, crayon, fs, glue, knitr, magrittr, methods, pillar, rmarkdown, stats, testthat (≥ 3.0.0), tibble, usethis, vctrs (≥ 0.2.3), withr |
Enhances: | winch |
Encoding: | UTF-8 |
RoxygenNote: | 7.3.2 |
URL: | https://rlang.r-lib.org, https://github.com/r-lib/rlang |
BugReports: | https://github.com/r-lib/rlang/issues |
Config/testthat/edition: | 3 |
Config/Needs/website: | dplyr, tidyverse/tidytemplate |
NeedsCompilation: | yes |
Packaged: | 2025-01-17 08:43:17 UTC; lionel |
Author: | Lionel Henry [aut, cre], Hadley Wickham [aut], mikefc [cph] (Hash implementation based on Mike's xxhashlite), Yann Collet [cph] (Author of the embedded xxHash library), Posit, PBC [cph, fnd] |
Maintainer: | Lionel Henry <lionel@posit.co> |
Repository: | CRAN |
Date/Publication: | 2025-01-17 14:30:02 UTC |
rlang: Functions for Base Types and Core R and 'Tidyverse' Features
Description
A toolbox for working with base types, core R features like the condition system, and core 'Tidyverse' features like tidy evaluation.
Author(s)
Maintainer: Lionel Henry lionel@posit.co
Authors:
Hadley Wickham hadley@posit.co
Other contributors:
mikefc mikefc@coolbutuseless.com (Hash implementation based on Mike's xxhashlite) [copyright holder]
Yann Collet (Author of the embedded xxHash library) [copyright holder]
Posit, PBC [copyright holder, funder]
See Also
Useful links:
https://rlang.r-lib.org
https://github.com/r-lib/rlang
Report bugs at https://github.com/r-lib/rlang/issues
Signal an error, warning, or message
Description
These functions are equivalent to base functions base::stop(), base::warning(), and base::message(). They signal a condition (an error, warning, or message respectively) and make it easy to supply condition metadata:

- Supply class to create a classed condition that can be caught or handled selectively, allowing for finer-grained error handling.

- Supply metadata with named ... arguments. This data is stored in the condition object and can be examined by handlers.

- Supply call to inform users about which function the error occurred in.

- Supply another condition as parent to create a chained condition.
Certain components of condition messages are formatted with unicode symbols and terminal colours by default. These aspects can be customised, see Customising condition messages.
Usage
abort(
message = NULL,
class = NULL,
...,
call,
body = NULL,
footer = NULL,
trace = NULL,
parent = NULL,
use_cli_format = NULL,
.inherit = TRUE,
.internal = FALSE,
.file = NULL,
.frame = caller_env(),
.trace_bottom = NULL,
.subclass = deprecated()
)
warn(
message = NULL,
class = NULL,
...,
body = NULL,
footer = NULL,
parent = NULL,
use_cli_format = NULL,
.inherit = NULL,
.frequency = c("always", "regularly", "once"),
.frequency_id = NULL,
.subclass = deprecated()
)
inform(
message = NULL,
class = NULL,
...,
body = NULL,
footer = NULL,
parent = NULL,
use_cli_format = NULL,
.inherit = NULL,
.file = NULL,
.frequency = c("always", "regularly", "once"),
.frequency_id = NULL,
.subclass = deprecated()
)
signal(message = "", class, ..., .subclass = deprecated())
reset_warning_verbosity(id)
reset_message_verbosity(id)
Arguments
message |
The message to display, formatted as a bulleted
list. The first element is displayed as an alert bullet
prefixed with "!" by default. If a message is not supplied, it is expected that the message is
generated lazily through cnd_header() and cnd_body() methods; in that case, class must be supplied. If a function, it is stored in the header field of the condition and acts as a lazily generated message. |
class |
Subclass of the condition. |
... |
Additional data to be stored in the condition object.
If you supply condition fields, you should usually provide a
class argument as well. |
call |
The execution environment of a currently running
function, e.g. call = caller_env(). The corresponding function call is retrieved and mentioned in error messages as the source of the error. You only need to supply call when throwing a condition from a helper function that wouldn't be relevant to mention in the message. Can also be NULL or a defused function call to respectively not display any call or hard-code a code to display. For more information about error calls, see Including function calls in error messages. |
body , footer |
Additional bullets. |
trace |
A trace object created by trace_back(). |
parent |
Supply parent when you rethrow an error from a condition handler, e.g. with try_fetch(). The parent condition is stored in the new condition object and its message is displayed with a "Caused by error:" prefix.
For more information about error chaining, see Including contextual information with error chains. |
use_cli_format |
Whether to format message lazily using cli if available. If set to FALSE, message is formatted with format_error_bullets() instead. |
.inherit |
Whether the condition inherits from parent. If FALSE, the parent condition is still stored in the condition object but is not matched by cnd_inherits() or try_fetch(). |
.internal |
If TRUE, a footer bullet is added to message telling the user that the error is internal and inviting them to report it to the package authors. |
.file |
A connection or a string specifying where to print the
message. The default depends on the context, see the stdout and stderr section. |
.frame |
The throwing context. Used as default for
.trace_bottom. |
.trace_bottom |
Used in the display of simplified backtraces
as the last relevant call frame to show. This way, the irrelevant
parts of backtraces corresponding to condition handling
(tryCatch(), try_fetch(), etc.) are hidden by default. Defaults to .frame. |
.subclass |
Deprecated. Use class instead. |
.frequency |
How frequently should the warning or message be
displayed? By default ("always"), it is displayed at each call. If "regularly", it is displayed once every 8 hours. If "once", it is displayed once per session. |
.frequency_id |
A unique identifier for the warning or
message. This is used when .frequency is supplied to identify recurring conditions. It must be supplied if .frequency is not set to "always". |
id |
The identifying string of the condition that was supplied
as .frequency_id to warn() or inform(). |
Details
- abort() throws subclassed errors, see "rlang_error".

- warn() temporarily sets the warning.length global option to the maximum value (8170), unless that option has been changed from the default value. The default limit (1000 characters) is especially easy to hit when the message contains a lot of ANSI escapes, as created by the crayon or cli packages.
Error prefix
As with base::stop(), errors thrown with abort() are prefixed
with "Error: ". Calls and source references are included in the
prefix, e.g. "Error in my_function() at myfile.R:1:2:". There
are a few cosmetic differences:

- The call is stripped from its arguments to keep it simple. It is then formatted using the cli package if available.

- A line break is inserted between the prefix and the message when the former is too long. When a source location is included, a line break is always inserted.

If your throwing code is highly structured, you may have to
explicitly inform abort() about the relevant user-facing call to
include in the prefix. Internal helpers are rarely relevant to end
users. See the call argument of abort().
Backtrace
abort() saves a backtrace in the trace component of the error
condition. You can print a simplified backtrace of the last error
by calling last_error() and a full backtrace with
summary(last_error()). Learn how to control what is displayed
when an error is thrown with rlang_backtrace_on_error.
Muffling and silencing conditions
Signalling a condition with inform() or warn() displays a
message in the console. These messages can be muffled as usual with
base::suppressMessages() or base::suppressWarnings().

inform() and warn() messages can also be silenced with the
global options rlib_message_verbosity and
rlib_warning_verbosity. These options take the values:

- "default": Verbose unless the .frequency argument is supplied.

- "verbose": Always verbose.

- "quiet": Always quiet.
When set to quiet, the message is not displayed and the condition is not signalled.
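For instance, a quick sketch of both mechanisms (the option is restored afterwards so the snippet is self-contained):

# Muffle a single message:
suppressMessages(inform("A message"))

# Or silence messages globally with the verbosity option:
old <- options(rlib_message_verbosity = "quiet")
inform("Not displayed and not signalled")
options(old)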
stdout and stderr
By default, abort()
and inform()
print to standard output in
interactive sessions. This allows rlang to be in control of the
appearance of messages in IDEs like RStudio.
There are two situations where messages are streamed to stderr:

- In non-interactive sessions, messages are streamed to standard error so that R scripts can easily filter them out from normal output by redirecting stderr.

- If a sink is active (either on output or on messages), messages are always streamed to stderr.
These exceptions ensure consistency of behaviour in interactive and non-interactive sessions, and when sinks are active.
See Also
Examples
# These examples are guarded to avoid throwing errors
if (FALSE) {
# Signal an error with a message just like stop():
abort("The error message.")
# Unhandled errors are saved automatically by `abort()` and can be
# retrieved with `last_error()`. The error prints with a simplified
# backtrace:
f <- function() try(g())
g <- function() evalq(h())
h <- function() abort("Tilt.")
last_error()
# Use `summary()` to print the full backtrace and the condition fields:
summary(last_error())
# Give a class to the error:
abort("The error message", "mypkg_bad_error")
# This allows callers to handle the error selectively
tryCatch(
mypkg_function(),
mypkg_bad_error = function(err) {
warn(conditionMessage(err)) # Demote the error to a warning
NA # Return an alternative value
}
)
# You can also specify metadata that will be stored in the condition:
abort("The error message.", "mypkg_bad_error", data = 1:10)
# This data can then be consulted by user handlers:
tryCatch(
mypkg_function(),
mypkg_bad_error = function(err) {
# Compute an alternative return value with the data:
recover_error(err$data)
}
)
# If you call low-level APIs it may be a good idea to create a
# chained error with the low-level error wrapped in a more
# user-friendly error. Use `try_fetch()` to fetch errors of a given
# class and rethrow them with the `parent` argument of `abort()`:
file <- "http://foo.bar/baz"
try(
try_fetch(
download(file),
error = function(err) {
msg <- sprintf("Can't download `%s`", file)
abort(msg, parent = err)
})
)
# You can also hard-code the call when it's not easy to
# forward it from the caller
f <- function() {
abort("my message", call = call("my_function"))
}
g <- function() {
f()
}
# Shows that the error occurred in `my_function()`
try(g())
}
Test for missing values
Description
are_na()
checks for missing values in a vector and is equivalent
to base::is.na()
. It is a vectorised predicate, meaning that its
output is always the same length as its input. On the other hand,
is_na()
is a scalar predicate and always returns a scalar
boolean, TRUE
or FALSE
. If its input is not scalar, it returns
FALSE
. Finally, there are typed versions that check for
particular missing types.
Usage
are_na(x)
is_na(x)
is_lgl_na(x)
is_int_na(x)
is_dbl_na(x)
is_chr_na(x)
is_cpl_na(x)
Arguments
x |
An object to test |
Details
The scalar predicates accept non-vector inputs. They are equivalent
to is_null()
in that respect. In contrast the vectorised
predicate are_na()
requires a vector input since it is defined
over vector values.
Life cycle
These functions might be moved to the vctrs package at some point. This is why they are marked as questioning.
Examples
# are_na() is vectorised and works regardless of the type
are_na(c(1, 2, NA))
are_na(c(1L, NA, 3L))
# is_na() checks for scalar input and works for all types
is_na(NA)
is_na(na_dbl)
is_na(character(0))
# There are typed versions as well:
is_lgl_na(NA)
is_lgl_na(na_dbl)
Match an argument to a character vector
Description
This is equivalent to base::match.arg()
with a few differences:
Partial matches trigger an error.
Error messages are a bit more informative and obey the tidyverse standards.
arg_match()
derives the possible values from the
caller function.
arg_match0() is a bare-bones version for use when performance is at a premium.
It requires a string as arg and explicit character values.
For convenience, arg
may also be a character vector containing
every element of values
, possibly permuted.
In this case, the first element of arg
is used.
Usage
arg_match(
arg,
values = NULL,
...,
multiple = FALSE,
error_arg = caller_arg(arg),
error_call = caller_env()
)
arg_match0(arg, values, arg_nm = caller_arg(arg), error_call = caller_env())
Arguments
arg |
A symbol referring to an argument accepting strings. |
values |
A character vector of possible values that arg can take. |
... |
These dots are for future extensions and must be empty. |
multiple |
Whether arg may contain zero or several values. |
error_arg |
An argument name as a string. This argument will be mentioned in error messages as the input that is at the origin of a problem. |
error_call |
The execution environment of a currently
running function, e.g. caller_env(). The function will be mentioned in error messages as the source of the error. See the call argument of abort() for more information. |
arg_nm |
Same as error_arg. |
Value
The string supplied to arg
.
See Also
Examples
fn <- function(x = c("foo", "bar")) arg_match(x)
fn("bar")
# Throws an informative error for mismatches:
try(fn("b"))
try(fn("baz"))
# Use the bare-bones version with explicit values for speed:
arg_match0("bar", c("foo", "bar", "baz"))
# For convenience:
fn1 <- function(x = c("bar", "baz", "foo")) fn3(x)
fn2 <- function(x = c("baz", "bar", "foo")) fn3(x)
fn3 <- function(x) arg_match0(x, c("foo", "bar", "baz"))
fn1()
fn2("bar")
try(fn3("zoo"))
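The multiple argument shown in the Usage section lets the user select several values at once; a brief sketch:

fn_multi <- function(x = c("foo", "bar", "baz")) {
  arg_match(x, multiple = TRUE)
}
fn_multi(c("foo", "baz"))
fn_multi("bar")
try(fn_multi("quux"))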
Argument type: data-masking
Description
This page describes the <data-masking>
argument modifier which
indicates that the argument uses tidy evaluation with data masking.
If you've never heard of tidy evaluation before, start with
vignette("programming", package = "dplyr")
.
Key terms
The primary motivation for tidy evaluation in tidyverse packages is that it provides data masking, which blurs the distinction between two types of variables:
- env-variables are "programming" variables and live in an environment. They are usually created with <-. Env-variables can be any type of R object.

- data-variables are "statistical" variables and live in a data frame. They usually come from data files (e.g. .csv, .xls), or are created by manipulating existing variables. Data-variables live inside data frames, so must be vectors.
General usage
Data masking allows you to refer to variables in the "current" data frame
(usually supplied in the .data
argument), without any other prefix.
It's what allows you to type (e.g.) filter(diamonds, x == 0 & y == 0 & z == 0)
instead of diamonds[diamonds$x == 0 & diamonds$y == 0 & diamonds$z == 0, ]
.
Indirection
The main challenge of data masking arises when you introduce some indirection, i.e. instead of directly typing the name of a variable you want to supply it in a function argument or character vector.
There are two main cases:
If you want the user to supply the variable (or function of variables) in a function argument, embrace the argument, e.g. filter(df, {{ var }}).

dist_summary <- function(df, var) {
  df %>%
    summarise(n = n(), min = min({{ var }}), max = max({{ var }}))
}
mtcars %>% dist_summary(mpg)
mtcars %>% group_by(cyl) %>% dist_summary(mpg)
If you have the column name as a character vector, use the .data pronoun, e.g. summarise(df, mean = mean(.data[[var]])).

for (var in names(mtcars)) {
  mtcars %>% count(.data[[var]]) %>% print()
}

lapply(names(mtcars), function(var) mtcars %>% count(.data[[var]]))

(Note that the contents of [[, e.g. var above, is never evaluated in the data environment so you don't need to worry about a data-variable called var causing problems.)
Dot-dot-dot (...)
When this modifier is applied to ...
, there is one other useful technique
which solves the problem of creating a new variable with a name supplied by
the user. Use the interpolation syntax from the glue package: "{var}" := expression
. (Note the use of :=
instead of =
to enable this syntax).
var_name <- "l100km" mtcars %>% mutate("{var_name}" := 235 / mpg)
Note that ...
automatically provides indirection, so you can use it as is
(i.e. without embracing) inside a function:
grouped_mean <- function(df, var, ...) {
  df %>%
    group_by(...) %>%
    summarise(mean = mean({{ var }}))
}
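A usage sketch for the helper above (assuming dplyr and its %>% operator are attached): the grouping variables travel through ... without embracing, while var is embraced.

mtcars %>% grouped_mean(mpg, cyl, am)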
See Also
Helper for consistent documentation of empty dots
Description
Use @inheritParams rlang::args_dots_empty
in your package
to consistently document ...
that must be empty.
Arguments
... |
These dots are for future extensions and must be empty. |
Helper for consistent documentation of used dots
Description
Use @inheritParams rlang::args_dots_used
in your package
to consistently document ...
that must be used.
Arguments
... |
Arguments passed to methods. |
Documentation anchor for error arguments
Description
Use @inheritParams rlang::args_error_context
in your package to
document arg
and call
arguments (or equivalently their prefixed
versions error_arg
and error_call
).
- arg parameters should be formatted as arguments (e.g. using cli's .arg specifier) and included in error messages. See also caller_arg().

- call parameters should be included in error conditions in a field named call. An easy way to do this is by passing a call argument to abort(). See also local_error_call().
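For instance, a minimal sketch of this pattern; the helper check_string_arg() and the calling function are hypothetical names:

check_string_arg <- function(x,
                             arg = caller_arg(x),
                             call = caller_env()) {
  if (!is_string(x)) {
    cli::cli_abort("{.arg {arg}} must be a string.", call = call)
  }
}

my_verb <- function(name) {
  check_string_arg(name)
  invisible(name)
}

# The error mentions `name` and is reported from `my_verb()`
try(my_verb(1))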
Arguments
arg |
An argument name as a string. This argument will be mentioned in error messages as the input that is at the origin of a problem. |
error_arg |
An argument name as a string. This argument will be mentioned in error messages as the input that is at the origin of a problem. |
call |
The execution environment of a currently
running function, e.g. caller_env(). The function will be mentioned in error messages as the source of the error. See the call argument of abort() for more information. |
error_call |
The execution environment of a currently
running function, e.g. caller_env(). The function will be mentioned in error messages as the source of the error. See the call argument of abort() for more information. |
Convert object to a box
Description
- as_box() boxes its input only if it is not already a box. The class is also checked if supplied.

- as_box_if() boxes its input only if it is not already a box, or if the predicate .p returns TRUE.
Usage
as_box(x, class = NULL)
as_box_if(.x, .p, .class = NULL, ...)
Arguments
x , .x |
An R object. |
class , .class |
A box class. If the input is already a box of
that class, it is returned as is. If the input needs to be boxed,
class is passed to new_box(). |
.p |
A predicate function. |
... |
Arguments passed to new_box(). |
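A few illustrative calls (see also the examples in the Box a value topic below):

as_box(1)                  # boxed
as_box(new_box(1))         # already a box, returned as is
as_box_if(NULL, is_null)   # boxed because the predicate returns TRUE
as_box_if("foo", is_null)  # returned as is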
Transform to a closure
Description
as_closure()
is like as_function()
but also wraps primitive
functions inside closures. Some special control flow primitives
like if
, for
, or break
can't be wrapped and will cause an
error.
Usage
as_closure(x, env = caller_env())
Arguments
x |
A function or formula. If a function, it is used as is. If a formula, e.g. ~ .x + 2, it is converted to a function with up to two arguments: .x (single argument) or .x and .y (two arguments). The . placeholder can be used instead of .x. If a string, the function is looked up in env. |
env |
Environment in which to fetch the function in case x is a string. |
Examples
# Primitive functions are regularised as closures
as_closure(list)
as_closure("list")
# Operators have `.x` and `.y` as arguments, just like lambda
# functions created with the formula syntax:
as_closure(`+`)
as_closure(`~`)
Create a data mask
Description
A data mask is an environment (or possibly multiple environments forming an ancestry) containing user-supplied objects. Objects in the mask have precedence over objects in the environment (i.e. they mask those objects). Many R functions evaluate quoted expressions in a data mask so these expressions can refer to objects within the user data.
These functions let you construct a tidy eval data mask manually. They are meant for developers of tidy eval interfaces rather than for end users.
Usage
as_data_mask(data)
as_data_pronoun(data)
new_data_mask(bottom, top = bottom)
Arguments
data |
A data frame or named vector of masking data. |
bottom |
The environment containing masking objects if the data mask is one environment deep. The bottom environment if the data mask comprises multiple environments. If you haven't supplied top, this must be an environment that you own, i.e. that you have created yourself. |
top |
The last environment of the data mask. If the data mask
is only one environment deep, top should be the same as bottom. This must be an environment that you own, i.e. that you have
created yourself. The parent of top is undetermined and is replaced during evaluation by eval_tidy() (see the section on the top and bottom of the data mask). |
Value
A data mask that you can supply to eval_tidy()
.
Why build a data mask?
Most of the time you can just call eval_tidy()
with a list or a
data frame and the data mask will be constructed automatically.
There are three main use cases for manual creation of data masks:
- When eval_tidy() is called with the same data in a tight loop. Because there is some overhead to creating tidy eval data masks, constructing the mask once and reusing it for subsequent evaluations may improve performance.

- When several expressions should be evaluated in the exact same environment because a quoted expression might create new objects that can be referred to in other quoted expressions evaluated at a later time. One example of this is tibble::lst() where new columns can refer to previous ones.

- When your data mask requires special features. For instance the data frame columns in dplyr data masks are implemented with active bindings.
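For instance, a minimal sketch of the first use case, reusing one mask across several evaluations:

mask <- as_data_mask(mtcars)
exprs <- list(quote(mean(cyl)), quote(mean(mpg)), quote(cyl + am))
lapply(exprs, eval_tidy, data = mask)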
Building your own data mask
Unlike base::eval()
which takes any kind of environments as data
mask, eval_tidy()
has specific requirements in order to support
quosures. For this reason you can't supply bare
environments.
There are two ways of constructing an rlang data mask manually:
- as_data_mask() transforms a list or data frame to a data mask. It automatically installs the data pronoun .data.

- new_data_mask() is a bare bones data mask constructor for environments. You can supply a bottom and a top environment in case your data mask comprises multiple environments (see section below).

  Unlike as_data_mask() it does not install the .data pronoun so you need to provide one yourself. You can provide a pronoun constructed with as_data_pronoun() or your own pronoun class.

  as_data_pronoun() will create a pronoun from a list, an environment, or an rlang data mask. In the latter case, the whole ancestry is looked up from the bottom to the top of the mask. Functions stored in the mask are bypassed by the pronoun.
Once you have built a data mask, simply pass it to eval_tidy()
as
the data
argument. You can repeat this as many times as
needed. Note that any objects created there (perhaps because of a
call to <-
) will persist in subsequent evaluations.
Top and bottom of data mask
In some cases you'll need several levels in your data mask. One good reason is when you include functions in the mask. It's a good idea to keep data objects one level lower than function objects, so that the former cannot override the definitions of the latter (see examples).
In that case, set up all your environments and keep track of the
bottom child and the top parent. You'll need to pass both to
new_data_mask()
.
Note that the parent of the top environment is completely
undetermined; you shouldn't expect it to remain the same at all
times. This parent is replaced during evaluation by eval_tidy()
to one of the following environments:
The default environment passed as the
env
argument ofeval_tidy()
.The environment of the current quosure being evaluated, if applicable.
Consequently, all masking data should be contained between the bottom and top environment of the data mask.
Examples
# Evaluating in a tidy evaluation environment enables all tidy
# features:
mask <- as_data_mask(mtcars)
eval_tidy(quo(letters), mask)
# You can install new pronouns in the mask:
mask$.pronoun <- as_data_pronoun(list(foo = "bar", baz = "bam"))
eval_tidy(quo(.pronoun$foo), mask)
# In some cases the data mask can leak to the user, for example if
# a function or formula is created in the data mask environment:
cyl <- "user variable from the context"
fn <- eval_tidy(quote(function() cyl), mask)
fn()
# If new objects are created in the mask, they persist in the
# subsequent calls:
eval_tidy(quote(new <- cyl + am), mask)
eval_tidy(quote(new * 2), mask)
# In some cases your data mask is a whole chain of environments
# rather than a single environment. You'll have to use
# `new_data_mask()` and let it know about the bottom of the mask
# (the last child of the environment chain) and the topmost parent.
# A common situation where you'll want a multiple-environment mask
# is when you include functions in your mask. In that case you'll
# put functions in the top environment and data in the bottom. This
# will prevent the data from overwriting the functions.
top <- new_environment(list(`+` = base::paste, c = base::paste))
# Let's add a middle environment just for sport:
middle <- env(top)
# And finally the bottom environment containing data:
bottom <- env(middle, a = "a", b = "b", c = "c")
# We can now create a mask by supplying the top and bottom
# environments:
mask <- new_data_mask(bottom, top = top)
# This data mask can be passed to eval_tidy() instead of a list or
# data frame:
eval_tidy(quote(a + b + c), data = mask)
# Note how the function `c()` and the object `c` are looked up
# properly because of the multi-level structure:
eval_tidy(quote(c(a, b, c)), data = mask)
# new_data_mask() does not create data pronouns, but
# data pronouns can be added manually:
mask$.fns <- as_data_pronoun(top)
# The `.data` pronoun should generally be created from the
# mask. This will ensure data is looked up throughout the whole
# ancestry. Only non-function objects are looked up from this
# pronoun:
mask$.data <- as_data_pronoun(mask)
mask$.data$c
# Now we can reference values with the pronouns:
eval_tidy(quote(c(.data$a, .data$b, .data$c)), data = mask)
Coerce to an environment
Description
as_environment()
coerces named vectors (including lists) to an
environment. The names must be unique. If supplied an unnamed
string, it returns the corresponding package environment (see
pkg_env()
).
Usage
as_environment(x, parent = NULL)
Arguments
x |
An object to coerce. |
parent |
A parent environment, empty_env() by default. |
Details
If x
is an environment and parent
is not NULL
, the
environment is duplicated before being set a new parent. The return
value is therefore a different environment than x
.
Examples
# Coerce a named vector to an environment:
env <- as_environment(mtcars)
# By default it gets the empty environment as parent:
identical(env_parent(env), empty_env())
# With strings it is a handy shortcut for pkg_env():
as_environment("base")
as_environment("rlang")
# With NULL it returns the empty environment:
as_environment(NULL)
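As noted in the Details section, supplying parent for an environment input duplicates it; a small sketch:

e <- env(a = 1)
e2 <- as_environment(e, parent = base_env())
# `e2` is a copy of `e`, not `e` itself:
identical(e, e2)
identical(env_parent(e2), base_env())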
Convert to function
Description
as_function()
transforms a one-sided formula into a function.
This powers the lambda syntax in packages like purrr.
Usage
as_function(
x,
env = global_env(),
...,
arg = caller_arg(x),
call = caller_env()
)
is_lambda(x)
Arguments
x |
A function or formula. If a function, it is used as is. If a formula, e.g. ~ .x + 2, it is converted to a function with up to two arguments: .x (single argument) or .x and .y (two arguments). The . placeholder can be used instead of .x. If a string, the function is looked up in env. |
env |
Environment in which to fetch the function in case x is a string. |
... |
These dots are for future extensions and must be empty. |
arg |
An argument name as a string. This argument will be mentioned in error messages as the input that is at the origin of a problem. |
call |
The execution environment of a currently
running function, e.g. caller_env(). The function will be mentioned in error messages as the source of the error. See the call argument of abort() for more information. |
Examples
f <- as_function(~ .x + 1)
f(10)
g <- as_function(~ -1 * .)
g(4)
h <- as_function(~ .x - .y)
h(6, 3)
# Functions created from a formula have a special class:
is_lambda(f)
is_lambda(as_function(function() "foo"))
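Strings are looked up in env (see the x argument above); a quick illustration:

mean2 <- as_function("mean")
mean2(1:10)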
Create a default name for an R object
Description
as_label()
transforms R objects into a short, human-readable
description. You can use labels to:
Display an object in a concise way, for example to label axes in a graphical plot.
Give default names to columns in a data frame. In this case, labelling is the first step before name repair.
See also as_name()
for transforming symbols back to a
string. Unlike as_label()
, as_name()
is a well defined
operation that guarantees the roundtrip symbol -> string ->
symbol.
In general, if you don't know for sure what kind of object you're
dealing with (a call, a symbol, an unquoted constant), use
as_label()
and make no assumption about the resulting string. If
you know you have a symbol and need the name of the object it
refers to, use as_name()
. For instance, use as_label()
with
objects captured with enquo()
and as_name()
with symbols
captured with ensym()
.
Usage
as_label(x)
Arguments
x |
An object. |
Transformation to string
Quosures are squashed before being labelled.
Symbols are transformed to string with
as_string()
.Calls are abbreviated.
Numbers are represented as such.
Other constants are represented by their type, such as
<dbl>
or<data.frame>
.
See Also
as_name()
for transforming symbols back to a string
deterministically.
Examples
# as_label() is useful with quoted expressions:
as_label(expr(foo(bar)))
as_label(expr(foobar))
# It works with any R object. This is also useful for quoted
# arguments because the user might unquote constant objects:
as_label(1:3)
as_label(base::list)
Extract names from symbols
Description
as_name()
converts symbols to character strings. The
conversion is deterministic. That is, the roundtrip symbol -> name -> symbol
always gives the same result.
Use
as_name()
when you need to transform a symbol to a string to refer to an object by its name.Use
as_label()
when you need to transform any kind of object to a string to represent that object with a short description.
Usage
as_name(x)
Arguments
x |
A string or symbol, possibly wrapped in a quosure. If a string, the attributes are removed, if any. |
Details
rlang::as_name()
is the opposite of base::as.name()
. If
you're writing base R code, we recommend using base::as.symbol()
which is an alias of as.name()
that follows a more modern
terminology (R types instead of S modes).
Value
A character vector of length 1.
See Also
as_label()
for converting any object to a single string
suitable as a label. as_string()
for a lower-level version that
doesn't unwrap quosures.
Examples
# Let's create some symbols:
foo <- quote(foo)
bar <- sym("bar")
# as_name() converts symbols to strings:
foo
as_name(foo)
typeof(bar)
typeof(as_name(bar))
# as_name() unwraps quosured symbols automatically:
as_name(quo(foo))
Cast symbol to string
Description
as_string()
converts symbols to character strings.
Usage
as_string(x)
Arguments
x |
A string or symbol. If a string, the attributes are removed, if any. |
Value
A character vector of length 1.
Unicode tags
Unlike base::as.symbol()
and base::as.name()
, as_string()
automatically transforms unicode tags such as "<U+5E78>"
to the
proper UTF-8 character. This is important on Windows because:
R on Windows has no UTF-8 support, and uses native encoding instead.
The native encodings do not cover all Unicode characters. For example, Western encodings do not support CJK characters.
When a lossy UTF-8 -> native transformation occurs, uncovered characters are transformed to an ASCII unicode tag like
"<U+5E78>"
.Symbols are always encoded in native. This means that transforming the column names of a data frame to symbols might be a lossy operation.
This operation is very common in the tidyverse because of data masking APIs like dplyr where data frames are transformed to environments. While the names of a data frame are stored as a character vector, the bindings of environments are stored as symbols.
Because it reencodes the ASCII unicode tags to their UTF-8
representation, the string -> symbol -> string roundtrip is
more stable with as_string()
.
See Also
as_name()
for a higher-level variant of as_string()
that automatically unwraps quosures.
Examples
# Let's create some symbols:
foo <- quote(foo)
bar <- sym("bar")
# as_string() converts symbols to strings:
foo
as_string(foo)
typeof(bar)
typeof(as_string(bar))
Coerce to a character vector and attempt encoding conversion
Description
Unlike specifying the encoding
argument in as_string()
and
as_character()
, which is only declarative, these functions
actually attempt to convert the encoding of their input. There are
two possible cases:
- The string is tagged as UTF-8 or latin1, the only two encodings for which R has specific support. In this case, converting to the same encoding is a no-op, and converting to native always works as expected, as long as the native encoding (the one specified by the LC_CTYPE locale) has support for all characters occurring in the strings. Unrepresentable characters are serialised as unicode points: "<U+xxxx>".

- The string is not tagged. R assumes that it is encoded in the native encoding. Conversion to native is a no-op, and conversion to UTF-8 should work as long as the string is actually encoded in the locale codeset.
When translating to UTF-8, the strings are parsed for serialised
unicode points (e.g. strings looking like "U+xxxx") with
chr_unserialise_unicode()
. This helps to alleviate the effects of
character-to-symbol-to-character roundtrips on systems with
non-UTF-8 native encoding.
Usage
as_utf8_character(x)
Arguments
x |
An object to coerce. |
Examples
# Let's create a string marked as UTF-8 (which is guaranteed by the
# Unicode escaping in the string):
utf8 <- "caf\uE9"
Encoding(utf8)
charToRaw(utf8)
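Serialised unicode points are also converted, as described above (compare with chr_unserialise_unicode()):

as_utf8_character("<U+5E78>")
identical(as_utf8_character("<U+5E78>"), "\u5e78")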
Bare type predicates
Description
These predicates check for a given type but only return TRUE
for
bare R objects. Bare objects have no class attributes. For example,
a data frame is a list, but not a bare list.
Usage
is_bare_list(x, n = NULL)
is_bare_atomic(x, n = NULL)
is_bare_vector(x, n = NULL)
is_bare_double(x, n = NULL)
is_bare_complex(x, n = NULL)
is_bare_integer(x, n = NULL)
is_bare_numeric(x, n = NULL)
is_bare_character(x, n = NULL)
is_bare_logical(x, n = NULL)
is_bare_raw(x, n = NULL)
is_bare_string(x, n = NULL)
is_bare_bytes(x, n = NULL)
Arguments
x |
Object to be tested. |
n |
Expected length of a vector. |
Details
- The predicates for vectors include the n argument for pattern-matching on the vector length.

- Like is_atomic() and unlike base R is.atomic() for R < 4.4.0, is_bare_atomic() does not return TRUE for NULL. Starting in R 4.4.0, is.atomic(NULL) returns FALSE.

- Unlike base R is.numeric(), is_bare_double() only returns TRUE for floating point numbers.
See Also
type-predicates, scalar-type-predicates
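A few examples contrasting bare and classed objects:

is_bare_list(list(1, 2))      # TRUE
is_bare_list(mtcars)          # FALSE: data frames carry a class
is_bare_double(1)             # TRUE
is_bare_double(1L)            # FALSE: integer, not double
is_bare_integer(1:3, n = 3)   # TRUE: `n` pattern-matches the length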
Box a value
Description
new_box()
is similar to base::I()
but it protects a value by
wrapping it in a scalar list rather than by adding an attribute.
unbox()
retrieves the boxed value. is_box()
tests whether an
object is boxed with optional class. as_box()
ensures that a
value is wrapped in a box. as_box_if()
does the same but only if
the value matches a predicate.
Usage
new_box(.x, class = NULL, ...)
is_box(x, class = NULL)
unbox(box)
Arguments
class |
For new_box(), an additional class for the box. For is_box(), an optional class to test the box against. |
... |
Additional attributes passed to base::structure(). |
x , .x |
An R object. |
box |
A boxed value to unbox. |
Examples
boxed <- new_box(letters, "mybox")
is_box(boxed)
is_box(boxed, "mybox")
is_box(boxed, "otherbox")
unbox(boxed)
# as_box() avoids double-boxing:
boxed2 <- as_box(boxed, "mybox")
boxed2
unbox(boxed2)
# Compare to:
boxed_boxed <- new_box(boxed, "mybox")
boxed_boxed
unbox(unbox(boxed_boxed))
# Use `as_box_if()` with a predicate if you need to ensure a box
# only for a subset of values:
as_box_if(NULL, is_null, "null_box")
as_box_if("foo", is_null, "null_box")
Human readable memory sizes
Description
Construct, manipulate and display vectors of byte sizes. These are numeric vectors, so you can compare them numerically, but they can also be compared to human readable values such as '10MB'.
- parse_bytes() takes a character vector of human-readable bytes and returns a structured bytes vector.

- as_bytes() is a generic conversion function for objects representing bytes.
Note: A bytes()
constructor will be exported soon.
Usage
as_bytes(x)
parse_bytes(x)
Arguments
x |
A numeric or character vector. Character representations can use shorthand sizes (see examples). |
Details
These memory sizes are always assumed to be base 1000, rather than 1024.
Examples
parse_bytes("1")
parse_bytes("1K")
parse_bytes("1Kb")
parse_bytes("1KiB")
parse_bytes("1MB")
parse_bytes("1KB") < "1MB"
sum(parse_bytes(c("1MB", "5MB", "500KB")))
Extract arguments from a call
Description
Extract arguments from a call
Usage
call_args(call)
call_args_names(call)
Arguments
call |
A defused call. |
Value
A named list of arguments.
See Also
Examples
call <- quote(f(a, b))
# Subsetting a call returns the arguments converted to a language
# object:
call[-1]
# On the other hand, call_args() returns a regular list that is
# often easier to work with:
str(call_args(call))
# When the arguments are unnamed, a vector of empty strings is
# supplied (rather than NULL):
call_args_names(call)
Extract function from a call
Description
call_fn() extracts the function of a call. If the function is not inlined in the call, it is looked up in env.
Usage
call_fn(call, env = caller_env())
Arguments
call , env |
A defused call and the environment in which to look up its function. |
Inspect a call
Description
This function is a wrapper around base::match.call()
. It returns
its own function call.
Usage
call_inspect(...)
Arguments
... |
Arguments to display in the returned call. |
Examples
# When you call it directly, it simply returns what you typed
call_inspect(foo(bar), "" %>% identity())
# Pass `call_inspect` to functionals like `lapply()` or `map()` to
# inspect the calls they create around the supplied function
lapply(1:3, call_inspect)
Match supplied arguments to function definition
Description
call_match()
is like match.call()
with these differences:
It supports matching missing arguments to their defaults in the function definition.
It requires you to be a little more specific in some cases. Either all arguments are inferred from the call stack or none of them are (see the Inference section).
Usage
call_match(
call = NULL,
fn = NULL,
...,
defaults = FALSE,
dots_env = NULL,
dots_expand = TRUE
)
Arguments
call |
A call. The arguments will be matched to fn. |
fn |
A function definition to match arguments to. |
... |
These dots must be empty. |
defaults |
Whether to match missing arguments to their defaults. |
dots_env |
An execution environment where to find dots. If
supplied and dots exist in this environment, and if call includes ..., the dots are matched from this environment (see also dots_expand). |
dots_expand |
If FALSE, arguments passed through ... are returned as a single ... list argument rather than being spliced into the matched call. Note that the resulting call is not meant to be evaluated since R
does not support passing dots through a named argument, even if
named "...". |
Inference from the call stack
When call
is not supplied, it is inferred from the call stack
along with fn
and dots_env
.
-
call
andfn
are inferred from the calling environment:sys.call(sys.parent())
andsys.function(sys.parent())
. -
dots_env
is inferred from the caller of the calling environment:caller_env(2)
.
If call
is supplied, then you must supply fn
as well. Also
consider supplying dots_env
as it is set to the empty environment
when not inferred.
Examples
# `call_match()` supports matching missing arguments to their
# defaults
fn <- function(x = "default") fn
call_match(quote(fn()), fn)
call_match(quote(fn()), fn, defaults = TRUE)
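When call and fn are omitted inside a function, they are inferred from the call stack (see the Inference section); a sketch:

wrapper <- function(x, y = 2) call_match(defaults = TRUE)
wrapper(1)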
Modify the arguments of a call
Description
If you are working with a user-supplied call, make sure the
arguments are standardised with call_match()
before
modifying the call.
Usage
call_modify(
.call,
...,
.homonyms = c("keep", "first", "last", "error"),
.standardise = NULL,
.env = caller_env()
)
Arguments
.call |
Can be a call, a formula quoting a call in the right-hand side, or a frame object from which to extract the call expression. |
... |
<dynamic> Named or unnamed expressions
(constants, names or calls) used to modify the call. Use zap() to remove arguments. |
.homonyms |
How to treat arguments with the same name. The
default, "keep", preserves all of them. Use "first" or "last" to keep only the first or last occurrence of arguments with duplicated names, or "error" to throw an informative error. |
.standardise , .env |
Deprecated as of rlang 0.3.0. Please
call call_match() beforehand instead. |
Value
A quosure if .call
is a quosure, a call otherwise.
Examples
call <- quote(mean(x, na.rm = TRUE))
# Modify an existing argument
call_modify(call, na.rm = FALSE)
call_modify(call, x = quote(y))
# Remove an argument
call_modify(call, na.rm = zap())
# Add a new argument
call_modify(call, trim = 0.1)
# Add an explicit missing argument:
call_modify(call, na.rm = )
# Supply a list of new arguments with `!!!`
newargs <- list(na.rm = NULL, trim = 0.1)
call <- call_modify(call, !!!newargs)
call
# Remove multiple arguments by splicing zaps:
newargs <- rep_named(c("na.rm", "trim"), list(zap()))
call <- call_modify(call, !!!newargs)
call
# Modify the `...` arguments as if it were a named argument:
call <- call_modify(call, ... = )
call
call <- call_modify(call, ... = zap())
call
# When you're working with a user-supplied call, standardise it
# beforehand in case it includes unmatched arguments:
user_call <- quote(matrix(x, nc = 3))
call_modify(user_call, ncol = 1)
# `call_match()` applies R's argument matching rules. Matching
# ensures you're modifying the intended argument.
user_call <- call_match(user_call, matrix)
user_call
call_modify(user_call, ncol = 1)
# By default, arguments with the same name are kept. This has
# subtle implications, for instance you can move an argument to
# last position by removing it and remapping it:
call <- quote(foo(bar = , baz))
call_modify(call, bar = NULL, bar = missing_arg())
# You can also choose to keep only the first or last homonym
# arguments:
args <- list(bar = NULL, bar = missing_arg())
call_modify(call, !!!args, .homonyms = "first")
call_modify(call, !!!args, .homonyms = "last")
Extract function name or namespace of a call
Description
call_name()
and call_ns()
extract the function name or
namespace of simple calls as a string. They return NULL
for
complex calls.
Simple calls:
foo()
,bar::foo()
.Complex calls:
foo()()
,bar::foo
,foo$bar()
,(function() NULL)()
.
The is_call_simple()
predicate helps you determine whether a call
is simple. There are two invariants you can count on:
If
is_call_simple(x)
returnsTRUE
,call_name(x)
returns a string. Otherwise it returnsNULL
.If
is_call_simple(x, ns = TRUE)
returnsTRUE
,call_ns()
returns a string. Otherwise it returnsNULL
.
Usage
call_name(call)
call_ns(call)
is_call_simple(x, ns = NULL)
Arguments
call |
A defused call. |
x |
An object to test. |
ns |
Whether call is namespaced. If NULL, is_call_simple() is insensitive to namespaces. If TRUE, it only detects namespaced calls. If FALSE, it only detects unnamespaced calls. |
Value
The function name or namespace as a string, or NULL
if
the call is not named or namespaced.
Examples
# Is the function named?
is_call_simple(quote(foo()))
is_call_simple(quote(foo[[1]]()))
# Is the function namespaced?
is_call_simple(quote(list()), ns = TRUE)
is_call_simple(quote(base::list()), ns = TRUE)
# Extract the function name from quoted calls:
call_name(quote(foo(bar)))
call_name(quo(foo(bar)))
# Namespaced calls are correctly handled:
call_name(quote(base::matrix(baz)))
# Anonymous and subsetted functions return NULL:
call_name(quote(foo$bar()))
call_name(quote(foo[[bar]]()))
call_name(quote(foo()()))
# Extract namespace of a call with call_ns():
call_ns(quote(base::bar()))
# If not namespaced, call_ns() returns NULL:
call_ns(quote(bar()))
Standardise a call
Description
Deprecated in rlang 0.4.11 in favour of call_match()
.
call_standardise()
was designed for call wrappers that include an
environment like formulas or quosures. The function definition was
plucked from that environment. However in practice it is rare to
use it with wrapped calls, and then it's easy to forget to supply
the environment. For these reasons, we have designed call_match()
as a simpler wrapper around match.call()
.
This is essentially equivalent to base::match.call()
, but with
experimental handling of primitive functions.
Usage
call_standardise(call, env = caller_env())
Arguments
call , env |
Value
A quosure if call
is a quosure, a raw call otherwise.
Create a call
Description
Quoted function calls are one of the two types of symbolic objects in R. They represent the action of calling a function, possibly with arguments. There are two ways of creating a quoted call:
By quoting it. Quoting prevents functions from being called. Instead, you get the description of the function call as an R object. That is, a quoted function call.
By constructing it with
base::call()
,base::as.call()
, orcall2()
. In this case, you pass the call elements (the function to call and the arguments to call it with) separately.
See section below for the difference between call2()
and the base
constructors.
Usage
call2(.fn, ..., .ns = NULL)
Arguments
.fn |
Function to call. Must be a callable object: a string, symbol, call, or a function. |
... |
<dynamic> Arguments for the function call. Empty arguments are preserved. |
.ns |
Namespace with which to prefix .fn. Must be a string or symbol. |
Difference with base constructors
call2() is more flexible than base::call():

- The function to call can be a string or a callable object: a symbol, another call (e.g. a $ or [[ call), or a function to inline. base::call() only supports strings and you need to use base::as.call() to construct a call with a callable object.

  call2(list, 1, 2)
  as.call(list(list, 1, 2))

- The .ns argument is convenient for creating namespaced calls.

  call2("list", 1, 2, .ns = "base")

  # Equivalent to
  ns_call <- call("::", as.symbol("base"), as.symbol("list"))
  as.call(list(ns_call, 1, 2))

- call2() has dynamic dots support. You can splice lists of arguments with !!! or unquote an argument name with glue syntax.

  args <- list(na.rm = TRUE, trim = 0)
  call2("mean", 1:10, !!!args)

  # Equivalent to
  as.call(c(list(as.symbol("mean"), 1:10), args))
Caveats of inlining objects in calls
call2()
makes it possible to inline objects in calls, both in
function and argument positions. Inlining an object or a function
has the advantage that the correct object is used in all
environments. If all components of the code are inlined, you can
even evaluate in the empty environment.
However inlining also has drawbacks. It can cause issues with NSE
functions that expect symbolic arguments. The objects may also leak
in representations of the call stack, such as traceback()
.
See Also
Examples
# fn can either be a string, a symbol or a call
call2("f", a = 1)
call2(quote(f), a = 1)
call2(quote(f()), a = 1)
# Can supply arguments individually or in a list
call2(quote(f), a = 1, b = 2)
call2(quote(f), !!!list(a = 1, b = 2))
# Creating namespaced calls is easy:
call2("fun", arg = quote(baz), .ns = "mypkg")
# Empty arguments are preserved:
call2("[", quote(x), , drop = )
Find the caller argument for error messages
Description
caller_arg()
is a variant of substitute()
or ensym()
for
arguments that reference other arguments. Unlike substitute()
which returns an expression, caller_arg()
formats the expression
as a single line string which can be included in error messages.
- When included in an error message, the resulting label should generally be formatted as an argument, for instance using the .arg specifier from the cli package.

- Use @inheritParams rlang::args_error_context to document an arg or error_arg argument that takes caller_arg() as default.
Arguments
arg |
An argument name in the current function. |
Examples
arg_checker <- function(x, arg = caller_arg(x), call = caller_env()) {
cli::cli_abort("{.arg {arg}} must be a thingy.", arg = arg, call = call)
}
my_function <- function(my_arg) {
arg_checker(my_arg)
}
try(my_function(NULL))
Catch a condition
Description
This is a small wrapper around tryCatch()
that captures any
condition signalled while evaluating its argument. It is useful for
situations where you expect a specific condition to be signalled,
for debugging, and for unit testing.
Usage
catch_cnd(expr, classes = "condition")
Arguments
expr |
Expression to be evaluated with a catching condition handler. |
classes |
A character vector of condition classes to catch. By default, catches all conditions. |
Value
A condition if any was signalled, NULL
otherwise.
Examples
catch_cnd(10)
catch_cnd(abort("an error"))
catch_cnd(signal("my_condition", message = "a condition"))
Check that dots are empty
Description
...
can be inserted in a function signature to force users to
fully name the details arguments. In this case, supplying data in
...
is almost always a programming error. This function checks
that ...
is empty and fails otherwise.
Usage
check_dots_empty(
env = caller_env(),
error = NULL,
call = caller_env(),
action = abort
)
Arguments
env |
Environment in which to look for the ... arguments. |
error |
An optional error handler passed to try_fetch(). Use this e.g. to demote an error into a warning. |
call |
The execution environment of a currently
running function, e.g. caller_env(). The function will be mentioned in error messages as the source of the error. See the call argument of abort() for more information. |
action |
Details
In packages, document ...
with this standard tag:
@inheritParams rlang::args_dots_empty
See Also
Other dots checking functions:
check_dots_unnamed()
,
check_dots_used()
Examples
f <- function(x, ..., foofy = 8) {
check_dots_empty()
x + foofy
}
# This fails because `foofy` can't be matched positionally
try(f(1, 4))
# This fails because `foofy` can't be matched partially by name
try(f(1, foof = 4))
# Thanks to `...`, it must be matched exactly
f(1, foofy = 4)
Check that dots are empty (low level variant)
Description
check_dots_empty0()
is a more efficient version of
check_dots_empty()
with a slightly different interface. Instead
of inspecting the current environment for dots, it directly takes
...
. It is only meant for very low level functions where a
couple microseconds make a difference.
Usage
check_dots_empty0(..., call = caller_env())
Arguments
... |
Dots which should be empty. |
Check that all dots are unnamed
Description
In functions like paste()
, named arguments in ...
are often a
sign of misspelled argument names. Call check_dots_unnamed()
to
fail with an error when named arguments are detected.
Usage
check_dots_unnamed(
env = caller_env(),
error = NULL,
call = caller_env(),
action = abort
)
Arguments
env |
Environment in which to look for the ... arguments. |
error |
An optional error handler passed to try_fetch(). Use this e.g. to demote an error into a warning. |
call |
The execution environment of a currently
running function, e.g. caller_env(). The function will be mentioned in error messages as the source of the error. See the call argument of abort() for more information. |
action |
See Also
Other dots checking functions:
check_dots_empty()
,
check_dots_used()
Examples
f <- function(..., foofy = 8) {
check_dots_unnamed()
c(...)
}
f(1, 2, 3, foofy = 4)
try(f(1, 2, 3, foof = 4))
Check that all dots have been used
Description
When ...
arguments are passed to methods, it is assumed the
methods will match and use these arguments. If this isn't the case,
this often indicates a programming error. Call check_dots_used()
to fail with an error when unused arguments are detected.
Usage
check_dots_used(
env = caller_env(),
call = caller_env(),
error = NULL,
action = deprecated()
)
Arguments
env |
Environment in which to look for the ... arguments. |
call |
The execution environment of a currently
running function, e.g. caller_env(). The function will be mentioned in error messages as the source of the error. See the call argument of abort() for more information. |
error |
An optional error handler passed to try_fetch(). Use this e.g. to demote an error into a warning. |
action |
Details
In packages, document ...
with this standard tag:
@inheritParams rlang::args_dots_used
check_dots_used()
implicitly calls on.exit()
to check that all
elements of ...
have been used when the function exits. If you
use on.exit()
elsewhere in your function, make sure to use add = TRUE
so that you don't override the handler set up by
check_dots_used()
.
See Also
Other dots checking functions:
check_dots_empty()
,
check_dots_unnamed()
Examples
f <- function(...) {
check_dots_used()
g(...)
}
g <- function(x, y, ...) {
x + y
}
f(x = 1, y = 2)
try(f(x = 1, y = 2, z = 3))
try(f(x = 1, y = 2, 3, 4, 5))
# Use an `error` handler to handle the error differently.
# For instance to demote the error to a warning:
fn <- function(...) {
check_dots_empty(
error = function(cnd) {
warning(cnd)
}
)
"out"
}
fn()
Check that arguments are mutually exclusive
Description
check_exclusive()
checks that only one argument is supplied out of
a set of mutually exclusive arguments. An informative error is
thrown if multiple arguments are supplied.
Usage
check_exclusive(..., .require = TRUE, .frame = caller_env(), .call = .frame)
Arguments
... |
Function arguments. |
.require |
Whether at least one argument must be supplied. |
.frame |
Environment where the arguments in ... are defined. |
.call |
The execution environment of a currently
running function, e.g. caller_env(). The function will be mentioned in error messages as the source of the error. See the call argument of abort() for more information. |
Value
The supplied argument name as a string. If .require
is
FALSE
and no argument is supplied, the empty string ""
is
returned.
Examples
f <- function(x, y) {
switch(
check_exclusive(x, y),
x = message("`x` was supplied."),
y = message("`y` was supplied.")
)
}
# Supplying zero or multiple arguments is forbidden
try(f())
try(f(NULL, NULL))
# The user must supply one of the mutually exclusive arguments
f(NULL)
f(y = NULL)
# With `.require` you can allow zero arguments
f <- function(x, y) {
switch(
check_exclusive(x, y, .require = FALSE),
x = message("`x` was supplied."),
y = message("`y` was supplied."),
message("No arguments were supplied")
)
}
f()
Check that argument is supplied
Description
Throws an error if x
is missing.
Usage
check_required(x, arg = caller_arg(x), call = caller_env())
Arguments
x |
A function argument. Must be a symbol. |
arg |
An argument name as a string. This argument will be mentioned in error messages as the input that is at the origin of a problem. |
call |
The execution environment of a currently
running function, e.g. caller_env(). The function will be mentioned in error messages as the source of the error. See the call argument of abort() for more information. |
See Also
Examples
f <- function(x) {
check_required(x)
}
# Fails because `x` is not supplied
try(f())
# Succeeds
f(NULL)
Create a child environment
Description
env()
now supports creating child environments, please use it
instead.
Usage
child_env(.parent, ...)
Translate unicode points to UTF-8
Description
For historical reasons, R translates strings to the native encoding
when they are converted to symbols. This string-to-symbol
conversion is not a rare occurrence and happens for instance to the
names of a list of arguments converted to a call by do.call()
.
If the string contains unicode characters that cannot be
represented in the native encoding, R serialises those as an ASCII
sequence representing the unicode point. This is why Windows users
with western locales often see strings looking like <U+xxxx>
. To
alleviate some of the pain, rlang parses strings and looks for
serialised unicode points to translate them back to the proper
UTF-8 representation. This transformation occurs automatically in
functions like env_names()
and can be manually triggered with
as_utf8_character()
and chr_unserialise_unicode()
.
Usage
chr_unserialise_unicode(chr)
Arguments
chr |
A character vector. |
Life cycle
This function is experimental.
Examples
ascii <- "<U+5E78>"
chr_unserialise_unicode(ascii)
identical(chr_unserialise_unicode(ascii), "\u5e78")
Create a condition object
Description
These constructors create subclassed conditions, the objects that power the error, warning, and message system in R.
- cnd() creates bare conditions that only inherit from condition.

- Conditions created with error_cnd(), warning_cnd(), and message_cnd() inherit from "error", "warning", or "message".

- error_cnd() creates subclassed errors. See "rlang_error".
Use cnd_signal()
to emit the relevant signal for a particular
condition class.
Usage
cnd(class, ..., message = "", call = NULL, use_cli_format = NULL)
error_cnd(
class = NULL,
...,
message = "",
call = NULL,
trace = NULL,
parent = NULL,
use_cli_format = NULL
)
warning_cnd(
class = NULL,
...,
message = "",
call = NULL,
use_cli_format = NULL
)
message_cnd(
class = NULL,
...,
message = "",
call = NULL,
use_cli_format = NULL
)
Arguments
class |
The condition subclass. |
... |
<dynamic> Named data fields stored inside the condition object. |
message |
A default message to inform the user about the condition when it is signalled. |
call |
A function call to be included in the error message. If an execution environment of a running function, the corresponding function call is retrieved. |
use_cli_format |
Whether to use the cli package to format
message. |
trace |
A trace object created by trace_back(). |
parent |
A parent condition object. |
See Also
Examples
# Create a condition inheriting only from the S3 class "foo":
cnd <- cnd("foo")
# Signal the condition to potential handlers. Since this is a bare
# condition the signal has no effect if no handlers are set up:
cnd_signal(cnd)
# When a relevant handler is set up, the signal transfers control
# to the handler
with_handlers(cnd_signal(cnd), foo = function(c) "caught!")
tryCatch(cnd_signal(cnd), foo = function(c) "caught!")
Does a condition or its ancestors inherit from a class?
Description
Like any R objects, errors captured with catchers like tryCatch()
have a class()
which you can test with inherits()
. However,
with chained errors, the class of a captured error might be
different than the error that was originally signalled. Use
cnd_inherits()
to detect whether an error or any of its parents
inherits from a class.
Whereas inherits()
tells you whether an object is a particular
kind of error, cnd_inherits()
answers the question whether an
object is a particular kind of error or has been caused by such an
error.
Some chained conditions carry parents that are not inherited. See
the .inherit
argument of abort()
, warn()
, and inform()
.
Usage
cnd_inherits(cnd, class)
Arguments
cnd |
A condition to test. |
class |
A class passed to inherits(). |
Capture an error with cnd_inherits()
Error catchers like tryCatch()
and try_fetch()
can only match
the class of a condition, not the class of its parents. To match a
class across the ancestry of an error, you'll need a bit of
craftiness.
Ancestry matching can't be done with tryCatch()
at all so you'll
need to switch to withCallingHandlers()
. Alternatively, you can
use the experimental rlang function try_fetch()
which is able to
perform the roles of both tryCatch()
and withCallingHandlers()
.
withCallingHandlers()
Unlike tryCatch()
, withCallingHandlers()
does not capture an
error. If you don't explicitly jump with an error or a value
throw, nothing happens.
Since we don't want to throw an error, we'll throw a value using
callCC()
:
f <- function() {
  parent <- error_cnd("bar", message = "Bar")
  abort("Foo", parent = parent)
}

cnd <- callCC(function(throw) {
  withCallingHandlers(
    f(),
    error = function(x) if (cnd_inherits(x, "bar")) throw(x)
  )
})

class(cnd)
#> [1] "rlang_error" "error"       "condition"

class(cnd$parent)
#> [1] "bar"         "rlang_error" "error"       "condition"
try_fetch()
This pattern is easier with try_fetch()
. Like
withCallingHandlers()
, it doesn't capture a matching error right
away. Instead, it captures it only if the handler doesn't return a
zap()
value.
cnd <- try_fetch(
  f(),
  error = function(x) if (cnd_inherits(x, "bar")) x else zap()
)

class(cnd)
#> [1] "rlang_error" "error"       "condition"

class(cnd$parent)
#> [1] "bar"         "rlang_error" "error"       "condition"
Note that try_fetch()
uses cnd_inherits()
internally. This
makes it very easy to match a parent condition:
cnd <- try_fetch(
  f(),
  bar = function(x) x
)

# This is the parent
class(cnd)
#> [1] "bar"         "rlang_error" "error"       "condition"
Build an error message from parts
Description
cnd_message() assembles an error message from three generics:

- cnd_header()

- cnd_body()

- cnd_footer()

Methods for these generics must return a character vector. The
elements are combined into a single string with a newline
separator. Bullets syntax is supported, either through rlang (see
format_error_bullets()), or through cli if the condition has
use_cli_format set to TRUE.
The default method for the error header returns the message
field
of the condition object. The default methods for the body and
footer return the body
and footer
fields if any, or empty
character vectors otherwise.
cnd_message()
is automatically called by the conditionMessage()
for rlang errors, warnings, and messages. Error classes created
with abort()
only need to implement header, body or footer
methods. This provides a lot of flexibility for hierarchies of
error classes, for instance you could inherit the body of an error
message from a parent class while overriding the header and footer.
Usage
cnd_message(cnd, ..., inherit = TRUE, prefix = FALSE)
cnd_header(cnd, ...)
cnd_body(cnd, ...)
cnd_footer(cnd, ...)
Arguments
cnd |
A condition object. |
... |
Arguments passed to methods. |
inherit |
Whether to include parent messages. Parent messages
are printed with a "Caused by error:" prefix, even if |
prefix |
Whether to print the full message, including the
condition prefix ( |
Overriding header, body, and footer methods
Sometimes the contents of an error message depends on the state of
your checking routine. In that case, it can be tricky to lazily
generate error messages with cnd_header()
, cnd_body()
, and
cnd_footer()
: you have the choice between overspecifying your
error class hierarchies with one class per state, or replicating
the type-checking control flow within the cnd_body()
method. Neither of these options is ideal.
A better option is to define header
, body
, or footer
fields
in your condition object. These can be a static string, a
lambda-formula, or a function with the same
signature as cnd_header()
, cnd_body()
, or cnd_footer()
. These
fields override the message generics and make it easy to generate
an error message tailored to the state in which the error was
constructed.
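For instance, here is a rough sketch (the check_positive() helper and the "my_error" class are made up for illustration) that stores a function in the body field so the bullet is computed from the state at error time:
check_positive <- function(x) {
  if (x <= 0) {
    abort(
      "`x` must be positive.",
      class = "my_error",
      body = function(cnd, ...) {
        # The field closes over the state available when the error was created
        c("i" = paste0("`x` was ", x, "."))
      }
    )
  }
  invisible(x)
}
try(check_positive(-1))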
Muffle a condition
Description
Unlike exiting()
handlers, calling()
handlers must be explicit
that they have handled a condition to stop it from propagating to
other handlers. Use cnd_muffle()
within a calling handler (or as
a calling handler, see examples) to prevent any other handlers from
being called for that condition.
Usage
cnd_muffle(cnd)
Arguments
cnd |
A condition to muffle. |
Value
If cnd
is mufflable, cnd_muffle()
jumps to the muffle
restart and doesn't return. Otherwise, it returns FALSE
.
Mufflable conditions
Most conditions signalled by base R are mufflable, although the name of the restart varies. cnd_muffle() will automatically call the correct restart for you. It is compatible with the following conditions:
-
warning
andmessage
conditions. In this casecnd_muffle()
is equivalent tobase::suppressMessages()
andbase::suppressWarnings()
. Bare conditions signalled with
signal()
orcnd_signal()
. Note that conditions signalled withbase::signalCondition()
are not mufflable.Interrupts are sometimes signalled with a
resume
restart on recent R versions. When this is the case, you can muffle the interrupt withcnd_muffle()
. Check if a restart is available withbase::findRestart("resume")
.
If you call cnd_muffle()
with a condition that is not mufflable
you will cause a new error to be signalled.
Errors are not mufflable since they are signalled in critical situations where execution cannot continue safely.
Conditions captured with
base::tryCatch()
,with_handlers()
orcatch_cnd()
are no longer mufflable. Muffling restarts must be called from a calling handler.
Examples
fn <- function() {
inform("Beware!", "my_particular_msg")
inform("On your guard!")
"foobar"
}
# Let's install a muffling handler for the condition thrown by `fn()`.
# This will suppress all `my_particular_msg` messages but let other
# types of messages go through:
with_handlers(fn(),
my_particular_msg = calling(function(cnd) {
inform("Dealt with this particular message")
cnd_muffle(cnd)
})
)
# Note how execution of `fn()` continued normally after dealing
# with that particular message.
# cnd_muffle() can also be passed to with_handlers() as a calling
# handler:
with_handlers(fn(),
my_particular_msg = calling(cnd_muffle)
)
Signal a condition object
Description
cnd_signal()
takes a condition as argument and emits the
corresponding signal. The type of signal depends on the class of
the condition:
A message is signalled if the condition inherits from
"message"
. This is equivalent to signalling withinform()
orbase::message()
.A warning is signalled if the condition inherits from
"warning"
. This is equivalent to signalling withwarn()
orbase::warning()
.An error is signalled if the condition inherits from
"error"
. This is equivalent to signalling withabort()
orbase::stop()
.An interrupt is signalled if the condition inherits from
"interrupt"
. This is equivalent to signalling withinterrupt()
.
Usage
cnd_signal(cnd, ...)
Arguments
cnd |
A condition object (see |
... |
These dots are for future extensions and must be empty. |
See Also
-
cnd_type()
to determine the type of a condition. -
abort()
,warn()
andinform()
for creating and signalling structured R conditions in one go. -
try_fetch()
for establishing condition handlers for particular condition classes.
Examples
# The type of signal depends on the class. If the condition
# inherits from "warning", a warning is issued:
cnd <- warning_cnd("my_warning_class", message = "This is a warning")
cnd_signal(cnd)
# If it inherits from "error", an error is raised:
cnd <- error_cnd("my_error_class", message = "This is an error")
try(cnd_signal(cnd))
What type is a condition?
Description
Use cnd_type()
to check what type a condition is.
Usage
cnd_type(cnd)
Arguments
cnd |
A condition object. |
Value
A string, either "condition"
, "message"
, "warning"
,
"error"
or "interrupt"
.
Examples
cnd_type(catch_cnd(abort("Abort!")))
cnd_type(catch_cnd(interrupt()))
Advanced defusal operators
Description
These advanced operators defuse R expressions.
expr()
, enquo()
, and enquos()
are sufficient for most
purposes but rlang provides these other operations, either for
completeness or because they are useful to experts.
-
exprs()
is the plural variant ofexpr()
. It returns a list of expressions. It is likebase::alist()
but with injection support. -
quo()
andquos()
are likeexpr()
andexprs()
but return quosures instead of naked expressions. When you are defusing your own local expressions (as opposed to function arguments where non-local expressions are supplied by your users), there is generally no need to attach the current environment in a quosure. See What are quosures and when are they needed?. -
enexpr()
andenexprs()
are likeenquo()
andenquos()
but return naked expressions instead of quosures. These operators should very rarely be used because they lose track of the environment of defused arguments. -
ensym()
andensyms()
are likeenexpr()
andenexprs()
but they throw an error when the defused expressions are not simple symbols. They also support strings which are interpreted as symbols. These functions are modelled on the behaviour of the left-hand side of=
and<-
where you can supply symbols and strings interchangeably."foo" <- NULL list("foo" = NULL)
-
enquo0()
andenquos0()
are likeenquo()
andenquos()
but without injection support. The injection operators!!
,!!!
, and{{
are not processed, instead they are preserved in the defused expression. This makes it possible to defuse expressions that potentially contain injection operators meant for later use. The trade off is that it makes it harder for users to inject expressions in your function. They have to enable injection explicitly withinject()
.None of the features of dynamic dots are available when defusing with
enquos0()
. For instance, trailing empty arguments are not automatically trimmed.
Usage
enexpr(arg)
exprs(
...,
.named = FALSE,
.ignore_empty = c("trailing", "none", "all"),
.unquote_names = TRUE
)
enexprs(
...,
.named = FALSE,
.ignore_empty = c("trailing", "none", "all"),
.ignore_null = c("none", "all"),
.unquote_names = TRUE,
.homonyms = c("keep", "first", "last", "error"),
.check_assign = FALSE
)
ensym(arg)
ensyms(
...,
.named = FALSE,
.ignore_empty = c("trailing", "none", "all"),
.ignore_null = c("none", "all"),
.unquote_names = TRUE,
.homonyms = c("keep", "first", "last", "error"),
.check_assign = FALSE
)
quo(expr)
quos(
...,
.named = FALSE,
.ignore_empty = c("trailing", "none", "all"),
.unquote_names = TRUE
)
enquo0(arg)
enquos0(...)
Arguments
arg |
An unquoted argument name. The expression supplied to that argument is defused and returned. |
... |
For |
.named |
If |
.ignore_empty |
Whether to ignore empty arguments. Can be one
of |
.unquote_names |
Whether to treat |
.ignore_null |
Whether to ignore unnamed null arguments. Can be
|
.homonyms |
How to treat arguments with the same name. The
default, |
.check_assign |
Whether to check for |
expr |
An expression to defuse. |
Examples
# `exprs()` is the plural variant of `expr()`
exprs(foo, bar, bar)
# `quo()` and `quos()` are the quosure variants of `expr()` and `exprs()`
quo(foo)
quos(foo, bar)
# `enexpr()` and `enexprs()` are the naked variants of `enquo()` and `enquos()`
my_function1 <- function(arg) enexpr(arg)
my_function2 <- function(arg, ...) enexprs(arg, ...)
my_function1(1 + 1)
my_function2(1 + 1, 10 * 2)
# `ensym()` and `ensyms()` are symbol variants of `enexpr()` and `enexprs()`
my_function3 <- function(arg) ensym(arg)
my_function4 <- function(arg, ...) ensyms(arg, ...)
# The user must supply symbols
my_function3(foo)
my_function4(foo, bar)
# Complex expressions are an error
try(my_function3(1 + 1))
try(my_function4(1 + 1, 10 * 2))
# `enquo0()` and `enquos0()` disable injection operators
automatic_injection <- function(x) enquo(x)
no_injection <- function(x) enquo0(x)
automatic_injection(foo(!!!1:3))
no_injection(foo(!!!1:3))
# Injection can still be done explicitly
inject(no_injection(foo(!!!1:3)))
Development notes - dots.R
Description
Development notes - dots.R
.__error_call__.
flag in dots collectors
Dots collectors like dots_list()
are a little tricky because they
may error out in different situations. Do we want to forward the
context, i.e. set the call flag to the calling environment?
Collectors throw errors in these cases:
While checking their own parameters, in which case the relevant context is the collector itself and we don't forward.
While collecting the dots, during evaluation of the supplied arguments. In this case forwarding or not is irrelevant because expressions in
...
are evaluated in their own environment which is not connected to the collector's context.While collecting the dots, during argument constraints checks such as determined by the
.homonyms
argument. In this case we want to forward the context because the caller of the dots collector is the one who determines the constraints for its users.
Box a final value for early termination
Description
A value boxed with done()
signals to its caller that it
should stop iterating. Use it to short-circuit a loop.
Usage
done(x)
is_done_box(x, empty = NULL)
Arguments
x |
For |
empty |
Whether the box is empty. If |
Value
A boxed value.
Examples
done(3)
x <- done(3)
is_done_box(x)
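A fuller sketch of the short-circuiting pattern (my_reduce() below is a hypothetical helper written for illustration):
my_reduce <- function(xs, fn, init) {
  acc <- init
  for (x in xs) {
    acc <- fn(acc, x)
    # Stop as soon as the accumulator comes back boxed with done()
    if (is_done_box(acc)) {
      return(unbox(acc))
    }
  }
  acc
}
# Stop summing once the total exceeds 10:
my_reduce(1:100, function(acc, x) if (acc + x > 10) done(acc) else acc + x, 0)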
.data
and .env
pronouns
Description
The .data
and .env
pronouns make it explicit where to find
objects when programming with data-masked
functions.
m <- 10

mtcars %>% mutate(disp = .data$disp * .env$m)
-
.data
retrieves data-variables from the data frame. -
.env
retrieves env-variables from the environment.
Because the lookup is explicit, there is no ambiguity between both kinds of variables. Compare:
disp <- 10

mtcars %>% mutate(disp = .data$disp * .env$disp)

mtcars %>% mutate(disp = disp * disp)
Note that .data
is only a pronoun, it is not a real data
frame. This means that you can't take its names or map a function
over the contents of .data
. Similarly, .env
is not an actual R
environment. For instance, it doesn't have a parent and the
subsetting operators behave differently.
.data
versus the magrittr pronoun .
In a magrittr pipeline, .data
is not necessarily interchangeable with the magrittr pronoun `.`.
With grouped data frames in particular, .data
represents the
current group slice whereas the pronoun .
represents the whole
data frame. Always prefer using .data
in data-masked context.
Where does .data
live?
The .data
pronoun is automatically created for you by
data-masking functions using the tidy eval framework.
You don't need to import rlang::.data
or use library(rlang)
to
work with this pronoun.
However, the .data
object exported from rlang is useful to import
in your package namespace to avoid an R CMD check
note when
referring to objects from the data mask. R does not have any way of
knowing about the presence or absence of .data
in a particular
scope so you need to import it explicitly or equivalently declare
it with utils::globalVariables(".data")
.
Note that rlang::.data
is a "fake" pronoun. Do not refer to
rlang::.data
with the rlang::
qualifier in data masking
code. Use the unqualified .data
symbol that is automatically put
in scope by data-masking functions.
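As a sketch, assuming a roxygen2 workflow, either of the following declarations avoids the note; my_summary() and the dplyr call are illustrative only:
# In one of your package's R files:
#' @importFrom rlang .data
NULL
# Or, without importing anything:
utils::globalVariables(".data")
# Then refer to the pronoun unqualified in data-masked code:
my_summary <- function(df) {
  dplyr::summarise(df, mean_disp = mean(.data$disp))
}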
How many arguments are currently forwarded in dots?
Description
This returns the number of arguments currently forwarded in ...
as an integer.
Usage
dots_n(...)
Arguments
... |
Forwarded arguments. |
Examples
fn <- function(...) dots_n(..., baz)
fn(foo, bar)
Splice lists
Description
dots_splice()
is like dots_list()
but automatically splices
list inputs.
Usage
dots_splice(
...,
.ignore_empty = c("trailing", "none", "all"),
.preserve_empty = FALSE,
.homonyms = c("keep", "first", "last", "error"),
.check_assign = FALSE
)
Arguments
... |
Arguments to collect in a list. These dots are dynamic. |
.ignore_empty |
Whether to ignore empty arguments. Can be one
of |
.preserve_empty |
Whether to preserve the empty arguments that
were not ignored. If |
.homonyms |
How to treat arguments with the same name. The
default, |
.check_assign |
Whether to check for |
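A quick sketch of the automatic splicing (compare with dots_list(), where lists are only spliced on request with !!!):
dots_splice(1, list(2, 3), 4)
dots_list(1, list(2, 3), 4)
dots_list(1, !!!list(2, 3), 4)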
Evaluate dots with preliminary splicing
Description
This is a tool for advanced users. It captures dots, processes
unquoting and splicing operators, and evaluates them. Unlike
dots_list()
, it does not flatten spliced objects; instead they
are attributed a spliced
class (see splice()
). You can process
spliced objects manually, perhaps with a custom predicate (see
flatten_if()
).
Usage
dots_values(
...,
.ignore_empty = c("trailing", "none", "all"),
.preserve_empty = FALSE,
.homonyms = c("keep", "first", "last", "error"),
.check_assign = FALSE
)
Arguments
... |
Arguments to evaluate and process splicing operators. |
.ignore_empty |
Whether to ignore empty arguments. Can be one
of |
.preserve_empty |
Whether to preserve the empty arguments that
were not ignored. If |
.homonyms |
How to treat arguments with the same name. The
default, |
.check_assign |
Whether to check for |
Examples
dots <- dots_values(!!! list(1, 2), 3)
dots
# Flatten the objects marked as spliced:
flatten_if(dots, is_spliced)
Duplicate an R object
Description
duplicate()
is an interface to the C-level duplicate()
and
shallow_duplicate()
functions. It is mostly meant for users of
the C API of R, e.g. for debugging, experimenting, or prototyping C
code in R.
Usage
duplicate(x, shallow = FALSE)
Arguments
x |
An R object. Uncopyable objects like symbols and
environments are returned as is (just like with |
shallow |
Recursive data structures like lists, calls and pairlists are duplicated in full by default. A shallow copy only duplicates the top-level data structure. |
See Also
pairlist
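A minimal sketch of the difference between deep and shallow copies, using obj_address() to compare addresses:
x <- list(1:3)
# A deep copy duplicates the list and its elements
deep <- duplicate(x)
obj_address(deep) == obj_address(x)
obj_address(deep[[1]]) == obj_address(x[[1]])
# A shallow copy duplicates only the top-level list; elements are shared
shallow <- duplicate(x, shallow = TRUE)
obj_address(shallow[[1]]) == obj_address(x[[1]])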
Dynamic dots features
Description
The base ...
syntax supports:
-
Forwarding arguments from function to function, matching them along the way to arguments.
-
Collecting arguments inside data structures, e.g. with
c()
orlist()
.
Dynamic dots offer a few additional features, injection in particular:
You can splice arguments saved in a list with the splice operator
!!!
.You can inject names with glue syntax on the left-hand side of
:=
.Trailing commas are ignored, making it easier to copy and paste lines of arguments.
Add dynamic dots support in your functions
If your function takes dots, adding support for dynamic features is
as easy as collecting the dots with list2()
instead of list()
.
See also dots_list()
, which offers more control over the collection.
In general, passing ...
to a function that supports dynamic dots
causes your function to inherit the dynamic behaviour.
In packages, document dynamic dots with this standard tag:
@param ... <[`dynamic-dots`][rlang::dyn-dots]> What these dots do.
Examples
f <- function(...) {
out <- list2(...)
rev(out)
}
# Trailing commas are ignored
f(this = "that", )
# Splice lists of arguments with `!!!`
x <- list(alpha = "first", omega = "last")
f(!!!x)
# Inject a name using glue syntax
if (is_installed("glue")) {
nm <- "key"
f("{nm}" := "value")
f("prefix_{nm}" := "value")
}
Embrace operator {{
Description
The embrace operator {{
is used to create functions that call
other data-masking functions. It transports a
data-masked argument (an argument that can refer to columns of a
data frame) from one function to another.
my_mean <- function(data, var) {
  dplyr::summarise(data, mean = mean({{ var }}))
}
Under the hood
{{
combines enquo()
and !!
in one
step. The snippet above is equivalent to:
my_mean <- function(data, var) {
  var <- enquo(var)
  dplyr::summarise(data, mean = mean(!!var))
}
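A usage sketch (assuming dplyr is installed):
if (is_installed("dplyr")) {
  my_mean(mtcars, cyl)
  my_mean(mtcars, disp / 2)
}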
See Also
Get the empty environment
Description
The empty environment is the only one that does not have a parent.
It is always used as the tail of an environment chain such as the
search path (see search_envs()
).
Usage
empty_env()
Examples
# Create environments with nothing in scope:
child_env(empty_env())
Defuse function arguments with glue
Description
englue()
creates a string with the glue operators {
and {{
. These operators are
normally used to inject names within dynamic dots.
englue()
makes them available anywhere within a function.
englue()
must be used inside a function. englue("{{ var }}")
defuses the argument var
and transforms it to a
string using the default name operation.
Usage
englue(x, env = caller_env(), error_call = current_env(), error_arg = "x")
Arguments
x |
A string to interpolate with glue operators. |
env |
User environment where the interpolation data lives in
case you're wrapping |
error_call |
The execution environment of a currently
running function, e.g. |
error_arg |
An argument name as a string. This argument will be mentioned in error messages as the input that is at the origin of a problem. |
Details
englue("{{ var }}")
is equivalent to as_label(enquo(var))
. It
defuses the argument var
and transforms the expression to a
string with as_label()
.
In dynamic dots, using only {
is allowed. In englue()
you must
use {{
at least once. Use glue::glue()
for simple
interpolation.
Before using englue()
in a package, first ensure that glue is
installed by adding it to your Imports:
section.
usethis::use_package("glue", "Imports")
Wrapping englue()
You can provide englue semantics to a user-provided string by supplying env
.
In this example we create a variant of englue()
that supports a
special .qux
pronoun by:
Creating an environment
masked_env
that inherits from the user env, the one where their data lives.Overriding the
error_arg
anderror_call
arguments to point to our own argument name and call environment. This pattern is slightly different from usual error context passing becauseenglue()
is a backend function that uses its own error context by default (and not a checking function that uses your error context by default).
my_englue <- function(text) {
  masked_env <- env(caller_env(), .qux = "QUX")

  englue(
    text,
    env = masked_env,
    error_arg = "text",
    error_call = current_env()
  )
}

# Users can then use your wrapper as they would use `englue()`:
fn <- function(x) {
  foo <- "FOO"
  my_englue("{{ x }}_{.qux}_{foo}")
}

fn(bar)
#> [1] "bar_QUX_FOO"
If you are creating a low-level package on top of englue(), you
should also consider exposing env
, error_arg
and error_call
in your englue()
wrapper so users can wrap your wrapper.
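A minimal sketch of such a re-wrappable wrapper (the name my_englue2() is made up; the defaults are simply forwarded so callers can override them):
my_englue2 <- function(text,
                       env = caller_env(),
                       error_arg = "text",
                       error_call = caller_env()) {
  englue(
    text,
    env = env,
    error_arg = error_arg,
    error_call = error_call
  )
}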
See Also
Examples
g <- function(var) englue("{{ var }}")
g(cyl)
g(1 + 1)
g(!!letters)
# These are equivalent to
as_label(quote(cyl))
as_label(quote(1 + 1))
as_label(letters)
Defuse function arguments
Description
enquo()
and enquos()
defuse function arguments.
A defused expression can be examined, modified, and injected into
other expressions.
Defusing function arguments is useful for:
Creating data-masking functions.
Interfacing with another data-masking function using the defuse-and-inject pattern.
These are advanced tools. Make sure to first learn about the embrace
operator {{
in Data mask programming patterns.
{{
is easier to work with, requires less theory, and is sufficient
in most applications.
Usage
enquo(arg)
enquos(
...,
.named = FALSE,
.ignore_empty = c("trailing", "none", "all"),
.ignore_null = c("none", "all"),
.unquote_names = TRUE,
.homonyms = c("keep", "first", "last", "error"),
.check_assign = FALSE
)
Arguments
arg |
An unquoted argument name. The expression supplied to that argument is defused and returned. |
... |
Names of arguments to defuse. |
.named |
If |
.ignore_empty |
Whether to ignore empty arguments. Can be one
of |
.ignore_null |
Whether to ignore unnamed null arguments. Can be
|
.unquote_names |
Whether to treat |
.homonyms |
How to treat arguments with the same name. The
default, |
.check_assign |
Whether to check for |
Value
enquo()
returns a quosure and enquos()
returns a list of quosures.
Implicit injection
Arguments defused with enquo()
and enquos()
automatically gain
injection support.
my_mean <- function(data, var) {
  var <- enquo(var)
  dplyr::summarise(data, mean(!!var))
}

# Can now use `!!` and `{{`
my_mean(mtcars, !!sym("cyl"))
See enquo0()
and enquos0()
for variants that don't enable
injection.
See Also
-
Defusing R expressions for an overview.
-
expr()
to defuse your own local expressions. -
base::eval()
andeval_bare()
for resuming evaluation of a defused expression.
Examples
# `enquo()` defuses the expression supplied by your user
f <- function(arg) {
enquo(arg)
}
f(1 + 1)
# `enquos()` works with arguments and dots. It returns a list of
# expressions
f <- function(...) {
enquos(...)
}
f(1 + 1, 2 * 10)
# `enquo()` and `enquos()` enable _injection_ and _embracing_ for
# your users
g <- function(arg) {
f({{ arg }} * 2)
}
g(100)
column <- sym("cyl")
g(!!column)
Add backtrace from error handler
Description
entrace()
is a low level function. See global_entrace()
for a
user-friendly way of enriching errors and other conditions from
your RProfile.
-
entrace()
is meant to be used as a global handler. It enriches conditions with a backtrace. Errors are saved tolast_error()
and rethrown immediately. Messages and warnings are recorded intolast_messages()
andlast_warnings()
and let through. -
cnd_entrace()
adds a backtrace to a condition object, without any other effect. It should be called from a condition handler.
entrace()
also works as an option(error = )
handler for
compatibility with versions of R older than 4.0.
When used as a calling handler, rlang trims the handler invocation context from the backtrace.
Usage
entrace(cnd, ..., top = NULL, bottom = NULL)
cnd_entrace(cnd, ..., top = NULL, bottom = NULL)
Arguments
cnd |
When |
... |
Unused. These dots are for future extensions. |
top |
The first frame environment to be included in the backtrace. This becomes the top of the backtrace tree and represents the oldest call in the backtrace. This is needed in particular when you call If not supplied, the |
bottom |
The last frame environment to be included in the backtrace. This becomes the rightmost leaf of the backtrace tree and represents the youngest call in the backtrace. Set this when you would like to capture a backtrace without the capture context. Can also be an integer that will be passed to |
See Also
global_entrace()
for configuring errors with
entrace()
. cnd_entrace()
to manually add a backtrace to a
condition.
Examples
quote({ # Not run
# Set `entrace()` globally in your RProfile
globalCallingHandlers(error = rlang::entrace)
# On older R versions which don't feature `globalCallingHandlers`,
# set the error handler like this:
options(error = rlang::entrace)
})
Create a new environment
Description
These functions create new environments.
-
env()
creates a child of the current environment by default and takes a variable number of named objects to populate it. -
new_environment()
creates a child of the empty environment by default and takes a named list of objects to populate it.
Usage
env(...)
new_environment(data = list(), parent = empty_env())
Arguments
... , data |
<dynamic> Named values. You can supply one unnamed argument to specify a custom parent; otherwise it defaults to the current environment. |
parent |
A parent environment. |
Environments as objects
Environments are containers of uniquely named objects. Their most common use is to provide a scope for the evaluation of R expressions. Not all languages have first-class environments, i.e. the ability to manipulate scope as regular objects. Reification of scope is one of the most powerful features of R as it allows you to change what objects a function or expression sees when it is evaluated.
Environments also constitute a data structure in their own right. They are a collection of uniquely named objects, subsettable by name and modifiable by reference. This latter property (see section on reference semantics) is especially useful for creating mutable OO systems (cf the R6 package and the ggproto system for extending ggplot2).
Inheritance
All R environments (except the empty environment) are defined with a parent environment. An environment and its grandparents thus form a linear hierarchy that is the basis for lexical scoping in R. When R evaluates an expression, it looks up symbols in a given environment. If it cannot find these symbols there, it keeps looking them up in parent environments. This way, objects defined in child environments have precedence over objects defined in parent environments.
The ability to override specific definitions is used in the
tidyeval framework to create powerful domain-specific grammars. A
common use of masking is to put data frame columns in scope. See
for example as_data_mask()
.
Reference semantics
Unlike regular objects such as vectors, environments are an
uncopyable object type. This means that if you
have multiple references to a given environment (by assigning the
environment to another symbol with <-
or passing the environment
as argument to a function), modifying the bindings of one of those
references changes all other references as well.
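A minimal sketch of these reference semantics:
e1 <- env(x = 1)
e2 <- e1
# Modifying the environment through one reference...
env_bind(e1, x = 2)
# ...is visible through the other reference as well
e2$x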
See Also
Examples
# env() creates a new environment that inherits from the current
# environment by default
env <- env(a = 1, b = "foo")
env$b
identical(env_parent(env), current_env())
# Supply one unnamed argument to inherit from another environment:
env <- env(base_env(), a = 1, b = "foo")
identical(env_parent(env), base_env())
# Both env() and child_env() support tidy dots features:
objs <- list(b = "foo", c = "bar")
env <- env(a = 1, !!! objs)
env$c
# You can also unquote names with the definition operator `:=`
var <- "a"
env <- env(!!var := "A")
env$a
# Use new_environment() to create containers with the empty
# environment as parent:
env <- new_environment()
env_parent(env)
# Like other new_ constructors, it takes an object rather than dots:
new_environment(list(a = "foo", b = "bar"))
Bind symbols to objects in an environment
Description
These functions create bindings in an environment. The bindings are
supplied through ...
as pairs of names and values or expressions.
env_bind()
is equivalent to evaluating a <-
expression within
the given environment. This function should take care of the
majority of use cases but the other variants can be useful for
specific problems.
-
env_bind()
takes named values which are bound in.env
.env_bind()
is equivalent tobase::assign()
. -
env_bind_active()
takes named functions and creates active bindings in.env
. This is equivalent tobase::makeActiveBinding()
. An active binding executes a function each time it is evaluated. The arguments are passed toas_function()
so you can supply formulas instead of functions.Remember that functions are scoped in their own environment. These functions can thus refer to symbols from this enclosure that are not actually in scope in the dynamic environment where the active bindings are invoked. This allows creative solutions to difficult problems (see the implementations of
dplyr::do()
methods for an example). -
env_bind_lazy()
takes named expressions. This is equivalent tobase::delayedAssign()
. The arguments are captured withexprs()
(and thus support call-splicing and unquoting) and assigned to symbols in.env
. These expressions are not evaluated immediately but lazily. Once a symbol is evaluated, the corresponding expression is evaluated in turn and its value is bound to the symbol (the expressions are thus evaluated only once, if at all). -
%<~%
is a shortcut forenv_bind_lazy()
. It works like<-
but the RHS is evaluated lazily.
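A quick sketch of %<~%: the right-hand side only runs the first time the binding is used.
x %<~% {
  cat("forced!\n")
  42
}
x  # prints "forced!" then returns 42
x  # the value is now cached, nothing is printed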
Usage
env_bind(.env, ...)
env_bind_lazy(.env, ..., .eval_env = caller_env())
env_bind_active(.env, ...)
lhs %<~% rhs
Arguments
.env |
An environment. |
... |
<dynamic> Named objects ( |
.eval_env |
The environment where the expressions will be evaluated when the symbols are forced. |
lhs |
The variable name to which |
rhs |
An expression lazily evaluated and assigned to |
Value
The input object .env
, with its associated environment
modified in place, invisibly.
Side effects
Since environments have reference semantics (see relevant section
in env()
documentation), modifying the bindings of an environment
produces effects in all other references to that environment. In
other words, env_bind()
and its variants have side effects.
Like other side-effecty functions like par()
and options()
,
env_bind()
and variants return the old values invisibly.
See Also
env_poke()
for binding a single element.
Examples
# env_bind() is a programmatic way of assigning values to symbols
# with `<-`. We can add bindings in the current environment:
env_bind(current_env(), foo = "bar")
foo
# Or modify those bindings:
bar <- "bar"
env_bind(current_env(), bar = "BAR")
bar
# You can remove bindings by supplying zap sentinels:
env_bind(current_env(), foo = zap())
try(foo)
# Unquote-splice a named list of zaps
zaps <- rep_named(c("foo", "bar"), list(zap()))
env_bind(current_env(), !!!zaps)
try(bar)
# It is most useful to change other environments:
my_env <- env()
env_bind(my_env, foo = "foo")
my_env$foo
# A useful feature is to splice lists of named values:
vals <- list(a = 10, b = 20)
env_bind(my_env, !!!vals, c = 30)
my_env$b
my_env$c
# You can also unquote a variable referring to a symbol or a string
# as binding name:
var <- "baz"
env_bind(my_env, !!var := "BAZ")
my_env$baz
# The old values of the bindings are returned invisibly:
old <- env_bind(my_env, a = 1, b = 2, baz = "baz")
old
# You can restore the original environment state by supplying the
# old values back:
env_bind(my_env, !!!old)
# env_bind_lazy() assigns expressions lazily:
env <- env()
env_bind_lazy(env, name = { cat("forced!\n"); "value" })
# Referring to the binding will cause evaluation:
env$name
# But only once, subsequent references yield the final value:
env$name
# You can unquote expressions:
expr <- quote(message("forced!"))
env_bind_lazy(env, name = !!expr)
env$name
# By default the expressions are evaluated in the current
# environment. For instance we can create a local binding and refer
# to it, even though the variable is bound in a different
# environment:
who <- "mickey"
env_bind_lazy(env, name = paste(who, "mouse"))
env$name
# You can specify another evaluation environment with `.eval_env`:
eval_env <- env(who = "minnie")
env_bind_lazy(env, name = paste(who, "mouse"), .eval_env = eval_env)
env$name
# Or by unquoting a quosure:
quo <- local({
who <- "fievel"
quo(paste(who, "mouse"))
})
env_bind_lazy(env, name = !!quo)
env$name
# You can create active bindings with env_bind_active(). Active
# bindings execute a function each time they are evaluated:
fn <- function() {
cat("I have been called\n")
rnorm(1)
}
env <- env()
env_bind_active(env, symbol = fn)
# `fn` is executed each time `symbol` is evaluated or retrieved:
env$symbol
env$symbol
eval_bare(quote(symbol), env)
eval_bare(quote(symbol), env)
# All arguments are passed to as_function() so you can use the
# formula shortcut:
env_bind_active(env, foo = ~ runif(1))
env$foo
env$foo
What kind of environment binding?
Description
env_binding_are_active() and env_binding_are_lazy() detect which of the bindings in env (all of them by default, or the subset named in nms) are active bindings or lazily evaluated promises.
Usage
env_binding_are_active(env, nms = NULL)
env_binding_are_lazy(env, nms = NULL)
Arguments
env |
An environment. |
nms |
Names of bindings. Defaults to all bindings in |
Value
A logical vector as long as nms
and named after it.
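A quick sketch:
e <- env()
env_bind(e, plain = 1)
env_bind_lazy(e, lazy = cat("forced\n"))
env_bind_active(e, active = ~ runif(1))
env_binding_are_active(e)
env_binding_are_lazy(e)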
Lock or unlock environment bindings
Description
Locked environment bindings trigger an error when an attempt is made to redefine the binding.
Usage
env_binding_lock(env, nms = NULL)
env_binding_unlock(env, nms = NULL)
env_binding_are_locked(env, nms = NULL)
Arguments
env |
An environment. |
nms |
Names of bindings. Defaults to all bindings in |
Value
env_binding_are_unlocked()
returns a logical vector as
long as nms
and named after it. env_binding_lock()
and
env_binding_unlock()
return the old value of
env_binding_are_unlocked()
invisibly.
See Also
env_lock()
for locking an environment.
Examples
# Bindings are unlocked by default:
env <- env(a = "A", b = "B")
env_binding_are_locked(env)
# But can optionally be locked:
env_binding_lock(env, "a")
env_binding_are_locked(env)
# If run, the following would now return an error because `a` is locked:
# env_bind(env, a = "foo")
# with_env(env, a <- "bar")
# Let's unlock it. Note that the return value indicates which
# bindings were locked:
were_locked <- env_binding_unlock(env)
were_locked
# Now that it is unlocked we can modify it again:
env_bind(env, a = "foo")
with_env(env, a <- "bar")
env$a
Browse environments
Description
-
env_browse(env)
is equivalent to evaluatingbrowser()
inenv
. It persistently sets the environment for step-debugging. Supplyvalue = FALSE
to disable browsing. -
env_is_browsed()
is a predicate that inspects whether an environment is being browsed.
Usage
env_browse(env, value = TRUE)
env_is_browsed(env)
Arguments
env |
An environment. |
value |
Whether to browse |
Value
env_browse()
returns the previous value of
env_is_browsed()
(a logical), invisibly.
Mask bindings by defining symbols deeper in a scope
Description
This function is superseded. Please use env()
(and possibly
set_env()
if you're masking the bindings for another object like
a closure or a formula) instead.
env_bury()
is like env_bind()
but it creates the bindings in a
new child environment. This makes sure the new bindings have
precedence over old ones, without altering existing environments.
Unlike env_bind()
, this function does not have side effects and
returns a new environment (or object wrapping that environment).
Usage
env_bury(.env, ...)
Arguments
.env |
An environment. |
... |
<dynamic> Named objects ( |
Value
A copy of .env
enclosing the new environment containing
bindings to ...
arguments.
See Also
Examples
orig_env <- env(a = 10)
fn <- set_env(function() a, orig_env)
# fn() currently sees `a` as the value `10`:
fn()
# env_bury() will bury the current scope of fn() behind a new
# environment:
fn <- env_bury(fn, a = 1000)
fn()
# Even though the symbol `a` is still defined deeper in the scope:
orig_env$a
Cache a value in an environment
Description
env_cache()
is a wrapper around env_get()
and env_poke()
designed to retrieve a cached value from env
.
If the
nm
binding exists, it returns its value.Otherwise, it stores the default value in
env
and returns that.
Usage
env_cache(env, nm, default)
Arguments
env |
An environment. |
nm |
Name of binding, a string. |
default |
The default value to store in |
Value
Either the value of nm
or default
if it did not exist
yet.
Examples
e <- env(a = "foo")
# Returns existing binding
env_cache(e, "a", "default")
# Creates a `b` binding and returns its default value
env_cache(e, "b", "default")
# Now `b` is defined
e$b
Clone or coalesce an environment
Description
-
env_clone()
creates a new environment containing exactly the same bindings as the input, optionally with a new parent. -
env_coalesce()
copies binding from the RHS environment into the LHS. If the RHS already contains bindings with the same name as in the LHS, those are kept as is.
Both these functions preserve active bindings and promises (the latter are only preserved on R >= 4.0.0).
Usage
env_clone(env, parent = env_parent(env))
env_coalesce(env, from)
Arguments
env |
An environment. |
parent |
The parent of the cloned environment. |
from |
Environment to copy bindings from. |
Examples
# A clone initially contains the same bindings as the original
# environment
env <- env(a = 1, b = 2)
clone <- env_clone(env)
env_print(clone)
env_print(env)
# But it can acquire new bindings or change existing ones without
# impacting the original environment
env_bind(clone, a = "foo", c = 3)
env_print(clone)
env_print(env)
# `env_coalesce()` copies bindings from one environment to another
lhs <- env(a = 1)
rhs <- env(a = "a", b = "b", c = "c")
env_coalesce(lhs, rhs)
env_print(lhs)
# To copy all the bindings from `rhs` into `lhs`, first delete the
# conflicting bindings from `rhs`
env_unbind(lhs, env_names(rhs))
env_coalesce(lhs, rhs)
env_print(lhs)
Depth of an environment chain
Description
This function returns the number of environments between env
and
the empty environment, including env
. The depth of
env
is also the number of parents of env
(since the empty
environment counts as a parent).
Usage
env_depth(env)
Arguments
env |
An environment. |
Value
An integer.
See Also
The section on inheritance in env()
documentation.
Examples
env_depth(empty_env())
env_depth(pkg_env("rlang"))
Get an object in an environment
Description
env_get()
extracts an object from an environment env
. By
default, it does not look in the parent environments.
env_get_list()
extracts multiple objects from an environment into
a named list.
Usage
env_get(env = caller_env(), nm, default, inherit = FALSE, last = empty_env())
env_get_list(
env = caller_env(),
nms,
default,
inherit = FALSE,
last = empty_env()
)
Arguments
env |
An environment. |
nm |
Name of binding, a string. |
default |
A default value in case there is no binding for |
inherit |
Whether to look for bindings in the parent environments. |
last |
Last environment inspected when |
nms |
Names of bindings, a character vector. |
Value
An object if it exists. Otherwise, throws an error.
See Also
env_cache()
for a variant of env_get()
designed to
cache a value in an environment.
Examples
parent <- child_env(NULL, foo = "foo")
env <- child_env(parent, bar = "bar")
# This throws an error because `foo` is not directly defined in env:
# env_get(env, "foo")
# However `foo` can be fetched in the parent environment:
env_get(env, "foo", inherit = TRUE)
# You can also avoid an error by supplying a default value:
env_get(env, "foo", default = "FOO")
Does an environment have or see bindings?
Description
env_has()
is a vectorised predicate that queries whether an
environment owns bindings personally (with inherit
set to
FALSE
, the default), or sees them in its own environment or in
any of its parents (with inherit = TRUE
).
Usage
env_has(env = caller_env(), nms, inherit = FALSE)
Arguments
env |
An environment. |
nms |
A character vector of binding names for which to check existence. |
inherit |
Whether to look for bindings in the parent environments. |
Value
A named logical vector as long as nms
.
Examples
parent <- child_env(NULL, foo = "foo")
env <- child_env(parent, bar = "bar")
# env does not own `foo` but sees it in its parent environment:
env_has(env, "foo")
env_has(env, "foo", inherit = TRUE)
Does environment inherit from another environment?
Description
This returns TRUE
if x
has ancestor
among its parents.
Usage
env_inherits(env, ancestor)
Arguments
env |
An environment. |
ancestor |
Another environment from which |
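A quick sketch:
grandparent <- env()
parent <- env(grandparent)
child <- env(parent)
env_inherits(child, grandparent)
env_inherits(grandparent, child)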
Is frame environment user facing?
Description
Detects if env
is user-facing, that is, whether it's an environment
that inherits from:
The global environment, as would happen when called interactively
A package that is currently being tested
If either is true, we consider env
to belong to an evaluation
frame that was called directly by the end user. This is by
contrast to indirect calls by third party functions which are not
user facing.
For instance the lifecycle package
uses env_is_user_facing()
to figure out whether a deprecated function
was called directly or indirectly, and select an appropriate
verbosity level as a function of that.
Usage
env_is_user_facing(env)
Arguments
env |
An environment. |
Escape hatch
You can override the return value of env_is_user_facing()
by
setting the global option "rlang_user_facing"
to:
-
TRUE
orFALSE
. A package name as a string. Then
env_is_user_facing(x)
returnsTRUE
ifx
inherits from the namespace corresponding to that package name.
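For instance, a sketch of forcing the non-user-facing branch with the global option (using base options() here):
fn <- function() env_is_user_facing(caller_env())
old <- options(rlang_user_facing = FALSE)
fn()
options(old)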
Examples
fn <- function() {
env_is_user_facing(caller_env())
}
# Direct call of `fn()` from the global env
with(global_env(), fn())
# Indirect call of `fn()` from a package
with(ns_env("utils"), fn())
Lock an environment
Description
Locked environments cannot be modified. An important example is namespace environments which are locked by R when loaded in a session. Once an environment is locked it normally cannot be unlocked.
Note that only the environment as a container is locked, not the
individual bindings. You can't remove or add a binding but you can
still modify the values of existing bindings. See
env_binding_lock()
for locking individual bindings.
Usage
env_lock(env)
env_is_locked(env)
Arguments
env |
An environment. |
Value
The old value of env_is_locked()
invisibly.
See Also
Examples
# New environments are unlocked by default:
env <- env(a = 1)
env_is_locked(env)
# Use env_lock() to lock them:
env_lock(env)
env_is_locked(env)
# Now that `env` is locked, it is no longer possible to remove or
# add bindings. If run, the following would fail:
# env_unbind(env, "a")
# env_bind(env, b = 2)
# Note that even though the environment as a container is locked,
# the individual bindings are still unlocked and can be modified:
env$a <- 10
Label of an environment
Description
Special environments like the global environment have their own
names. env_name()
returns:
"global" for the global environment.
"empty" for the empty environment.
"base" for the base package environment (the last environment on the search path).
"namespace:pkg" if
env
is the namespace of the package "pkg".The
name
attribute ofenv
if it exists. This is how the package environments and the imports environments store their names. The name of package environments is typically "package:pkg".The empty string
""
otherwise.
env_label()
is exactly like env_name()
but returns the memory
address of anonymous environments as fallback.
Usage
env_name(env)
env_label(env)
Arguments
env |
An environment. |
Examples
# Some environments have specific names:
env_name(global_env())
env_name(ns_env("rlang"))
# Anonymous environments don't have names but are labelled by their
# address in memory:
env_name(env())
env_label(env())
Names and numbers of symbols bound in an environment
Description
env_names()
returns object names from an environment env
as a
character vector. All names are returned, even those starting with
a dot. env_length()
returns the number of bindings.
Usage
env_names(env)
env_length(env)
Arguments
env |
An environment. |
Value
A character vector of object names.
Names of symbols and objects
Technically, objects are bound to symbols rather than strings,
since the R interpreter evaluates symbols (see is_expression()
for a
discussion of symbolic objects versus literal objects). However it
is often more convenient to work with strings. In rlang
terminology, the string corresponding to a symbol is called the
name of the symbol (or by extension the name of an object bound
to a symbol).
Encoding
There are deep encoding issues when you convert a string to symbol
and vice versa. Symbols are always in the native encoding. If
that encoding (let's say latin1) cannot support some characters,
these characters are serialised to ASCII. That's why you sometimes
see strings looking like <U+1234>
, especially if you're running
Windows (as R doesn't support UTF-8 as native encoding on that
platform).
To alleviate some of the encoding pain, env_names()
always
returns a UTF-8 character vector (which is fine even on Windows)
with ASCII unicode points translated back to UTF-8.
Examples
env <- env(a = 1, b = 2)
env_names(env)
Get parent environments
Description
-
env_parent()
returns the parent environment ofenv
if called withn = 1
, the grandparent withn = 2
, etc. -
env_tail()
searches through the parents and returns the one which hasempty_env()
as parent. -
env_parents()
returns the list of all parents, including the empty environment. This list is named usingenv_name()
.
See the section on inheritance in env()
's documentation.
Usage
env_parent(env = caller_env(), n = 1)
env_tail(env = caller_env(), last = global_env())
env_parents(env = caller_env(), last = global_env())
Arguments
env |
An environment. |
n |
The number of generations to go up. |
last |
The environment at which to stop. Defaults to the global environment. The empty environment is always a stopping condition so it is safe to leave the default even when taking the tail or the parents of an environment on the search path.
|
Value
An environment for env_parent()
and env_tail()
, a list
of environments for env_parents()
.
Examples
# Get the parent environment with env_parent():
env_parent(global_env())
# Or the tail environment with env_tail():
env_tail(global_env())
# By default, env_parent() returns the parent environment of the
# current evaluation frame. If called at top-level (the global
# frame), the following two expressions are equivalent:
env_parent()
env_parent(base_env())
# This default is more handy when called within a function. In this
# case, the enclosure environment of the function is returned
# (since it is the parent of the evaluation frame):
enclos_env <- env()
fn <- set_env(function() env_parent(), enclos_env)
identical(enclos_env, fn())
Poke an object in an environment
Description
env_poke()
will assign or reassign a binding in env
if create
is TRUE
. If create
is FALSE
and a binding does not already
exists, an error is issued.
Usage
env_poke(env = caller_env(), nm, value, inherit = FALSE, create = !inherit)
Arguments
env |
An environment. |
nm |
Name of binding, a string. |
value |
The value for a new binding. |
inherit |
Whether to look for bindings in the parent environments. |
create |
Whether to create a binding if it does not already exist in the environment. |
Details
If inherit
is TRUE
, the parents environments are checked for
an existing binding to reassign. If not found and create
is
TRUE
, a new binding is created in env
. The default value for
create
is a function of inherit
: FALSE
when inheriting,
TRUE
otherwise.
This default makes sense because the inheriting case is mostly
for overriding an existing binding. If not found, something
probably went wrong and it is safer to issue an error. Note that
this is different to the base R operator <<-
which will create
a binding in the global environment instead of the current
environment when no existing binding is found in the parents.
Value
The old value of nm
or a zap sentinel if the
binding did not exist yet.
See Also
env_bind()
for binding multiple elements. env_cache()
for a variant of env_poke()
designed to cache values.
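A quick sketch of the create/inherit interplay (the binding names are arbitrary):
e <- env(a = 1)
# Reassign an existing binding, or create a new one (create = TRUE by default)
env_poke(e, "a", 2)
env_poke(e, "b", 3)
e$a
e$b
# With inherit = TRUE, create defaults to FALSE, so a binding found
# nowhere in the parents is an error rather than being created:
try(env_poke(e, "not_anywhere", 4, inherit = TRUE))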
Pretty-print an environment
Description
This prints:
The label and the parent label.
Whether the environment is locked.
The bindings in the environment (up to 20 bindings). They are printed succinctly using
pillar::type_sum()
(if available, otherwise uses an internal version of that generic). In addition fancy bindings (actives and promises) are indicated as such.Locked bindings get a
[L]
tag
Note that printing a package namespace (see ns_env()
) with
env_print()
will typically tag function bindings as <lazy>
until they are evaluated the first time. This is because package
functions are lazily-loaded from disk to improve performance when
loading a package.
Usage
env_print(env = caller_env())
Arguments
env |
An environment, or object that can be converted to an
environment by |
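A quick sketch:
e <- env(a = 1L, b = "banana", fn = function() NULL)
env_print(e)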
Remove bindings from an environment
Description
env_unbind()
is the complement of env_bind()
. Like env_has()
,
it ignores the parent environments of env
by default. Set
inherit
to TRUE
to track down bindings in parent environments.
Usage
env_unbind(env = caller_env(), nms, inherit = FALSE)
Arguments
env |
An environment. |
nms |
A character vector of binding names to remove. |
inherit |
Whether to look for bindings in the parent environments. |
Value
The input object env
with its associated environment
modified in place, invisibly.
Examples
env <- env(foo = 1, bar = 2)
env_has(env, c("foo", "bar"))
# Remove bindings with `env_unbind()`
env_unbind(env, c("foo", "bar"))
env_has(env, c("foo", "bar"))
# With inherit = TRUE, it removes bindings in parent environments
# as well:
parent <- env(empty_env(), foo = 1, bar = 2)
env <- env(parent, foo = "b")
env_unbind(env, "foo", inherit = TRUE)
env_has(env, c("foo", "bar"))
env_has(env, c("foo", "bar"), inherit = TRUE)
Unlock an environment
Description
This function should only be used in development tools or interactively.
Usage
env_unlock(env)
Arguments
env |
An environment. |
Value
Whether the environment has been unlocked.
Evaluate an expression in an environment
Description
eval_bare()
is a lower-level version of function base::eval()
.
Technically, it is a simple wrapper around the C function
Rf_eval()
. You generally don't need to use eval_bare()
instead
of eval()
. Its main advantage is that it handles stack-sensitive
calls (such as return()
, on.exit()
or parent.frame()
) more
consistently when you pass an environment of a frame on the call
stack.
Usage
eval_bare(expr, env = parent.frame())
Arguments
expr |
An expression to evaluate. |
env |
The environment in which to evaluate the expression. |
Details
These semantics are possible because eval_bare()
creates only one
frame on the call stack whereas eval()
creates two frames, the
second of which has the user-supplied environment as frame
environment. When you supply an existing frame environment to
base::eval()
there will be two frames on the stack with the same
frame environment. Stack-sensitive functions only detect the
topmost of these frames. We call these evaluation semantics
"stack inconsistent".
Evaluating expressions in the actual frame environment has useful
practical implications for eval_bare()
:
-
return()
calls are evaluated in frame environments that might be buried deep in the call stack. This causes a long return that unwinds multiple frames (triggering the on.exit()
event for each frame). By contrasteval()
only returns from theeval()
call, one level up. -
on.exit()
,parent.frame()
,sys.call()
, and generally all the stack inspection functionssys.xxx()
are evaluated in the correct frame environment. This is similar to how this type of calls can be evaluated deep in the call stack because of lazy evaluation, when you force an argument that has been passed around several times.
The flip side of the semantics of eval_bare()
is that it can't
evaluate break
or next
expressions even if called within a
loop.
See Also
eval_tidy()
for evaluation with data mask and quosure
support.
Examples
# eval_bare() works just like base::eval() but you have to create
# the evaluation environment yourself:
eval_bare(quote(foo), env(foo = "bar"))
# eval() has different evaluation semantics than eval_bare(). It
# can return from the supplied environment even if it's an
# environment that is not on the call stack (i.e. because you've
# created it yourself). The following would trigger an error with
# eval_bare():
ret <- quote(return("foo"))
eval(ret, env())
# eval_bare(ret, env()) # "no function to return from" error
# Another feature of eval() is that you can control surrounding loops:
bail <- quote(break)
while (TRUE) {
eval(bail)
# eval_bare(bail) # "no loop for break/next" error
}
# To explore the consequences of stack inconsistent semantics, let's
# create a function that evaluates `parent.frame()` deep in the call
# stack, in an environment corresponding to a frame in the middle of
# the stack. For consistency with R's lazy evaluation semantics, we'd
# expect to get the caller of that frame as result:
fn <- function(eval_fn) {
list(
returned_env = middle(eval_fn),
actual_env = current_env()
)
}
middle <- function(eval_fn) {
deep(eval_fn, current_env())
}
deep <- function(eval_fn, eval_env) {
expr <- quote(parent.frame())
eval_fn(expr, eval_env)
}
# With eval_bare(), we do get the expected environment:
fn(rlang::eval_bare)
# But that's not the case with base::eval():
fn(base::eval)
Evaluate an expression with quosures and pronoun support
Description
eval_tidy()
is a variant of base::eval()
that powers the tidy
evaluation framework. Like eval()
it accepts user data as
argument. Whereas eval()
simply transforms the data to an
environment, eval_tidy()
transforms it to a data mask with as_data_mask()
. Evaluating in a data
mask enables the following features:
-
Quosures. Quosures are expressions bundled with an environment. If
data
is supplied, objects in the data mask always have precedence over the quosure environment, i.e. the data masks the environment. -
Pronouns. If
data
is supplied, the.env
and.data
pronouns are installed in the data mask..env
is a reference to the calling environment and.data
refers to thedata
argument. These pronouns are an escape hatch for the data mask ambiguity problem.
Usage
eval_tidy(expr, data = NULL, env = caller_env())
Arguments
expr |
An expression or quosure to evaluate. |
data |
A data frame, or named list or vector. Alternatively, a
data mask created with |
env |
The environment in which to evaluate |
When should eval_tidy() be used instead of eval()?
base::eval()
is sufficient for simple evaluation. Use
eval_tidy()
when you'd like to support expressions referring to
the .data
pronoun, or when you need to support quosures.
If you're evaluating an expression captured with
injection support, it is recommended to use
eval_tidy()
because users may inject quosures.
Note that unwrapping a quosure with quo_get_expr()
does not
guarantee that there are no quosures inside the expression. Quosures
might be unquoted anywhere in the expression tree. For instance,
the following does not work reliably in the presence of nested
quosures:
my_quoting_fn <- function(x) {
  x <- enquo(x)
  expr <- quo_get_expr(x)
  env <- quo_get_env(x)
  eval(expr, env)
}

# Works:
my_quoting_fn(toupper(letters))

# Fails because of a nested quosure:
my_quoting_fn(toupper(!!quo(letters)))
Stack semantics of eval_tidy()
eval_tidy()
always evaluates in a data mask, even when data
is
NULL
. Because of this, it has different stack semantics than
base::eval()
:
Lexical side effects, such as assignment with
<-
, occur in the mask rather thanenv
.Functions that require the evaluation environment to correspond to a frame on the call stack do not work. This is why
return()
called from a quosure does not work.The mask environment creates a new branch in the tree representation of backtraces (which you can visualise in a
browser()
session withlobstr::cst()
).
See also eval_bare()
for more information about these differences.
See Also
-
new_data_mask()
andas_data_mask()
for manually creating data masks.
Examples
# With simple defused expressions eval_tidy() works the same way as
# eval():
fruit <- "apple"
vegetable <- "potato"
expr <- quote(paste(fruit, vegetable, sep = " or "))
expr
eval(expr)
eval_tidy(expr)
# Both accept a data mask as argument:
data <- list(fruit = "banana", vegetable = "carrot")
eval(expr, data)
eval_tidy(expr, data)
# The main difference is that eval_tidy() supports quosures:
with_data <- function(data, expr) {
quo <- enquo(expr)
eval_tidy(quo, data)
}
with_data(NULL, fruit)
with_data(data, fruit)
# eval_tidy() installs the `.data` and `.env` pronouns to allow
# users to be explicit about variable references:
with_data(data, .data$fruit)
with_data(data, .env$fruit)
Execute a function
Description
This function constructs and evaluates a call to .fn
.
It has two primary uses:
To call a function with arguments stored in a list (if the function doesn't support dynamic dots). Splice the list of arguments with
!!!
.To call every function stored in a list (in conjunction with
map()
/lapply()
)
Usage
exec(.fn, ..., .env = caller_env())
Arguments
.fn |
A function, or function name as a string. |
... |
<dynamic> Arguments for |
.env |
Environment in which to evaluate the call. This will be
most useful if |
Examples
args <- list(x = c(1:10, 100, NA), na.rm = TRUE)
exec("mean", !!!args)
exec("mean", !!!args, trim = 0.2)
fs <- list(a = function() "a", b = function() "b")
lapply(fs, exec)
# Compared to do.call(), exec() does not automatically inline
# expressions into the evaluated call.
x <- 10
args <- exprs(x1 = x + 1, x2 = x * 2)
exec(list, !!!args)
do.call(list, args)
# exec() is not designed to generate pretty function calls. This is
# most easily seen if you call a function that captures the call:
f <- disp ~ cyl
exec("lm", f, data = mtcars)
# If you need finer control over the generated call, you'll need to
# construct it yourself. This may require creating a new environment
# with carefully constructed bindings
data_env <- env(data = mtcars)
eval(expr(lm(!!f, data)), data_env)
Defuse an R expression
Description
expr()
defuses an R expression with
injection support.
It is equivalent to base::bquote()
.
Arguments
expr |
An expression to defuse. |
See Also
- Defusing R expressions for an overview.
- enquo() to defuse non-local expressions from function arguments.
- sym() and call2() for building expressions (symbols and calls respectively) programmatically.
- base::eval() and eval_bare() for resuming evaluation of a defused expression.
Examples
# R normally returns the result of an expression
1 + 1
# `expr()` defuses the expression that you have supplied and
# returns it instead of its value
expr(1 + 1)
expr(toupper(letters))
# It supports _injection_ with `!!` and `!!!`. This is a convenient
# way of modifying part of an expression by injecting other
# objects.
var <- "cyl"
expr(with(mtcars, mean(!!sym(var))))
vars <- c("cyl", "am")
expr(with(mtcars, c(!!!syms(vars))))
# Compare to the normal way of building expressions
call("with", call("mean", sym(var)))
call("with", call2("c", !!!syms(vars)))
Process unquote operators in a captured expression
Description
expr_interp()
is deprecated, please use inject()
instead.
Usage
expr_interp(x, env = NULL)
Arguments
x , env |
Turn an expression to a label
Description
expr_text()
turns the expression into a single string, which
might be multi-line. expr_name()
is suitable for formatting
names. It works best with symbols and scalar types, but also
accepts calls. expr_label()
formats the expression nicely for use
in messages.
Usage
expr_label(expr)
expr_name(expr)
expr_text(expr, width = 60L, nlines = Inf)
Arguments
expr |
An expression to labellise. |
width |
Width of each line. |
nlines |
Maximum number of lines to extract. |
Examples
# To labellise a function argument, first capture it with
# substitute():
fn <- function(x) expr_label(substitute(x))
fn(x:y)
# Strings are encoded
expr_label("a\nb")
# Names and expressions are quoted with ``
expr_label(quote(x))
expr_label(quote(a + b + c))
# Long expressions are collapsed
expr_label(quote(foo({
1 + 2
print(x)
})))
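# expr_name() and expr_text() are related helpers (a small sketch):
e <- quote(foo(bar, baz))
expr_name(e)
expr_text(e)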
Print an expression
Description
expr_print()
, powered by expr_deparse()
, is an alternative
printer for R expressions with a few improvements over the base R
printer.
It colourises quosures according to their environment. Quosures from the global environment are printed normally while quosures from local environments are printed in unique colour (or in italic when all colours are taken).
It wraps inlined objects in angular brackets. For instance, an integer vector unquoted in a function call (e.g. expr(foo(!!(1:3)))) is printed like this: foo(<int: 1L, 2L, 3L>), while by default R prints the code to create that vector: foo(1:3), which is ambiguous.
It respects the width boundary (from the global option width) in more cases.
Usage
expr_print(x, ...)
expr_deparse(x, ..., width = peek_option("width"))
Arguments
x |
An object or expression to print. |
... |
Arguments passed to |
width |
The width of the deparsed or printed expression.
Defaults to the global option |
Value
expr_deparse()
returns a character vector of lines.
expr_print()
returns its input invisibly.
Examples
# It supports any object. Non-symbolic objects are always printed
# within angular brackets:
expr_print(1:3)
expr_print(function() NULL)
# Contrast this to how the code to create these objects is printed:
expr_print(quote(1:3))
expr_print(quote(function() NULL))
# The main cause of non-symbolic objects in expressions is
# quasiquotation:
expr_print(expr(foo(!!(1:3))))
# Quosures from the global environment are printed normally:
expr_print(quo(foo))
expr_print(quo(foo(!!quo(bar))))
# Quosures from local environments are colourised according to
# their environments (if you have crayon installed):
local_quo <- local(quo(foo))
expr_print(local_quo)
wrapper_quo <- local(quo(bar(!!local_quo, baz)))
expr_print(wrapper_quo)
Ensure that all elements of a list of expressions are named
Description
This gives default names to unnamed elements of a list of
expressions (or expression wrappers such as formulas or
quosures), deparsed with as_label()
.
Usage
exprs_auto_name(
exprs,
...,
repair_auto = c("minimal", "unique"),
repair_quiet = FALSE
)
quos_auto_name(quos)
Arguments
exprs |
A list of expressions. |
... |
These dots are for future extensions and must be empty. |
repair_auto |
Whether to repair the automatic names. By
default, minimal names are returned. See |
repair_quiet |
Whether to inform user about repaired names. |
quos |
A list of quosures. |
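Examples
A small sketch: unnamed expressions receive default names deparsed with as_label(), while existing names are kept.
exprs_auto_name(exprs(a + b, c))
exprs_auto_name(exprs(foo = a + b, c))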
Get or set formula components
Description
f_rhs extracts the right-hand side, f_lhs extracts the left-hand side, and f_env extracts the environment. All functions throw an error if f is not a formula.
Usage
f_rhs(f)
f_rhs(x) <- value
f_lhs(f)
f_lhs(x) <- value
f_env(f)
f_env(x) <- value
Arguments
f , x |
A formula |
value |
The value to replace with. |
Value
f_rhs
and f_lhs
return language objects (i.e. atomic
vectors of length 1, a name, or a call). f_env
returns an
environment.
Examples
f_rhs(~ 1 + 2 + 3)
f_rhs(~ x)
f_rhs(~ "A")
f_rhs(1 ~ 2)
f_lhs(~ y)
f_lhs(x ~ y)
f_env(~ x)
Turn RHS of formula into a string or label
Description
Equivalent of expr_text()
and expr_label()
for formulas.
Usage
f_text(x, width = 60L, nlines = Inf)
f_name(x)
f_label(x)
Arguments
x |
A formula. |
width |
Width of each line. |
nlines |
Maximum number of lines to extract. |
Examples
f <- ~ a + b + bc
f_text(f)
f_label(f)
# Names are quoted with ``
f_label(~ x)
# Strings are encoded
f_label(~ "a\nb")
# Long expressions are collapsed
f_label(~ foo({
1 + 2
print(x)
}))
Global options for rlang
Description
rlang has several options which may be set globally to control behavior. A brief description of each is given here. If any functions are referenced, refer to their documentation for additional details.
- rlang_interactive: A logical value used by is_interactive(). This can be set to TRUE to test interactive behavior in unit tests, for example.
- rlang_backtrace_on_error: A character string which controls whether backtraces are displayed with error messages, and the level of detail they print. See rlang_backtrace_on_error for the possible option values.
- rlang_trace_format_srcrefs: A logical value used to control whether srcrefs are printed as part of the backtrace.
- rlang_trace_top_env: An environment which will be treated as the top-level environment when printing traces. See trace_back() for examples.
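For example, these options may be set in a unit test or in your .Rprofile (a minimal sketch; the particular values shown are only illustrative):
options(
  rlang_interactive = TRUE,
  rlang_backtrace_on_error = "full"
)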
Internal API for standalone-types-check
Description
Internal API for standalone-types-check
Flatten or squash a list of lists into a simpler vector
Description
These functions are deprecated in favour of purrr::list_c()
and
purrr::list_flatten()
.
flatten()
removes one level hierarchy from a list, while
squash()
removes all levels. These functions are similar to
unlist()
but they are type-stable so you always know what the
type of the output is.
Usage
flatten(x)
flatten_lgl(x)
flatten_int(x)
flatten_dbl(x)
flatten_cpl(x)
flatten_chr(x)
flatten_raw(x)
squash(x)
squash_lgl(x)
squash_int(x)
squash_dbl(x)
squash_cpl(x)
squash_chr(x)
squash_raw(x)
flatten_if(x, predicate = is_spliced)
squash_if(x, predicate = is_spliced)
Arguments
x |
A list to flatten or squash. The contents of the list can
be anything for unsuffixed functions |
predicate |
A function of one argument returning whether it should be spliced. |
Value
flatten()
returns a list, flatten_lgl()
a logical
vector, flatten_int()
an integer vector, flatten_dbl()
a
double vector, and flatten_chr()
a character vector. Similarly
for squash()
and the typed variants (squash_lgl()
etc).
Examples
x <- replicate(2, sample(4), simplify = FALSE)
x
flatten(x)
flatten_int(x)
# With flatten(), only one level gets removed at a time:
deep <- list(1, list(2, list(3)))
flatten(deep)
flatten(flatten(deep))
# But squash() removes all levels:
squash(deep)
squash_dbl(deep)
# The typed flatten functions remove one level and coerce to an atomic
# vector at the same time:
flatten_dbl(list(1, list(2)))
# Only bare lists are flattened, but you can splice S3 lists
# explicitly:
foo <- set_attrs(list("bar"), class = "foo")
str(flatten(list(1, foo, list(100))))
str(flatten(list(1, splice(foo), list(100))))
# Instead of splicing manually, flatten_if() and squash_if() let
# you specify a predicate function:
is_foo <- function(x) inherits(x, "foo") || is_bare_list(x)
str(flatten_if(list(1, foo, list(100)), is_foo))
# squash_if() does the same with deep lists:
deep_foo <- list(1, list(foo, list(foo, 100)))
str(deep_foo)
str(squash(deep_foo))
str(squash_if(deep_foo, is_foo))
Get or set function body
Description
fn_body()
is a simple wrapper around base::body()
. It always
returns a {
expression and throws an error when the input is a
primitive function (whereas body()
returns NULL
). The setter
version preserves attributes, unlike body<-
.
Usage
fn_body(fn = caller_fn())
fn_body(fn) <- value
Arguments
fn |
A function. It is looked up in the calling frame if not supplied. |
value |
New formals or formals names for |
Examples
# fn_body() is like body() but always returns a block:
fn <- function() do()
body(fn)
fn_body(fn)
# It also throws an error when used on a primitive function:
try(fn_body(base::list))
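# The setter preserves attributes, unlike `body<-` (a small sketch;
# the "my_fn" class is only for illustration):
fn2 <- structure(function() "old", class = "my_fn")
fn_body(fn2) <- quote({
  "new"
})
class(fn2)
fn2()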
Return the closure environment of a function
Description
Closure environments define the scope of functions (see env()
).
When a function call is evaluated, R creates an evaluation frame
that inherits from the closure environment. This makes all objects
defined in the closure environment and all its parents available to
code executed within the function.
Usage
fn_env(fn)
fn_env(x) <- value
Arguments
fn , x |
A function. |
value |
A new closure environment for the function. |
Details
fn_env()
returns the closure environment of fn
. There is also
an assignment method to set a new closure environment.
Examples
env <- child_env("base")
fn <- with_env(env, function() NULL)
identical(fn_env(fn), env)
other_env <- child_env("base")
fn_env(fn) <- other_env
identical(fn_env(fn), other_env)
Extract arguments from a function
Description
fn_fmls()
returns a named list of formal arguments.
fn_fmls_names()
returns the names of the arguments.
fn_fmls_syms()
returns formals as a named list of symbols. This
is especially useful for forwarding arguments in constructed calls.
Usage
fn_fmls(fn = caller_fn())
fn_fmls_names(fn = caller_fn())
fn_fmls_syms(fn = caller_fn())
fn_fmls(fn) <- value
fn_fmls_names(fn) <- value
Arguments
fn |
A function. It is looked up in the calling frame if not supplied. |
value |
New formals or formals names for |
Details
Unlike formals()
, these helpers throw an error with primitive
functions instead of returning NULL
.
See Also
call_args()
and call_args_names()
Examples
# Extract from current call:
fn <- function(a = 1, b = 2) fn_fmls()
fn()
# fn_fmls_syms() makes it easy to forward arguments:
call2("apply", !!! fn_fmls_syms(lapply))
# You can also change the formals:
fn_fmls(fn) <- list(A = 10, B = 20)
fn()
fn_fmls_names(fn) <- c("foo", "bar")
fn()
Format bullets for error messages
Description
format_error_bullets()
takes a character vector and returns a single
string (or an empty vector if the input is empty). The elements of
the input vector are assembled as a list of bullets, depending on
their names:
Unnamed elements are unindented. They act as titles or subtitles.
Elements named "*" are bulleted with a cyan "bullet" symbol.
Elements named "i" are bulleted with a blue "info" symbol.
Elements named "x" are bulleted with a red "cross" symbol.
Elements named "v" are bulleted with a green "tick" symbol.
Elements named "!" are bulleted with a yellow "warning" symbol.
Elements named ">" are bulleted with an "arrow" symbol.
Elements named " " start with an indented line break.
For convenience, if the vector is fully unnamed, the elements are formatted as "*" bullets.
The bullet formatting for errors follows the idea that sentences in
error messages are best kept short and simple. The best way to
present the information is in the cnd_body()
method of an error
condition as a bullet list of simple sentences containing a single
clause. The info and cross symbols of the bullets provide hints on
how to interpret the bullet relative to the general error issue,
which should be supplied as cnd_header()
.
Usage
format_error_bullets(x)
Arguments
x |
A named character vector of messages. Named elements are
prefixed with the corresponding bullet. Elements named with a
single space |
Examples
# All bullets
writeLines(format_error_bullets(c("foo", "bar")))
# This is equivalent to
writeLines(format_error_bullets(set_names(c("foo", "bar"), "*")))
# Supply named elements to format info, cross, and tick bullets
writeLines(format_error_bullets(c(i = "foo", x = "bar", v = "baz", "*" = "quux")))
# An unnamed element breaks the line
writeLines(format_error_bullets(c(i = "foo\nbar")))
# A " " element breaks the line within a bullet (with indentation)
writeLines(format_error_bullets(c(i = "foo", " " = "bar")))
Validate and format a function call for use in error messages
Description
- error_call() takes either a frame environment or a call. If the input is an environment, error_call() acts like frame_call() with some additional logic, e.g. for S3 methods and for frames with a local_error_call().
- format_error_call() simplifies its input to a simple call (see section below) and formats the result as code (using cli if available). Use this function to generate the "in" part of an error message from a stack frame call. format_error_call() first passes its input to error_call() to fetch calls from frame environments.
Usage
format_error_call(call)
error_call(call)
Arguments
call |
The execution environment of a currently
running function, e.g. |
Value
Either a string formatted as code or NULL
if a simple
call could not be generated.
Details of formatting
The arguments of function calls are stripped.
Complex function calls containing inlined objects return NULL.
Calls to if preserve the condition since it might be informative. Branches are dropped.
Calls to operators and other special syntax are formatted using their names rather than the potentially confusing function form.
Examples
# Arguments are stripped
writeLines(format_error_call(quote(foo(bar, baz))))
# Returns `NULL` with complex calls such as those that contain
# inlined functions
format_error_call(call2(list))
# Operators are formatted using their names rather than in
# function call form
writeLines(format_error_call(quote(1 + 2)))
Format a type for error messages
Description
friendly_type()
is deprecated. Please use the
standalone-friendly-type.R
file instead.
Usage
friendly_type(type)
Arguments
type |
A type as returned by |
Value
A string of the prettified type, qualified with an indefinite article.
Get or set the environment of an object
Description
These functions dispatch internally with methods for functions,
formulas and frames. If called with a missing argument, the
environment of the current evaluation frame is returned. If you
call get_env()
with an environment, it acts as the identity
function and the environment is simply returned (this helps
simplify code when writing generic functions for environments).
Usage
get_env(env, default = NULL)
set_env(env, new_env = caller_env())
env_poke_parent(env, new_env)
Arguments
env |
An environment. |
default |
The default environment in case |
new_env |
An environment to replace |
Details
While set_env()
returns a modified copy and does not have side
effects, env_poke_parent()
changes the environment by
side effect. This is because environments are
uncopyable. Be careful not to change environments
that you don't own, e.g. a parent environment of a function from a
package.
See Also
quo_get_env()
and quo_set_env()
for versions of
get_env()
and set_env()
that only work on quosures.
Examples
# Environment of closure functions:
fn <- function() "foo"
get_env(fn)
# Or of quosures or formulas:
get_env(~foo)
get_env(quo(foo))
# Provide a default in case the object doesn't bundle an environment.
# Let's create an unevaluated formula:
f <- quote(~foo)
# The following line would fail if run because unevaluated formulas
# don't bundle an environment (they didn't have the chance to
# record one yet):
# get_env(f)
# It is often useful to provide a default when you're writing
# functions accepting formulas as input:
default <- env()
identical(get_env(f, default), default)
# set_env() can be used to set the enclosure of functions and
# formulas. Let's create a function with a particular environment:
env <- child_env("base")
fn <- set_env(function() NULL, env)
# That function now has `env` as enclosure:
identical(get_env(fn), env)
identical(get_env(fn), current_env())
# set_env() does not work by side effect. Setting a new environment
# for fn has no effect on the original function:
other_env <- child_env(NULL)
set_env(fn, other_env)
identical(get_env(fn), other_env)
# Since set_env() returns a new function with a different
# environment, you'll need to reassign the result:
fn <- set_env(fn, other_env)
identical(get_env(fn), other_env)
Entrace unexpected errors
Description
global_entrace()
enriches base errors, warnings, and messages
with rlang features.
They are assigned a backtrace. You can configure whether to display a backtrace on error with the rlang_backtrace_on_error global option.
They are recorded in last_error(), last_warnings(), or last_messages(). You can inspect backtraces at any time by calling these functions.
Set global entracing in your RProfile with:
rlang::global_entrace()
Usage
global_entrace(enable = TRUE, class = c("error", "warning", "message"))
Arguments
enable |
Whether to enable or disable global handling. |
class |
A character vector of one or several classes of conditions to be entraced. |
Inside RMarkdown documents
Call global_entrace()
inside an RMarkdown document to cause
errors and warnings to be promoted to rlang conditions that include
a backtrace. This needs to be done in a separate setup chunk before
the first error or warning.
This is useful in conjunction with
rlang_backtrace_on_error_report
and
rlang_backtrace_on_warning_report
. To get full entracing in an
Rmd document, include this in a setup chunk before the first error
or warning is signalled.
```{r setup}
rlang::global_entrace()
options(rlang_backtrace_on_warning_report = "full")
options(rlang_backtrace_on_error_report = "full")
```
Under the hood
On R 4.0 and newer, global_entrace()
installs a global handler
with globalCallingHandlers()
. On older R versions, entrace()
is
set as an option(error = )
handler. The latter method has the
disadvantage that only one handler can be set at a time. This means
that you need to manually switch between entrace()
and other
handlers like recover()
. Also this causes a conflict with IDE
handlers (e.g. in RStudio).
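Examples
A sketch of entracing only a subset of conditions; not run here because it installs global handlers.
# rlang::global_entrace(class = "warning")
# last_warnings() can then be used to inspect warning backtraces.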
Register default global handlers
Description
global_handle()
sets up a default configuration for error,
warning, and message handling. It calls:
- global_entrace() to enable rlang errors and warnings globally.
- global_prompt_install() to recover from packageNotFoundErrors with a user prompt to install the missing package. Note that at the time of writing (R 4.1), there are only very limited situations where this handler works.
Usage
global_handle(entrace = TRUE, prompt_install = TRUE)
Arguments
entrace |
Passed as |
prompt_install |
Passed as |
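Examples
A sketch of a typical .Rprofile setup; not run here because it installs global handlers.
# rlang::global_handle()
# Keep entracing but disable the install prompt:
# rlang::global_handle(prompt_install = FALSE)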
Prompt user to install missing packages
Description
When enabled, packageNotFoundError
thrown by loadNamespace()
cause a user prompt to install the missing package and continue
without interrupting the current program.
This is similar to how check_installed()
prompts users to install
required packages. It uses the same install strategy, using pak if
available and install.packages()
otherwise.
Usage
global_prompt_install(enable = TRUE)
Arguments
enable |
Whether to enable or disable global handling. |
Name injection with "{"
and "{{"
Description
Dynamic dots (and data-masked dots which are dynamic by default) have built-in support for names interpolation with the glue package.
tibble::tibble(foo = 1)
#> # A tibble: 1 x 1
#>     foo
#>   <dbl>
#> 1     1

foo <- "name"
tibble::tibble("{foo}" := 1)
#> # A tibble: 1 x 1
#>    name
#>   <dbl>
#> 1     1
Inside functions, embracing an argument with {{
inserts the expression supplied as argument in the string. This gives an indication on the variable or computation supplied as argument:
tib <- function(x) {
  tibble::tibble("var: {{ x }}" := x)
}
tib(1 + 1)
#> # A tibble: 1 x 1
#>   `var: 1 + 1`
#>          <dbl>
#> 1            2
See also englue()
to string-embrace outside of dynamic dots.
g <- function(x) {
  englue("var: {{ x }}")
}
g(1 + 1)
#> [1] "var: 1 + 1"
Technically, "{{"
defuses a function argument, calls as_label()
on the expression supplied as argument, and inserts the result in the string.
"{"
and "{{"
While glue::glue()
only supports "{"
, dynamic dots support both "{"
and "{{"
. The double brace variant is similar to the embrace operator {{
available in data-masked arguments.
In the following example, the embrace operator is used in a glue string to name the result with a default name that represents the expression supplied as argument:
my_mean <- function(data, var) {
  data %>% dplyr::summarise("{{ var }}" := mean({{ var }}))
}

mtcars %>% my_mean(cyl)
#> # A tibble: 1 x 1
#>     cyl
#>   <dbl>
#> 1  6.19

mtcars %>% my_mean(cyl * am)
#> # A tibble: 1 x 1
#>   `cyl * am`
#>        <dbl>
#> 1       2.06
"{{"
is only meant for inserting an expression supplied as argument to a function. The result of the expression is not inspected or used. To interpolate a string stored in a variable, use the regular glue operator "{"
instead:
my_mean <- function(data, var, name = "mean") {
  data %>% dplyr::summarise("{name}" := mean({{ var }}))
}

mtcars %>% my_mean(cyl)
#> # A tibble: 1 x 1
#>    mean
#>   <dbl>
#> 1  6.19

mtcars %>% my_mean(cyl, name = "cyl")
#> # A tibble: 1 x 1
#>     cyl
#>   <dbl>
#> 1  6.19
Using the wrong operator causes unexpected results:
x <- "name" list2("{{ x }}" := 1) #> $`"name"` #> [1] 1 list2("{x}" := 1) #> $name #> [1] 1
Ideally, using {{
on regular objects would be an error. However for technical reasons it is not possible to make a distinction between function arguments and ordinary variables. See Does {{ work on regular objects? for more information about this limitation.
Allow overriding default names
The implementation of my_mean()
in the previous section forces a default name onto the result. But what if the caller wants to give it a different name? In functions that take dots, it is possible to just supply a named expression to override the default. In a function like my_mean()
that takes a named argument we need a different approach.
This is where englue()
becomes useful. We can pull out the default name creation in another user-facing argument like this:
my_mean <- function(data, var, name = englue("{{ var }}")) {
  data %>% dplyr::summarise("{name}" := mean({{ var }}))
}
Now the user may supply their own name if needed:
mtcars %>% my_mean(cyl * am)
#> # A tibble: 1 x 1
#>   `cyl * am`
#>        <dbl>
#> 1       2.06

mtcars %>% my_mean(cyl * am, name = "mean_cyl_am")
#> # A tibble: 1 x 1
#>   mean_cyl_am
#>         <dbl>
#> 1        2.06
What's the deal with :=
?
Name injection in dynamic dots was originally implemented with :=
instead of =
to allow complex expressions on the LHS:
x <- "name" list2(!!x := 1) #> $name #> [1] 1
Name-injection with glue operations was an extension of this existing feature and so inherited the same interface. However, there is no technical barrier to using glue strings on the LHS of =
.
As we are now moving away from !!
for common tasks, we are considering enabling glue strings with =
and superseding :=
usage. Track the progress of this change in issue 1296.
Using glue syntax in packages
Since rlang does not depend directly on glue, you will have to ensure that glue is installed by adding it to your Imports:
section.
usethis::use_package("glue", "Imports")
How long is an object?
Description
This is a function for the common task of testing the length of an
object. It checks the length of an object in a non-generic way:
base::length()
methods are ignored.
Usage
has_length(x, n = NULL)
Arguments
x |
An R object. |
n |
A specific length to test |
Examples
has_length(list())
has_length(list(), 0)
has_length(letters)
has_length(letters, 20)
has_length(letters, 26)
Does an object have an element with this name?
Description
This function returns a logical value that indicates if a data
frame or another named object contains an element with a specific
name. Note that has_name()
only works with vectors. For instance,
environments need the specialised function env_has()
.
Usage
has_name(x, name)
Arguments
x |
A data frame or another named object |
name |
Element name(s) to check |
Details
Unnamed objects are treated as if all names are empty strings. NA
input gives FALSE
as output.
Value
A logical vector of the same length as name
Examples
has_name(iris, "Species")
has_name(mtcars, "gears")
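# `name` is vectorised, so you get one result per name (a small sketch):
has_name(mtcars, c("mpg", "gears", "cyl"))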
Hashing
Description
- hash() hashes an arbitrary R object.
- hash_file() hashes the data contained in a file.
The generated hash is guaranteed to be reproducible across platforms that have the same endianness and are using the same R version.
Usage
hash(x)
hash_file(path)
Arguments
x |
An object. |
path |
A character vector of paths to the files to be hashed. |
Details
These hashers use the XXH128 hash algorithm of the xxHash library, which generates a 128-bit hash. Both are implemented as streaming hashes, which generate the hash with minimal extra memory usage.
For hash()
, objects are converted to binary using R's native serialization
tools. On R >= 3.5.0, serialization version 3 is used, otherwise version 2 is
used. See serialize()
for more information about the serialization version.
Value
For hash(), a single character string containing the hash.
For hash_file(), a character vector containing one hash per file.
Examples
hash(c(1, 2, 3))
hash(mtcars)
authors <- file.path(R.home("doc"), "AUTHORS")
copying <- file.path(R.home("doc"), "COPYING")
hashes <- hash_file(c(authors, copying))
hashes
# If you need a single hash for multiple files,
# hash the result of `hash_file()`
hash(hashes)
Does an object inherit from a set of classes?
Description
- inherits_any() is like base::inherits() but is more explicit about its behaviour with multiple classes. If classes contains several elements and the object inherits from at least one of them, inherits_any() returns TRUE.
- inherits_all() tests that an object inherits from all of the classes in the supplied order. This is usually the best way to test for inheritance of multiple classes.
- inherits_only() tests that the class vectors are identical. It is a shortcut for identical(class(x), class).
Usage
inherits_any(x, class)
inherits_all(x, class)
inherits_only(x, class)
Arguments
x |
An object to test for inheritance. |
class |
A character vector of classes. |
Examples
obj <- structure(list(), class = c("foo", "bar", "baz"))
# With the _any variant only one class must match:
inherits_any(obj, c("foobar", "bazbaz"))
inherits_any(obj, c("foo", "bazbaz"))
# With the _all variant all classes must match:
inherits_all(obj, c("foo", "bazbaz"))
inherits_all(obj, c("foo", "baz"))
# The order of classes must match as well:
inherits_all(obj, c("baz", "foo"))
# inherits_only() checks that the class vectors are identical:
inherits_only(obj, c("foo", "baz"))
inherits_only(obj, c("foo", "bar", "baz"))
Inject objects in an R expression
Description
inject()
evaluates an expression with injection
support. There are three main usages:
- Splicing lists of arguments in a function call.
- Inline objects or other expressions in an expression with !! and !!!. For instance to create functions or formulas programmatically.
- Pass arguments to NSE functions that defuse their arguments without injection support (see for instance enquo0()). You can use {{ arg }} with functions documented to support quosures. Otherwise, use !!enexpr(arg).
Usage
inject(expr, env = caller_env())
Arguments
expr |
An argument to evaluate. This argument is immediately
evaluated in |
env |
The environment in which to evaluate |
Examples
# inject() simply evaluates its argument with injection
# support. These expressions are equivalent:
2 * 3
inject(2 * 3)
inject(!!2 * !!3)
# Injection with `!!` can be useful to insert objects or
# expressions within other expressions, like formulas:
lhs <- sym("foo")
rhs <- sym("bar")
inject(!!lhs ~ !!rhs + 10)
# Injection with `!!!` splices lists of arguments in function
# calls:
args <- list(na.rm = TRUE, finite = 0.2)
inject(mean(1:10, !!!args))
Injection operator !!
Description
The injection operator !!
injects a value or
expression inside another expression. In other words, it modifies a
piece of code before R evaluates it.
There are two main cases for injection. You can inject constant values to work around issues of scoping ambiguity, and you can inject defused expressions like symbolised column names.
Where does !!
work?
!!
does not work everywhere, you can only use it within certain
special functions:
Functions taking defused and data-masked arguments. Technically, this means function arguments defused with {{ or en-prefixed operators like enquo(), enexpr(), etc.
Inside inject().
All data-masking verbs in the tidyverse support injection operators
out of the box. With base functions, you need to use inject()
to
enable !!
. Using !!
out of context may lead to incorrect
results, see What happens if I use injection operators out of context?.
The examples below are built around the base function with()
.
Since it's not a tidyverse function we will use inject()
to enable
!!
usage.
Injecting values
Data-masking functions like with()
are handy because you can
refer to column names in your computations. This comes at the price
of data mask ambiguity: if you have defined an env-variable of the
same name as a data-variable, you get a name collision. This
collision is always resolved by giving precedence to the
data-variable (it masks the env-variable):
cyl <- c(100, 110)
with(mtcars, mean(cyl))
#> [1] 6.1875
The injection operator offers one way of solving this. Use it to inject the env-variable inside the data-masked expression:
inject(
  with(mtcars, mean(!!cyl))
)
#> [1] 105
Note that the .env
pronoun is a simpler way of solving the
ambiguity. See The data mask ambiguity for more about
this.
Injecting expressions
Injection is also useful for modifying parts of a defused expression. In the following example we use the symbolise-and-inject pattern to inject a column name inside a data-masked expression.
var <- sym("cyl")
inject(
  with(mtcars, mean(!!var))
)
#> [1] 6.1875
Since with()
is a base function, you can't inject
quosures, only naked symbols and calls. This
isn't a problem here because we're injecting the name of a data
frame column. If the environment is important, try injecting a
pre-computed value instead.
When do I need !!
?
With tidyverse APIs, injecting expressions with !!
is no longer a
common pattern. First, the .env
pronoun solves the
ambiguity problem in a more intuitive way:
cyl <- 100
mtcars %>% dplyr::mutate(cyl = cyl * .env$cyl)
Second, the embrace operator {{
makes the
defuse-and-inject pattern easier to
learn and use.
my_mean <- function(data, var) {
  data %>% dplyr::summarise(mean({{ var }}))
}

# Equivalent to
my_mean <- function(data, var) {
  data %>% dplyr::summarise(mean(!!enquo(var)))
}
!!
is a good tool to learn for advanced applications but our
hope is that it isn't needed for common data analysis cases.
See Also
Simulate interrupt condition
Description
interrupt()
simulates a user interrupt of the kind that is
signalled with Ctrl-C
. It is currently not possible to create
custom interrupt condition objects.
Usage
interrupt()
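Examples
A sketch; not run, since it simulates pressing Ctrl-C. Interrupt conditions can be caught like other conditions.
# tryCatch(
#   interrupt(),
#   interrupt = function(cnd) "interrupt caught"
# )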
Invoke a function with a list of arguments
Description
Deprecated in rlang 0.4.0 in favour of
exec()
.
Usage
invoke(.fn, .args = list(), ..., .env = caller_env(), .bury = c(".fn", ""))
Arguments
.fn , args , ... , .env , .bury |
Is object a call?
Description
This function tests if x
is a call. This is a
pattern-matching predicate that returns FALSE
if name
and n
are supplied and the call does not match these properties.
Usage
is_call(x, name = NULL, n = NULL, ns = NULL)
Arguments
x |
An object to test. Formulas and quosures are treated literally. |
name |
An optional name that the call should match. It is
passed to |
n |
An optional number of arguments that the call should match. |
ns |
The namespace of the call. If Can be a character vector of namespaces, in which case the call
has to match at least one of them, otherwise |
See Also
Examples
is_call(quote(foo(bar)))
# You can pattern-match the call with additional arguments:
is_call(quote(foo(bar)), "foo")
is_call(quote(foo(bar)), "bar")
is_call(quote(foo(bar)), quote(foo))
# Match the number of arguments with is_call():
is_call(quote(foo(bar)), "foo", 1)
is_call(quote(foo(bar)), "foo", 2)
# By default, namespaced calls are tested unqualified:
ns_expr <- quote(base::list())
is_call(ns_expr, "list")
# You can also specify whether the call shouldn't be namespaced by
# supplying an empty string:
is_call(ns_expr, "list", ns = "")
# Or if it should have a namespace:
is_call(ns_expr, "list", ns = "utils")
is_call(ns_expr, "list", ns = "base")
# You can supply multiple namespaces:
is_call(ns_expr, "list", ns = c("utils", "base"))
is_call(ns_expr, "list", ns = c("utils", "stats"))
# If one of them is "", unnamespaced calls will match as well:
is_call(quote(list()), "list", ns = "base")
is_call(quote(list()), "list", ns = c("base", ""))
is_call(quote(base::list()), "list", ns = c("base", ""))
# The name argument is vectorised so you can supply a list of names
# to match with:
is_call(quote(foo(bar)), c("bar", "baz"))
is_call(quote(foo(bar)), c("bar", "foo"))
is_call(quote(base::list), c("::", ":::", "$", "@"))
Is an object callable?
Description
A callable object is an object that can appear in the function position of a call (as opposed to argument position). This includes symbolic objects that evaluate to a function or literal functions embedded in the call.
Usage
is_callable(x)
Arguments
x |
An object to test. |
Details
Note that strings may look like callable objects because
expressions of the form "list"()
are valid R code. However,
that's only because the R parser transforms strings to symbols. It
is not legal to manually set language heads to strings.
Examples
# Symbolic objects and functions are callable:
is_callable(quote(foo))
is_callable(base::identity)
# node_poke_car() lets you modify calls without any checking:
lang <- quote(foo(10))
node_poke_car(lang, current_env())
# Use is_callable() to check an input object is safe to put as CAR:
obj <- base::identity
if (is_callable(obj)) {
lang <- node_poke_car(lang, obj)
} else {
abort("`obj` must be callable")
}
eval_bare(lang)
Is object a condition?
Description
Is object a condition?
Usage
is_condition(x)
is_error(x)
is_warning(x)
is_message(x)
Arguments
x |
An object to test. |
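Examples
A small sketch using catch_cnd() to capture a condition of each type.
is_condition(catch_cnd(stop("boom")))
is_error(catch_cnd(stop("boom")))
is_warning(catch_cnd(warning("careful")))
is_message(catch_cnd(message("hello")))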
Is an object copyable?
Description
When an object is modified, R generally copies it (sometimes
lazily) to enforce value semantics.
However, some internal types are uncopyable. If you try to copy
them, either with <-
or by argument passing, you actually create
references to the original object rather than actual
copies. Modifying these references can thus have far reaching side
effects.
Usage
is_copyable(x)
Arguments
x |
An object to test. |
Examples
# Let's add attributes with structure() to uncopyable types. Since
# they are not copied, the attributes are changed in place:
env <- env()
structure(env, foo = "bar")
env
# These objects that can only be changed with side effect are not
# copyable:
is_copyable(env)
structure(base::list, foo = "bar")
str(base::list)
Is a vector uniquely named?
Description
Like is_named()
but also checks that names are unique.
Usage
is_dictionaryish(x)
Arguments
x |
A vector. |
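Examples
A small sketch.
is_dictionaryish(c(a = 1, b = 2))
# Duplicated names fail the check:
is_dictionaryish(c(a = 1, a = 2))
# As do missing names:
is_dictionaryish(c(a = 1, 2))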
Is object an empty vector or NULL?
Description
Is object an empty vector or NULL?
Usage
is_empty(x)
Arguments
x |
object to test |
Examples
is_empty(NULL)
is_empty(list())
is_empty(list(NULL))
Is object an environment?
Description
is_bare_environment()
tests whether x
is an environment without an S3 or S4 class.
Usage
is_environment(x)
is_bare_environment(x)
Arguments
x |
object to test |
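Examples
A small sketch.
is_environment(globalenv())
is_environment(list())
# A classed environment is an environment, but not a bare one:
e <- structure(new.env(), class = "my_env")
is_environment(e)
is_bare_environment(e)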
Is an object an expression?
Description
In rlang, an expression is the return type of parse_expr()
, the
set of objects that can be obtained from parsing R code. Under this
definition expressions include numbers, strings, NULL
, symbols,
and function calls. These objects can be classified as:
Symbolic objects, i.e. symbols and function calls (for which is_symbolic() returns TRUE)
Syntactic literals, i.e. scalar atomic objects and NULL (testable with is_syntactic_literal())
is_expression()
returns TRUE
if the input is either a symbolic
object or a syntactic literal. If a call, the elements of the call
must all be expressions as well. Unparsable calls are not
considered expressions in this narrow definition.
Note that in base R, there exist
vectors, a data
type similar to a list that supports special attributes created by
the parser called source references. This data type is not
supported in rlang.
Usage
is_expression(x)
is_syntactic_literal(x)
is_symbolic(x)
Arguments
x |
An object to test. |
Details
is_symbolic()
returns TRUE
for symbols and calls (objects with
type language
). Symbolic objects are replaced by their value
during evaluation. Literals are the complement of symbolic
objects. They are their own value and return themselves during
evaluation.
is_syntactic_literal()
is a predicate that returns TRUE
for the
subset of literals that are created by R when parsing text (see
parse_expr()
): numbers, strings and NULL
. Along with symbols,
these literals are the terminating nodes in an AST.
Note that in the most general sense, a literal is any R object that
evaluates to itself and that can be evaluated in the empty
environment. For instance, quote(c(1, 2))
is not a literal, it is
a call. However, the result of evaluating it in base_env()
is a
literal (in this case an atomic vector).
As the data structure for function arguments, pairlists are also a
kind of language objects. However, since they are mostly an
internal data structure and can't be returned as is by the parser,
is_expression()
returns FALSE
for pairlists.
See Also
is_call()
for a call predicate.
Examples
q1 <- quote(1)
is_expression(q1)
is_syntactic_literal(q1)
q2 <- quote(x)
is_expression(q2)
is_symbol(q2)
q3 <- quote(x + 1)
is_expression(q3)
is_call(q3)
# Atomic expressions are the terminating nodes of a call tree:
# NULL or a scalar atomic vector:
is_syntactic_literal("string")
is_syntactic_literal(NULL)
is_syntactic_literal(letters)
is_syntactic_literal(quote(call()))
# Parsable literals have the property of being self-quoting:
identical("foo", quote("foo"))
identical(1L, quote(1L))
identical(NULL, quote(NULL))
# Like any literals, they can be evaluated within the empty
# environment:
eval_bare(quote(1L), empty_env())
# Whereas it would fail for symbolic expressions:
# eval_bare(quote(c(1L, 2L)), empty_env())
# Pairlists are also language objects representing argument lists.
# You will usually encounter them with extracted formals:
fmls <- formals(is_expression)
typeof(fmls)
# Since they are mostly an internal data structure, is_expression()
# returns FALSE for pairlists, so you will have to check explicitly
# for them:
is_expression(fmls)
is_pairlist(fmls)
Is object a formula?
Description
is_formula() tests whether x is a call to ~. is_bare_formula() tests in addition that x does not inherit from anything other than "formula".
Note: When we first implemented is_formula()
, we thought it
best to treat unevaluated formulas as formulas by default (see
section below). Now we think this default introduces too many edge
cases in normal code. We recommend always supplying scoped = TRUE
. Unevaluated formulas can be handled via an is_call(x, "~")
branch.
Usage
is_formula(x, scoped = NULL, lhs = NULL)
is_bare_formula(x, scoped = TRUE, lhs = NULL)
Arguments
x |
An object to test. |
scoped |
A boolean indicating whether the quosure is scoped,
that is, has a valid environment attribute and inherits from
|
lhs |
A boolean indicating whether the formula has a left-hand
side. If |
Dealing with unevaluated formulas
At parse time, a formula is a simple call to ~
and it does not
have a class or an environment. Once evaluated, the ~
call
becomes a properly structured formula. Unevaluated formulas arise
by quotation, e.g. ~~foo
, quote(~foo)
, or substitute(arg)
with arg
being supplied a formula. Use the scoped
argument to
check whether the formula carries an environment.
Examples
is_formula(~10)
is_formula(10)
# If you don't supply `lhs`, both one-sided and two-sided formulas
# will return `TRUE`
is_formula(disp ~ am)
is_formula(~am)
# You can also specify whether you expect a LHS:
is_formula(disp ~ am, lhs = TRUE)
is_formula(disp ~ am, lhs = FALSE)
is_formula(~am, lhs = TRUE)
is_formula(~am, lhs = FALSE)
# Handling of unevaluated formulas is a bit tricky. These formulas
# are special because they don't inherit from `"formula"` and they
# don't carry an environment (they are not scoped):
f <- quote(~foo)
f_env(f)
# By default unevaluated formulas are treated as formulas
is_formula(f)
# Supply `scoped = TRUE` to ensure you have an evaluated formula
is_formula(f, scoped = TRUE)
# By default unevaluated formulas not treated as bare formulas
is_bare_formula(f)
# If you supply `scoped = TRUE`, they will be considered bare
# formulas even though they don't inherit from `"formula"`
is_bare_formula(f, scoped = TRUE)
Is object a function?
Description
The R language defines two different types of functions: primitive functions, which are low-level, and closures, which are the regular kind of functions.
Usage
is_function(x)
is_closure(x)
is_primitive(x)
is_primitive_eager(x)
is_primitive_lazy(x)
Arguments
x |
Object to be tested. |
Details
Closures are functions written in R, named after the way their arguments are scoped within nested environments (see https://en.wikipedia.org/wiki/Closure_(computer_programming)). The root environment of the closure is called the closure environment. When closures are evaluated, a new environment called the evaluation frame is created with the closure environment as parent. This is where the body of the closure is evaluated. These closure frames appear on the evaluation stack, as opposed to primitive functions which do not necessarily have their own evaluation frame and never appear on the stack.
Primitive functions are more efficient than closures for two
reasons. First, they are written entirely in fast low-level
code. Second, the mechanism by which they are passed arguments is
more efficient because they often do not need the full procedure of
argument matching (dealing with positional versus named arguments,
partial matching, etc). One practical consequence of the special
way in which primitives are passed arguments is that they
technically do not have formal arguments, and formals()
will
return NULL
if called on a primitive function. Finally, primitive
functions can either take arguments lazily, like R closures do,
or evaluate them eagerly before being passed on to the C code.
The former kind of primitives are called "special" in R terminology,
while the latter is referred to as "builtin". is_primitive_eager()
and is_primitive_lazy()
allow you to check whether a primitive
function evaluates arguments eagerly or lazily.
You will also encounter the distinction between primitive and
internal functions in technical documentation. Like primitive
functions, internal functions are defined at a low level and
written in C. However, internal functions have no representation in
the R language. Instead, they are called via a call to
base::.Internal()
within a regular closure. This ensures that
they appear as normal R function objects: they obey all the usual
rules of argument passing, and they appear on the evaluation stack
as any other closures. As a result, fn_fmls()
does not need to
look in the .ArgsEnv
environment to obtain a representation of
their arguments, and there is no way of querying from R whether
they are lazy ('special' in R terminology) or eager ('builtin').
You can call primitive functions with .Primitive()
and internal
functions with .Internal()
. However, calling internal functions
in a package is forbidden by CRAN's policy because they are
considered part of the private API. They often assume that they
have been called with correctly formed arguments, and may cause R
to crash if you call them with unexpected objects.
Examples
# Primitive functions are not closures:
is_closure(base::c)
is_primitive(base::c)
# On the other hand, internal functions are wrapped in a closure
# and appear as such from the R side:
is_closure(base::eval)
# Both closures and primitives are functions:
is_function(base::c)
is_function(base::eval)
# Many primitive functions evaluate arguments eagerly:
is_primitive_eager(base::c)
is_primitive_eager(base::list)
is_primitive_eager(base::`+`)
# However, primitives that operate on expressions, like quote() or
# substitute(), are lazy:
is_primitive_lazy(base::quote)
is_primitive_lazy(base::substitute)
Are packages installed in any of the libraries?
Description
These functions check that packages are installed with minimal side effects. If installed, the packages will be loaded but not attached.
- is_installed() doesn't interact with the user. It simply returns TRUE or FALSE depending on whether the packages are installed.
- In interactive sessions, check_installed() asks the user whether to install missing packages. If the user accepts, the packages are installed with pak::pkg_install() if available, or utils::install.packages() otherwise. If the session is non-interactive or if the user chooses not to install the packages, the current evaluation is aborted.
You can disable the prompt by setting the
rlib_restart_package_not_found
global option to FALSE
. In that
case, missing packages always cause an error.
Usage
is_installed(pkg, ..., version = NULL, compare = NULL)
check_installed(
pkg,
reason = NULL,
...,
version = NULL,
compare = NULL,
action = NULL,
call = caller_env()
)
Arguments
pkg |
The package names. Can include version requirements,
e.g. |
... |
These dots must be empty. |
version |
Minimum versions for |
compare |
A character vector of comparison operators to use
for |
reason |
Optional string indicating why is |
action |
An optional function taking |
call |
The execution environment of a currently
running function, e.g. |
Value
is_installed()
returns TRUE
if all package names
provided in pkg
are installed, FALSE
otherwise. check_installed()
either doesn't return or returns
NULL
.
Handling package not found errors
check_installed()
signals error conditions of class
rlib_error_package_not_found
. The error includes pkg
and
version
fields. They are vectorised and may include several
packages.
The error is signalled with a rlib_restart_package_not_found
restart on the stack to allow handlers to install the required
packages. To do so, add a calling handler
for rlib_error_package_not_found
, install the required packages,
and invoke the restart without arguments. This restarts the check
from scratch.
The condition is not signalled in non-interactive sessions, in the
restarting case, or if the rlib_restart_package_not_found
user
option is set to FALSE
.
Examples
is_installed("utils")
is_installed(c("base", "ggplot5"))
is_installed(c("base", "ggplot5"), version = c(NA, "5.1.0"))
Is a vector integer-like?
Description
These predicates check whether R considers a number vector to be
integer-like, according to its own tolerance check (which is in
fact delegated to the C library). This function is not adapted to
data analysis, see the help for base::is.integer()
for examples
of how to check for whole numbers.
Things to consider when checking for integer-like doubles:
This check can be expensive because the whole double vector has to be traversed and checked.
Large double values may be integerish but may still not be coercible to integer. This is because integers in R only support values up to
2^31 - 1
while numbers stored as double can be much larger.
Usage
is_integerish(x, n = NULL, finite = NULL)
is_bare_integerish(x, n = NULL, finite = NULL)
is_scalar_integerish(x, finite = NULL)
Arguments
x |
Object to be tested. |
n |
Expected length of a vector. |
finite |
Whether all values of the vector are finite. The
non-finite values are |
See Also
is_bare_numeric()
for testing whether an object is a
base numeric type (a bare double or integer vector).
Examples
is_integerish(10L)
is_integerish(10.0)
is_integerish(10.0, n = 2)
is_integerish(10.000001)
is_integerish(TRUE)
Is R running interactively?
Description
Like base::interactive()
, is_interactive()
returns TRUE
when
the function runs interactively and FALSE
when it runs in batch
mode. It also checks, in this order:
The rlang_interactive global option. If set to a single TRUE or FALSE, is_interactive() returns that value immediately. This escape hatch is useful in unit tests or to manually turn on interactive features in RMarkdown outputs.
Whether knitr or testthat is in progress, in which case is_interactive() returns FALSE.
with_interactive()
and local_interactive()
set the global
option conveniently.
Usage
is_interactive()
local_interactive(value = TRUE, frame = caller_env())
with_interactive(expr, value = TRUE)
Arguments
value |
A single |
frame |
The environment of a running function which defines the scope of the temporary options. When the function returns, the options are reset to their original values. |
expr |
An expression to evaluate with interactivity set to
|
Is object a call?
Description
These functions are deprecated, please use
is_call()
and its n
argument instead.
Usage
is_lang(x, name = NULL, n = NULL, ns = NULL)
Arguments
x |
An object to test. Formulas and quosures are treated literally. |
name |
An optional name that the call should match. It is
passed to |
n |
An optional number of arguments that the call should match. |
ns |
The namespace of the call. If Can be a character vector of namespaces, in which case the call
has to match at least one of them, otherwise |
Is object named?
Description
- is_named() is a scalar predicate that checks that x has a names attribute and that none of the names are missing or empty (NA or "").
- is_named2() is like is_named() but always returns TRUE for empty vectors, even those that don't have a names attribute. In other words, it tests for the property that each element of a vector is named. is_named2() composes well with names2() whereas is_named() composes with names().
- have_name() is a vectorised variant.
Usage
is_named(x)
is_named2(x)
have_name(x)
Arguments
x |
A vector to test. |
Details
is_named() requires a names attribute, so it returns FALSE for empty vectors that don't have one. Use is_named2() if empty vectors should count as named.
Value
is_named()
and is_named2()
are scalar predicates that
return TRUE
or FALSE
. have_name()
is vectorised and returns
a logical vector as long as the input.
Examples
# is_named() is a scalar predicate about the whole vector of names:
is_named(c(a = 1, b = 2))
is_named(c(a = 1, 2))
# Unlike is_named2(), is_named() returns `FALSE` for empty vectors
# that don't have a `names` attribute.
is_named(list())
is_named2(list())
# have_name() is a vectorised predicate
have_name(c(a = 1, b = 2))
have_name(c(a = 1, 2))
# Empty and missing names are treated as invalid:
invalid <- set_names(letters[1:5])
names(invalid)[1] <- ""
names(invalid)[3] <- NA
is_named(invalid)
have_name(invalid)
# A data frame normally has valid, unique names
is_named(mtcars)
have_name(mtcars)
# A matrix usually doesn't because the names are stored in a
# different attribute
mat <- matrix(1:4, 2)
colnames(mat) <- c("a", "b")
is_named(mat)
names(mat)
Is an object a namespace environment?
Description
Is an object a namespace environment?
Usage
is_namespace(x)
Arguments
x |
An object to test. |
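Examples
A small sketch.
is_namespace(asNamespace("rlang"))
is_namespace(globalenv())
is_namespace(env())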
Is object a node or pairlist?
Description
- is_pairlist() checks that x has type pairlist.
- is_node() checks that x has type pairlist or language. It tests whether x is a node that has a CAR and a CDR, including callable nodes (language objects).
- is_node_list() checks that x has type pairlist or NULL. NULL is the empty node list.
Usage
is_pairlist(x)
is_node(x)
is_node_list(x)
Arguments
x |
Object to test. |
Life cycle
These functions are experimental. We are still figuring out a good naming convention to refer to the different lisp-like lists in R.
See Also
is_call()
tests for language nodes.
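Examples
A small sketch of the three predicates.
fmls <- formals(mean.default)
is_pairlist(fmls)
is_node(fmls)
# Calls are nodes but not pairlists:
is_node(quote(foo(bar)))
is_pairlist(quote(foo(bar)))
# NULL is the empty node list:
is_node_list(NULL)
is_pairlist(NULL)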
Is an object referencing another?
Description
There are typically two situations where two symbols may refer to the same object.
R objects usually have copy-on-write semantics. This is an optimisation that ensures that objects are only copied if needed. When you copy a vector, no memory is actually copied until either the original object or the copy is modified.
Note that the copy-on-write optimisation is an implementation detail that is not guaranteed by the specification of the R language.
Assigning an uncopyable object (like an environment) creates a reference. These objects are never copied even if you modify one of the references.
Usage
is_reference(x, y)
Arguments
x , y |
R objects. |
Examples
# Reassigning an uncopyable object such as an environment creates a
# reference:
env <- env()
ref <- env
is_reference(ref, env)
# Due to copy-on-write optimisation, a copied vector can
# temporarily reference the original vector:
vec <- 1:10
copy <- vec
is_reference(copy, vec)
# Once you modify one of them, the copy is triggered in the
# background and the objects cease to reference each other:
vec[[1]] <- 100
is_reference(copy, vec)
Is object a symbol?
Description
Is object a symbol?
Usage
is_symbol(x, name = NULL)
Arguments
x |
An object to test. |
name |
An optional name or vector of names that the symbol should match. |
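Examples
A small sketch.
is_symbol(quote(foo))
is_symbol(quote(foo()))
is_symbol("foo")
# `name` restricts the match:
is_symbol(quote(foo), name = "foo")
is_symbol(quote(foo), name = c("bar", "baz"))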
Is object identical to TRUE or FALSE?
Description
These functions bypass R's automatic conversion rules and check
that x
is literally TRUE
or FALSE
.
Usage
is_true(x)
is_false(x)
Arguments
x |
object to test |
Examples
is_true(TRUE)
is_true(1)
is_false(FALSE)
is_false(0)
Is object a weak reference?
Description
Is object a weak reference?
Usage
is_weakref(x)
Arguments
x |
An object to test. |
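Examples
A small sketch using new_weakref() with an environment key.
wref <- new_weakref(env())
is_weakref(wref)
is_weakref(env())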
Create a call
Description
These functions are deprecated, please use
call2()
and
new_call()
instead.
Usage
lang(.fn, ..., .ns = NULL)
Arguments
.fn |
Function to call. Must be a callable object: a string, symbol, call, or a function. |
... |
<dynamic> Arguments for the function call. Empty arguments are preserved. |
.ns |
Namespace with which to prefix |
Last abort()
error
Description
- last_error() returns the last error entraced by abort() or global_entrace(). The error is printed with a backtrace in simplified form.
- last_trace() is a shortcut to return the backtrace stored in the last error. This backtrace is printed in full form.
Usage
last_error()
last_trace(drop = NULL)
Arguments
drop |
Whether to drop technical calls. These are hidden from
users by default, set |
See Also
- rlang_backtrace_on_error to control what is displayed when an error is thrown.
- global_entrace() to enable last_error() logging for all errors.
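Examples
A sketch of interactive use; not run.
# f <- function() abort("Something went wrong.")
# try(f())
# last_error()
# last_trace()
# last_trace(drop = FALSE)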
Display last messages and warnings
Description
last_warnings()
and last_messages()
return a list of all
warnings and messages that occurred during the last R command.
global_entrace()
must be active in order to log the messages and
warnings.
By default the warnings and messages are printed with a simplified
backtrace, like last_error()
. Use summary()
to print the
conditions with a full backtrace.
Usage
last_warnings(n = NULL)
last_messages(n = NULL)
Arguments
n |
How many warnings or messages to display. Defaults to all. |
Examples
Enable backtrace capture with global_entrace()
:
global_entrace()
Signal some warnings in nested functions. The warnings inform about which function emitted a warning but they don't provide information about the call stack:
f <- function() { warning("foo"); g() } g <- function() { warning("bar", immediate. = TRUE); h() } h <- function() warning("baz") f() #> Warning in g() : bar #> Warning messages: #> 1: In f() : foo #> 2: In h() : baz
Call last_warnings()
to see backtraces for each of these warnings:
last_warnings() #> [[1]] #> <warning/rlang_warning> #> Warning in `f()`: #> foo #> Backtrace: #> x #> 1. \-global f() #> #> [[2]] #> <warning/rlang_warning> #> Warning in `g()`: #> bar #> Backtrace: #> x #> 1. \-global f() #> 2. \-global g() #> #> [[3]] #> <warning/rlang_warning> #> Warning in `h()`: #> baz #> Backtrace: #> x #> 1. \-global f() #> 2. \-global g() #> 3. \-global h()
This works similarly with messages:
f <- function() { inform("Hey!"); g() } g <- function() { inform("Hi!"); h() } h <- function() inform("Hello!") f() #> Hey! #> Hi! #> Hello! rlang::last_messages() #> [[1]] #> <message/rlang_message> #> Message: #> Hey! #> --- #> Backtrace: #> x #> 1. \-global f() #> #> [[2]] #> <message/rlang_message> #> Message: #> Hi! #> --- #> Backtrace: #> x #> 1. \-global f() #> 2. \-global g() #> #> [[3]] #> <message/rlang_message> #> Message: #> Hello! #> --- #> Backtrace: #> x #> 1. \-global f() #> 2. \-global g() #> 3. \-global h()
See Also
Collect dynamic dots in a list
Description
list2(...)
is equivalent to list(...)
with a few additional
features, collectively called dynamic dots. While
list2()
hard-code these features, dots_list()
is a lower-level
version that offers more control.
Usage
list2(...)
dots_list(
...,
.named = FALSE,
.ignore_empty = c("trailing", "none", "all"),
.preserve_empty = FALSE,
.homonyms = c("keep", "first", "last", "error"),
.check_assign = FALSE
)
Arguments
... |
Arguments to collect in a list. These dots are dynamic. |
.named |
If |
.ignore_empty |
Whether to ignore empty arguments. Can be one
of |
.preserve_empty |
Whether to preserve the empty arguments that
were not ignored. If |
.homonyms |
How to treat arguments with the same name. The
default, |
.check_assign |
Whether to check for |
Details
For historical reasons, dots_list()
creates a named list by
default. By comparison list2()
implements the preferred behaviour
of only creating a names vector when a name is supplied.
Value
A list containing the ...
inputs.
Examples
# Let's create a function that takes a variable number of arguments:
numeric <- function(...) {
dots <- list2(...)
num <- as.numeric(dots)
set_names(num, names(dots))
}
numeric(1, 2, 3)
# The main difference with list(...) is that list2(...) enables
# the `!!!` syntax to splice lists:
x <- list(2, 3)
numeric(1, !!! x, 4)
# As well as unquoting of names:
nm <- "yup!"
numeric(!!nm := 1)
# One useful application of splicing is to work around exact and
# partial matching of arguments. Let's create a function taking
# named arguments and dots:
fn <- function(data, ...) {
list2(...)
}
# You normally cannot pass an argument named `data` through the dots
# as it will match `fn`'s `data` argument. The splicing syntax
# provides a workaround:
fn("wrong!", data = letters) # exact matching of `data`
fn("wrong!", dat = letters) # partial matching of `data`
fn(some_data, !!!list(data = letters)) # no matching
# Empty trailing arguments are allowed:
list2(1, )
# But non-trailing empty arguments cause an error:
try(list2(1, , ))
# Use the more configurable `dots_list()` function to preserve all
# empty arguments:
list3 <- function(...) dots_list(..., .preserve_empty = TRUE)
# Note how the last empty argument is still ignored because
# `.ignore_empty` defaults to "trailing":
list3(1, , )
# The list with preserved empty arguments is equivalent to:
list(1, missing_arg())
# Arguments with duplicated names are kept by default:
list2(a = 1, a = 2, b = 3, b = 4, 5, 6)
# Use the `.homonyms` argument to keep only the first of these:
dots_list(a = 1, a = 2, b = 3, b = 4, 5, 6, .homonyms = "first")
# Or the last:
dots_list(a = 1, a = 2, b = 3, b = 4, 5, 6, .homonyms = "last")
# Or raise an informative error:
try(dots_list(a = 1, a = 2, b = 3, b = 4, 5, 6, .homonyms = "error"))
# dots_list() can be configured to warn when a `<-` call is
# detected:
my_list <- function(...) dots_list(..., .check_assign = TRUE)
my_list(a <- 1)
# There is no warning if the assignment is wrapped in braces.
# This requires users to be explicit about their intent:
my_list({ a <- 1 })
Temporarily change bindings of an environment
Description
-
local_bindings()
temporarily changes bindings in .env
(which is by default the caller environment). The bindings are reset to their original values when the current frame (or an arbitrary one if you specify .frame
) goes out of scope. -
with_bindings()
evaluates expr
with temporary bindings. When with_bindings()
returns, bindings are reset to their original values. It is a simple wrapper around local_bindings()
.
Usage
local_bindings(..., .env = .frame, .frame = caller_env())
with_bindings(.expr, ..., .env = caller_env())
Arguments
... |
Pairs of names and values. These dots support splicing (with value semantics) and name unquoting. |
.env |
An environment. |
.frame |
The frame environment that determines the scope of the temporary bindings. When that frame is popped from the call stack, bindings are switched back to their original values. |
.expr |
An expression to evaluate with temporary bindings. |
Value
local_bindings()
returns the values of old bindings
invisibly; with_bindings()
returns the value of expr
.
Examples
foo <- "foo"
bar <- "bar"
# `foo` will be temporarily rebound while executing `expr`
with_bindings(paste(foo, bar), foo = "rebound")
paste(foo, bar)
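# `local_bindings()` is the variant to use inside functions. A small
# sketch (not from the upstream examples): the binding in `e` is only
# changed while `fn()` is running.
e <- env(x = 1)
fn <- function() {
  local_bindings(x = 100, .env = e)
  e$x
}
fn()
e$x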
Set local error call in an execution environment
Description
local_error_call()
is an alternative to explicitly passing a
call
argument to abort()
. It sets the call (or a value that
indicates where to find the call, see below) in a local binding
that is automatically picked up by abort()
.
Usage
local_error_call(call, frame = caller_env())
Arguments
call |
This can be:
|
frame |
The execution environment in which to set the local error call. |
Motivation for setting local error calls
By default abort()
uses the function call of its caller as
context in error messages:
foo <- function() abort("Uh oh.") foo() #> Error in `foo()`: Uh oh.
This is not always appropriate. For example, a function that checks an input on behalf of another function should reference the latter, not the former:
arg_check <- function(arg, error_arg = as_string(substitute(arg))) { abort(cli::format_error("{.arg {error_arg}} is failing.")) } foo <- function(x) arg_check(x) foo() #> Error in `arg_check()`: `x` is failing.
The mismatch is clear in the example above. arg_check()
does not
have any x
argument and so it is confusing to present
arg_check()
as being the relevant context for the failure of the
x
argument.
One way around this is to take a call
or error_call
argument
and pass it to abort()
. Here we name this argument error_call
for consistency with error_arg
which is prefixed because there is
an existing arg
argument. In other situations, taking arg
and
call
arguments might be appropriate.
arg_check <- function(arg, error_arg = as_string(substitute(arg)), error_call = caller_env()) { abort( cli::format_error("{.arg {error_arg}} is failing."), call = error_call ) } foo <- function(x) arg_check(x) foo() #> Error in `foo()`: `x` is failing.
This is the generally recommended pattern for argument checking
functions. If you mention an argument in an error message, provide
your callers a way to supply a different argument name and a
different error call. abort()
stores the error call in the call
condition field which is then used to generate the "in" part of
error messages.
In more complex cases it's often burdensome to pass the relevant
call around, for instance if your checking and throwing code is
structured into many different functions. In this case, use
local_error_call()
to set the call locally or instruct abort()
to climb the call stack one level to find the relevant call. In the
following example, the complexity is not so important that sparing
the argument passing makes a big difference. However this
illustrates the pattern:
arg_check <- function(arg, error_arg = caller_arg(arg), error_call = caller_env()) { # Set the local error call local_error_call(error_call) my_classed_stop( cli::format_error("{.arg {error_arg}} is failing.") ) } my_classed_stop <- function(message) { # Forward the local error call to the caller's local_error_call(caller_env()) abort(message, class = "my_class") } foo <- function(x) arg_check(x) foo() #> Error in `foo()`: `x` is failing.
Error call flags in performance-critical functions
The call
argument can also be the string "caller"
. This is
equivalent to caller_env()
or parent.frame()
but has a lower
overhead because call stack introspection is only performed when an
error is triggered. Note that eagerly calling caller_env()
is
fast enough in almost all cases.
If your function needs to be really fast, assign the error call
flag directly instead of calling local_error_call()
:
.__error_call__. <- "caller"
Examples
# Set a context for error messages
function() {
local_error_call(quote(foo()))
local_error_call(sys.call())
}
# Disable the context
function() {
local_error_call(NULL)
}
# Use the caller's context
function() {
local_error_call(caller_env())
}
Change global options
Description
-
local_options()
changes options for the duration of a stack frame (by default the current one). Options are set back to their old values when the frame returns. -
with_options()
changes options while an expression is evaluated. Options are restored when the expression returns. -
push_options()
adds or changes options permanently. -
peek_option()
and peek_options()
return option values. The former returns the option directly while the latter returns a list.
Usage
local_options(..., .frame = caller_env())
with_options(.expr, ...)
push_options(...)
peek_options(...)
peek_option(name)
Arguments
... |
For |
.frame |
The environment of a stack frame which defines the scope of the temporary options. When the frame returns, the options are set back to their original values. |
.expr |
An expression to evaluate with temporary options. |
name |
An option name as string. |
Value
For local_options()
and push_options()
, the old option
values. peek_option()
returns the current value of an option
while the plural peek_options()
returns a list of current
option values.
Life cycle
These functions are experimental.
Examples
# Store and retrieve a global option:
push_options(my_option = 10)
peek_option("my_option")
# Change the option temporarily:
with_options(my_option = 100, peek_option("my_option"))
peek_option("my_option")
# The scoped variant is useful within functions:
fn <- function() {
local_options(my_option = 100)
peek_option("my_option")
}
fn()
peek_option("my_option")
# The plural peek returns a named list:
peek_options("my_option")
peek_options("my_option", "digits")
Use cli to format error messages
Description
local_use_cli()
marks a package namespace or the environment of a
running function with a special flag that instructs abort()
to
use cli to format error messages. This formatting happens lazily,
at print-time, in various places:
When an unexpected error is displayed to the user.
When a captured error is printed in the console, for instance via
last_error()
.When
conditionMessage()
is called.
cli formats messages and bullets with indentation and width-wrapping to produce a polished display of messages.
Usage
local_use_cli(..., format = TRUE, inline = FALSE, frame = caller_env())
Arguments
Usage
To use cli formatting automatically in your package:
Make sure
run_on_load()
is called from your.onLoad()
hook.Call
on_load(local_use_cli())
at the top level of your namespace.
It is also possible to call local_use_cli()
inside a running
function, in which case the flag only applies within that function.
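Examples
quote({ # Not run: a sketch of the package setup described above
  # In any file of your package:
  on_load(local_use_cli())
  # And in your `.onLoad()` hook:
  .onLoad <- function(lib, pkg) {
    run_on_load()
  }
})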
Missing values
Description
Missing values are represented in R with the general symbol
NA
. They can be inserted in almost all data containers: all
atomic vectors except raw vectors can contain missing values. To
achieve this, R automatically converts the general NA
symbol to a
typed missing value appropriate for the target vector. The objects
provided here are aliases for those typed NA
objects.
Usage
na_lgl
na_int
na_dbl
na_chr
na_cpl
Format
An object of class logical
of length 1.
An object of class integer
of length 1.
An object of class numeric
of length 1.
An object of class character
of length 1.
An object of class complex
of length 1.
Details
Typed missing values are necessary because R needs sentinel values
of the same type (i.e. the same machine representation of the data)
as the containers into which they are inserted. The official typed
missing values are NA_integer_
, NA_real_
, NA_character_
and
NA_complex_
. The missing value for logical vectors is simply the
default NA
. The aliases provided in rlang are consistently named
and thus simpler to remember. Also, na_lgl
is provided as an
alias to NA
that makes intent clearer.
Since na_lgl
is the default NA
, expressions such as c(NA, NA)
yield logical vectors as no data is available to give a clue of the
target type. In the same way, since lists and environments can
contain any types, expressions like list(NA)
store a logical
NA
.
Life cycle
These shortcuts might be moved to the vctrs package at some point. This is why they are marked as questioning.
Examples
typeof(NA)
typeof(na_lgl)
typeof(na_int)
# Note that while the base R missing symbols cannot be overwritten,
# that's not the case for rlang's aliases:
na_dbl <- NA
typeof(na_dbl)
Generate or handle a missing argument
Description
These functions help you use the missing argument as a regular R object.
-
missing_arg()
generates a missing argument. -
is_missing()
is like base::missing()
but also supports testing for missing arguments contained in other objects like lists. It is also more consistent with default arguments which are never treated as missing (see section below). -
maybe_missing()
is useful to pass down an input that might be missing to another function, potentially substituting by a default value. It avoids triggering an "argument is missing" error.
Usage
missing_arg()
is_missing(x)
maybe_missing(x, default = missing_arg())
Arguments
x |
An object that might be the missing argument. |
default |
The object to return if the input is missing,
defaults to |
Other ways to reify the missing argument
-
base::quote(expr = )
is the canonical way to create a missing argument object. -
expr()
called without argument creates a missing argument. -
quo()
called without argument creates an empty quosure, i.e. a quosure containing the missing argument object.
is_missing()
and default arguments
The base function missing()
makes a distinction between default
values supplied explicitly and default values generated through a
missing argument:
fn <- function(x = 1) base::missing(x) fn() #> [1] TRUE fn(1) #> [1] FALSE
This only happens within a function. If the default value has been generated in a calling function, it is never treated as missing:
caller <- function(x = 1) fn(x) caller() #> [1] FALSE
rlang::is_missing()
simplifies these rules by never treating
default arguments as missing, even in internal contexts:
fn <- function(x = 1) rlang::is_missing(x) fn() #> [1] FALSE fn(1) #> [1] FALSE
This is a little less flexible because you can't specialise
behaviour based on implicitly supplied default values. However,
this makes the behaviour of is_missing()
and functions using it
simpler to understand.
Fragility of the missing argument object
The missing argument is an object that triggers an error if and
only if it is the result of evaluating a symbol. No error is
produced when a function call evaluates to the missing argument
object. For instance, it is possible to bind the missing argument
to a variable with an expression like x[[1]] <- missing_arg()
.
Likewise, x[[1]]
is safe to use as an argument, e.g. list(x[[1]])
even when the result is the missing object.
However, as soon as the missing argument is passed down between functions through a bare variable, it is likely to cause a missing argument error:
x <- missing_arg() list(x) #> Error: #> ! argument "x" is missing, with no default
To work around this, is_missing()
and maybe_missing(x)
use a
bit of magic to determine if the input is the missing argument
without triggering a missing error.
x <- missing_arg() list(maybe_missing(x)) #> [[1]] #>
maybe_missing()
is particularly useful for prototyping
meta-programming algorithms in R. The missing argument is a likely
input when computing on the language because it is a standard
object in formals lists. While C functions are always allowed to
return the missing argument and pass it to other C functions, this
is not the case on the R side. If you're implementing your
meta-programming algorithm in R, use maybe_missing()
when an
input might be the missing argument object.
Examples
# The missing argument usually arises inside a function when the
# user omits an argument that does not have a default:
fn <- function(x) is_missing(x)
fn()
# Creating a missing argument can also be useful to generate calls
args <- list(1, missing_arg(), 3, missing_arg())
quo(fn(!!! args))
# Other ways to create that object include:
quote(expr = )
expr()
# It is perfectly valid to generate and assign the missing
# argument in a list.
x <- missing_arg()
l <- list(missing_arg())
# Just don't evaluate a symbol that contains the empty argument.
# Evaluating the object `x` that we created above would trigger an
# error.
# x # Not run
# On the other hand accessing a missing argument contained in a
# list does not trigger an error because subsetting is a function
# call:
l[[1]]
is.null(l[[1]])
# In case you really need to access a symbol that might contain the
# empty argument object, use maybe_missing():
maybe_missing(x)
is.null(maybe_missing(x))
is_missing(maybe_missing(x))
# Note that base::missing() only works on symbols and does not
# support complex expressions. For this reason the following lines
# would throw an error:
#> missing(missing_arg())
#> missing(l[[1]])
# while is_missing() will work as expected:
is_missing(missing_arg())
is_missing(l[[1]])
Inform about name repair
Description
Inform about name repair
Usage
names_inform_repair(old, new)
Arguments
old |
Original names vector. |
new |
Repaired names vector. |
Muffling and silencing messages
Name repair messages are signaled with inform()
and are given the class
"rlib_message_name_repair"
. These messages can be muffled with
base::suppressMessages()
.
Name repair messages can also be silenced with the global option
rlib_name_repair_verbosity
. This option takes the values:
-
"verbose"
: Always verbose. -
"quiet"
: Always quiet.
When set to quiet, the message is not displayed and the condition is not
signaled. This is particularly useful for silencing messages during testing
when combined with local_options()
.
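Examples
# A small sketch, assuming the vctrs package is installed (its name
# repair messages are signalled through this mechanism):
vctrs::vec_as_names(c("x", "x"), repair = "unique")
# Muffle a single call:
suppressMessages(vctrs::vec_as_names(c("x", "x"), repair = "unique"))
# Or silence the messages via the global option, e.g. in tests:
with_options(
  rlib_name_repair_verbosity = "quiet",
  vctrs::vec_as_names(c("x", "x"), repair = "unique")
)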
Get names of a vector
Description
names2()
always returns a character vector, even when an
object does not have a names
attribute. In this case, it returns
a vector of empty names ""
. It also standardises missing names to
""
.
The replacement variant names2<-
never adds NA
names and
instead fills unnamed vectors with ""
.
Usage
names2(x)
names2(x) <- value
Arguments
x |
A vector. |
value |
New names. |
Examples
names2(letters)
# It also takes care of standardising missing names:
x <- set_names(1:3, c("a", NA, "b"))
names2(x)
# Replacing names with the base `names<-` function may introduce
# `NA` values when the vector is unnamed:
x <- 1:3
names(x)[1:2] <- "foo"
names(x)
# Use the `names2<-` variant to avoid this
x <- 1:3
names2(x)[1:2] <- "foo"
names(x)
Create a new call from components
Description
Create a new call from components
Usage
new_call(car, cdr = NULL)
Arguments
car |
The head of the call. It should be a callable object: a symbol, call, or literal function. |
cdr |
The tail of the call, i.e. a pairlist of arguments. |
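Examples
# A minimal sketch: assemble the call `foo(x, y)` from a head symbol
# and a pairlist of arguments:
new_call(quote(foo), pairlist(quote(x), quote(y)))
# The tail may also be built with `pairlist2()`:
new_call(quote(mean), pairlist2(quote(x), trim = 0.1))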
Create a formula
Description
Create a formula
Usage
new_formula(lhs, rhs, env = caller_env())
Arguments
lhs , rhs |
A call, name, or atomic vector. |
env |
An environment. |
Value
A formula object.
See Also
Examples
new_formula(quote(a), quote(b))
new_formula(NULL, quote(b))
Create a function
Description
This constructs a new function given its three components: list of arguments, body code and parent environment.
Usage
new_function(args, body, env = caller_env())
Arguments
args |
A named list or pairlist of default arguments. Note
that if you want arguments that don't have defaults, you'll need
to use the special function |
body |
A language object representing the code inside the
function. Usually this will be most easily generated with
|
env |
The parent environment of the function, defaults to the
calling environment of |
Examples
f <- function() letters
g <- new_function(NULL, quote(letters))
identical(f, g)
# Pass a list or pairlist of named arguments to create a function
# with parameters. The name becomes the parameter name and the
# argument the default value for this parameter:
new_function(list(x = 10), quote(x))
new_function(pairlist2(x = 10), quote(x))
# Use `exprs()` to create quoted defaults. Compare:
new_function(pairlist2(x = 5 + 5), quote(x))
new_function(exprs(x = 5 + 5), quote(x))
# Pass empty arguments to omit defaults. `list()` doesn't allow
# empty arguments but `pairlist2()` does:
new_function(pairlist2(x = , y = 5 + 5), quote(x + y))
new_function(exprs(x = , y = 5 + 5), quote(x + y))
Helpers for pairlist and language nodes
Description
Important: These functions are for expert R programmers only. You should only use them if you feel comfortable manipulating low level R data structures at the C level. We export them at the R level in order to make it easy to prototype C code. They don't perform any type checking and can crash R very easily (try to take the CAR of an integer vector — save any important objects beforehand!).
Usage
new_node(car, cdr = NULL)
node_car(x)
node_cdr(x)
node_caar(x)
node_cadr(x)
node_cdar(x)
node_cddr(x)
node_poke_car(x, newcar)
node_poke_cdr(x, newcdr)
node_poke_caar(x, newcar)
node_poke_cadr(x, newcar)
node_poke_cdar(x, newcdr)
node_poke_cddr(x, newcdr)
node_tag(x)
node_poke_tag(x, newtag)
Arguments
car , newcar , cdr , newcdr |
The new CAR or CDR for the node. These can be any R objects. |
x |
A language or pairlist node. Note that these functions are barebones and do not perform any type checking. |
newtag |
The new tag for the node. This should be a symbol. |
Value
Setters like node_poke_car()
invisibly return x
modified
in place. Getters return the requested node component.
See Also
duplicate()
for creating copy-safe objects and
base::pairlist()
for an easier way of creating a linked list of
nodes.
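Examples
# A small sketch on a pairlist we own (so poking it in place is safe):
x <- pairlist2(a = 1, b = 2)
node_car(x)           # value stored in the first node
node_tag(x)           # name (tag) of the first node
node_cadr(x)          # value stored in the second node
node_poke_car(x, 10)  # modify the first node in place
x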
Create a quosure from components
Description
-
new_quosure()
wraps any R object (including expressions, formulas, or other quosures) into a quosure. -
as_quosure()
is similar but it does not rewrap formulas and quosures.
Usage
new_quosure(expr, env = caller_env())
as_quosure(x, env = NULL)
is_quosure(x)
Arguments
expr |
An expression to wrap in a quosure. |
env |
The environment in which the expression should be evaluated. Only used for symbols and calls. This should normally be the environment in which the expression was created. |
x |
An object to test. |
See Also
-
enquo()
andquo()
for creating a quosure by argument defusal.
Examples
# `new_quosure()` creates a quosure from its components. These are
# equivalent:
new_quosure(quote(foo), current_env())
quo(foo)
# `new_quosure()` always rewraps its input into a new quosure, even
# if the input is itself a quosure:
new_quosure(quo(foo))
# This is unlike `as_quosure()` which preserves its input if it's
# already a quosure:
as_quosure(quo(foo))
# `as_quosure()` uses the supplied environment with naked expressions:
env <- env(var = "thing")
as_quosure(quote(var), env)
# If the expression already carries an environment, this
# environment is preserved. This is the case for formulas and
# quosures:
as_quosure(~foo, env)
as_quosure(~foo)
# An environment must be supplied when the input is a naked
# expression:
try(
as_quosure(quote(var))
)
Create a list of quosures
Description
This small S3 class provides methods for [
and c()
and ensures
the following invariants:
The list only contains quosures.
It is always named, possibly with a vector of empty strings.
new_quosures()
takes a list of quosures and adds the quosures
class and a vector of empty names if needed. as_quosures()
calls
as_quosure()
on all elements before creating the quosures
object.
Usage
new_quosures(x)
as_quosures(x, env, named = FALSE)
is_quosures(x)
Arguments
x |
A list of quosures or objects to coerce to quosures. |
env |
The default environment for the new quosures. |
named |
Whether to name the list with |
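Examples
# A small sketch: bundle quosures into a quosures list. Unnamed
# elements get empty names:
new_quosures(list(a = quo(x), quo(y)))
# Coerce bare expressions, supplying a default environment:
as_quosures(list(quote(x), quote(y)), env = current_env())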
Create a weak reference
Description
A weak reference is a special R object which makes it possible to keep a reference to an object without preventing garbage collection of that object. It can also be used to keep data about an object without preventing GC of the object, similar to WeakMaps in JavaScript.
Objects in R are considered reachable if they can be accessed by following
a chain of references, starting from a root node; root nodes are
specially-designated R objects, and include the global environment and base
environment. As long as the key is reachable, the value will not be garbage
collected. This is true even if the weak reference object becomes
unreachable. The key effectively prevents the weak reference and its value
from being collected, according to the following chain of ownership:
weakref <- key -> value
.
When the key becomes unreachable, the key and value in the weak reference
object are replaced by NULL
, and the finalizer is scheduled to execute.
Usage
new_weakref(key, value = NULL, finalizer = NULL, on_quit = FALSE)
Arguments
key |
The key for the weak reference. Must be a reference object – that is, an environment or external pointer. |
value |
The value for the weak reference. This can be |
finalizer |
A function that is run after the key becomes unreachable. |
on_quit |
Should the finalizer be run when R exits? |
See Also
is_weakref()
, wref_key()
and wref_value()
.
Examples
e <- env()
# Create a weak reference to e
w <- new_weakref(e, finalizer = function(e) message("finalized"))
# Get the key object from the weak reference
identical(wref_key(w), e)
# When the regular reference (the `e` binding) is removed and a GC occurs,
# the weak reference will not keep the object alive.
rm(e)
gc()
identical(wref_key(w), NULL)
# A weak reference with a key and value. The value contains data about the
# key.
k <- env()
v <- list(1, 2, 3)
w <- new_weakref(k, v)
identical(wref_key(w), k)
identical(wref_value(w), v)
# When v is removed, the weak ref keeps it alive because k is still reachable.
rm(v)
gc()
identical(wref_value(w), list(1, 2, 3))
# When k is removed, the weak ref does not keep k or v alive.
rm(k)
gc()
identical(wref_key(w), NULL)
identical(wref_value(w), NULL)
Create vectors matching a given length
Description
These functions construct vectors of a given length, with attributes
specified via dots. Except for new_list()
and new_raw()
, the
empty vectors are filled with typed missing values. This is in
contrast to the base function base::vector()
which creates
zero-filled vectors.
Usage
new_logical(n, names = NULL)
new_integer(n, names = NULL)
new_double(n, names = NULL)
new_character(n, names = NULL)
new_complex(n, names = NULL)
new_raw(n, names = NULL)
new_list(n, names = NULL)
Arguments
n |
The vector length. |
names |
Names for the new vector. |
Lifecycle
These functions are likely to be replaced by a vctrs equivalent in the future. They are in the questioning lifecycle stage.
See Also
rep_along
Examples
new_list(10)
new_logical(10)
Get the namespace of a package
Description
Namespaces are the environment where all the functions of a package
live. The parent environments of namespaces are the imports
environments, which contain all the functions imported from other
packages.
Usage
ns_env(x = caller_env())
ns_imports_env(x = caller_env())
ns_env_name(x = caller_env())
Arguments
x |
|
See Also
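Examples
# A small sketch (not from the upstream examples): namespaces can be
# looked up by package name:
ns_env("stats")
ns_env_name(ns_env("stats"))
# The imports environment is the parent of the namespace:
ns_imports_env("stats")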
Return the namespace registry env
Description
Note that the namespace registry does not behave like a normal
environment because the parent is NULL
instead of the empty
environment. This is exported for expert usage in development tools
only.
Usage
ns_registry_env()
Address of an R object
Description
Address of an R object
Usage
obj_address(x)
Arguments
x |
Any R object. |
Value
Its address in memory, as a string.
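Examples
# Two bindings to the same object share an address:
x <- 1:10
y <- x
obj_address(x)
identical(obj_address(x), obj_address(y))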
Run expressions on load
Description
-
on_load()
registers expressions to be run on the user's machine each time the package is loaded in memory. This is in contrast to normal R package code, which is run once at build time on the packager's machine (e.g. CRAN). on_load()
expressions require run_on_load()
to be called inside .onLoad()
. -
on_package_load()
registers expressions to be run each time another package is loaded.
on_load()
is for your own package and runs expressions when the
namespace is not sealed yet. This means you can modify existing
bindings or create new ones. This is not the case with
on_package_load()
which runs expressions after a foreign package
has finished loading, at which point its namespace is sealed.
Usage
on_load(expr, env = parent.frame(), ns = topenv(env))
run_on_load(ns = topenv(parent.frame()))
on_package_load(pkg, expr, env = parent.frame())
Arguments
expr |
An expression to run on load. |
env |
The environment in which to evaluate |
ns |
The namespace in which to hook |
pkg |
Package to hook expression into. |
When should I run expressions on load?
There are two main use cases for running expressions on load:
When a side effect, such as registering a method with
s3_register()
, must occur in the user session rather than the package builder session.To avoid hard-coding objects from other packages in your namespace. If you assign
foo::bar
or the result of foo::baz()
in your package, they become constants. Any upstream changes in the foo
package will not be reflected in the objects you've assigned in your namespace. This often breaks assumptions made by the authors of foo
and causes all sorts of issues.Recreating the foreign objects each time your package is loaded makes sure that any such changes will be taken into account. In technical terms, running an expression on load introduces indirection.
Comparison with .onLoad()
on_load()
has the advantage that hooked expressions can appear in
any file, in context. This is unlike .onLoad()
which gathers
disparate expressions in a single block.
on_load()
is implemented via .onLoad()
and requires
run_on_load()
to be called from that hook.
The expressions inside on_load()
do not undergo static analysis
by R CMD check
. Therefore, it is advisable to only use
simple function calls inside on_load()
.
Examples
quote({ # Not run
# First add `run_on_load()` to your `.onLoad()` hook,
# then use `on_load()` anywhere in your package
.onLoad <- function(lib, pkg) {
run_on_load()
}
# Register a method on load
on_load({
s3_register("foo::bar", "my_class")
})
# Assign an object on load
var <- NULL
on_load({
var <- foo()
})
# To use `on_package_load()` at top level, wrap it in `on_load()`
on_load({
on_package_load("foo", message("foo is loaded"))
})
# In functions it can be called directly
f <- function() on_package_load("foo", message("foo is loaded"))
})
Infix attribute accessor and setter
Description
This operator extracts or sets attributes for regular objects and S4 fields for S4 objects.
Usage
x %@% name
x %@% name <- value
Arguments
x |
Object |
name |
Attribute name |
value |
New value for attribute |
Examples
# Unlike `@`, this operator extracts attributes for any kind of
# objects:
factor(1:3) %@% "levels"
mtcars %@% class
mtcars %@% class <- NULL
mtcars
# It also works on S4 objects:
.Person <- setClass("Person", slots = c(name = "character", species = "character"))
fievel <- .Person(name = "Fievel", species = "mouse")
fievel %@% name
Replace missing values
Description
Note: This operator is now out of scope for rlang. It will be
replaced by a vctrs-powered operator (probably in the funs package) at which point the
rlang version of %|%
will be deprecated.
This infix function is similar to %||%
but is vectorised
and provides a default value for missing elements. It is faster
than using base::ifelse()
and does not perform type conversions.
Usage
x %|% y
Arguments
x |
The original values. |
y |
The replacement values. Must be of length 1 or the same length as |
See Also
Examples
c("a", "b", NA, "c") %|% "default"
c(1L, NA, 3L, NA, NA) %|% (6L:10L)
Default value for NULL
Description
This infix function makes it easy to replace NULL
s with a default
value. It's inspired by the way that Ruby's or operation (||
)
works.
Usage
x %||% y
Arguments
x , y |
If |
Examples
1 %||% 2
NULL %||% 2
Collect dynamic dots in a pairlist
Description
This pairlist constructor uses dynamic dots. Use it to manually create argument lists for calls or parameter lists for functions.
Usage
pairlist2(...)
Arguments
... |
<dynamic> Arguments stored in the pairlist. Empty arguments are preserved. |
Examples
# Unlike `exprs()`, `pairlist2()` evaluates its arguments.
new_function(pairlist2(x = 1, y = 3 * 6), quote(x * y))
new_function(exprs(x = 1, y = 3 * 6), quote(x * y))
# It preserves missing arguments, which is useful for creating
# parameters without defaults:
new_function(pairlist2(x = , y = 3 * 6), quote(x * y))
Parse R code
Description
These functions parse and transform text into R expressions. This is the first step to interpret or evaluate a piece of R code written by a programmer.
-
parse_expr()
returns one expression. If the text contains more than one expression (separated by semicolons or new lines), an error is issued. On the other handparse_exprs()
can handle multiple expressions. It always returns a list of expressions (compare tobase::parse()
which returns a base::expression vector). All functions also support R connections. -
parse_expr()
concatenatesx
with\\n
separators prior to parsing in order to support the roundtripparse_expr(expr_deparse(x))
(deparsed expressions might be multiline). On the other hand,parse_exprs()
doesn't do any concatenation because it's designed to support named inputs. The names are matched to the expressions in the output, which is useful when a single named string creates multiple expressions.In other words,
parse_expr()
supports vector of lines whereasparse_exprs()
expects vectors of complete deparsed expressions. -
parse_quo()
andparse_quos()
are variants that create a quosure. Supplyenv = current_env()
if you're parsing code to be evaluated in your current context. Supplyenv = global_env()
when you're parsing external user input to be evaluated in user context.Unlike quosures created with
enquo()
,enquos()
, or{{
, a parsed quosure never contains injected quosures. It is thus safe to evaluate them witheval()
instead ofeval_tidy()
, though the latter is more convenient as you don't need to extractexpr
andenv
.
Usage
parse_expr(x)
parse_exprs(x)
parse_quo(x, env)
parse_quos(x, env)
Arguments
x |
Text containing expressions to parse. Can also be an R connection, for instance to a file.
|
env |
The environment for the quosures. The global environment (the default) may be the right choice when you are parsing external user inputs. You might also want to evaluate the R code in an isolated context (perhaps a child of the global environment or of the base environment). |
Details
Unlike base::parse()
, these functions never retain source reference
information, as doing so is slow and rarely necessary.
Value
parse_expr()
returns an expression,
parse_exprs()
returns a list of expressions. Note that for the
plural variants the length of the output may be greater than the
length of the input. This happens when one of the strings
contains several expressions (such as "foo; bar"
). The names of
x
are preserved (and recycled in case of multiple expressions).
The _quo
suffixed variants return quosures.
See Also
Examples
# parse_expr() can parse any R expression:
parse_expr("mtcars %>% dplyr::mutate(cyl_prime = cyl / sd(cyl))")
# A string can contain several expressions separated by ; or \n
parse_exprs("NULL; list()\n foo(bar)")
# Use names to figure out which input produced an expression:
parse_exprs(c(foo = "1; 2", bar = "3"))
# You can also parse source files by passing an R connection. Let's
# create a file containing R code:
path <- tempfile("my-file.R")
cat("1; 2; mtcars", file = path)
# We can now parse it by supplying a connection:
parse_exprs(file(path))
Name of a primitive function
Description
Name of a primitive function
Usage
prim_name(prim)
Arguments
prim |
A primitive function such as |
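Examples
# A couple of primitives and their names:
prim_name(c)
prim_name(base::sum)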
Show injected expression
Description
qq_show()
helps examining injected expressions
inside a function. This is useful for learning about injection and
for debugging injection code.
Arguments
expr |
An expression involving injection operators. |
Examples
qq_show()
shows the intermediary expression before it is
evaluated by R:
list2(!!!1:3) #> [[1]] #> [1] 1 #> #> [[2]] #> [1] 2 #> #> [[3]] #> [1] 3 qq_show(list2(!!!1:3)) #> list2(1L, 2L, 3L)
It is especially useful inside functions to reveal what an injected expression looks like:
my_mean <- function(data, var) { qq_show(data %>% dplyr::summarise(mean({{ var }}))) } mtcars %>% my_mean(cyl) #> data %>% dplyr::summarise(mean(^cyl))
See Also
Squash a quosure
Description
This function is deprecated; please use
quo_squash()
instead.
Usage
quo_expr(quo, warn = FALSE)
Arguments
quo |
A quosure or expression. |
warn |
Whether to warn if the quosure contains other quosures
(those will be collapsed). This is useful when you use
|
Format quosures for printing or labelling
Description
Note: You should now use as_label()
or as_name()
instead
of quo_name()
. See life cycle section below.
These functions take an arbitrary R object, typically an expression, and represent it as a string.
-
quo_name()
returns an abbreviated representation of the object as a single line string. It is suitable for default names. -
quo_text()
returns a multiline string. For instance block expressions like{ foo; bar }
are represented on 4 lines (one for each symbol, and the curly braces on their own lines).
These deparsers are only suitable for creating default names or
printing output at the console. The behaviour of your functions
should not depend on deparsed objects. If you are looking for a way
of transforming symbols to strings, use as_string()
instead of
quo_name()
. Unlike deparsing, the transformation between symbols
and strings is non-lossy and well defined.
Usage
quo_label(quo)
quo_text(quo, width = 60L, nlines = Inf)
quo_name(quo)
Arguments
quo |
A quosure or expression. |
width |
Width of each line. |
nlines |
Maximum number of lines to extract. |
Life cycle
These functions are superseded.
-
as_label()
andas_name()
should be used instead ofquo_name()
.as_label()
transforms any R object to a string but should only be used to create a default name. Labelisation is not a well defined operation and no assumption should be made about the label. On the other hand,as_name()
only works with (possibly quosured) symbols, but is a well defined and deterministic operation. We don't have a good replacement for
quo_text()
yet. See https://github.com/r-lib/rlang/issues/636 to follow discussions about a new deparsing API.
See Also
Examples
# Quosures can contain nested quosures:
quo <- quo(foo(!! quo(bar)))
quo
# quo_squash() unwraps all quosures and returns a raw expression:
quo_squash(quo)
# This is used by quo_text() and quo_label():
quo_text(quo)
# Compare to the unwrapped expression:
expr_text(quo)
# quo_name() is helpful when you need really short labels:
quo_name(quo(sym))
quo_name(quo(!! sym))
Squash a quosure
Description
quo_squash()
flattens all nested quosures within an expression.
For example it transforms ^foo(^bar(), ^baz)
to the bare
expression foo(bar(), baz)
.
This operation is safe if the squashed quosure is used for
labelling or printing (see as_label()
, but note that as_label()
squashes quosures automatically). However if the squashed quosure
is evaluated, all expressions of the flattened quosures are
resolved in a single environment. This is a source of bugs so it is
good practice to set warn
to TRUE
to let the user know about
the lossy squashing.
Usage
quo_squash(quo, warn = FALSE)
Arguments
quo |
A quosure or expression. |
warn |
Whether to warn if the quosure contains other quosures
(those will be collapsed). This is useful when you use
|
Examples
# Quosures can contain nested quosures:
quo <- quo(wrapper(!!quo(wrappee)))
quo
# quo_squash() flattens all the quosures and returns a simple expression:
quo_squash(quo)
Quosure getters, setters and predicates
Description
These tools inspect and modify quosures, a type of defused expression that includes a reference to the context where it was created. A quosure is guaranteed to evaluate in its original environment and can refer to local objects safely.
You can access the quosure components with
quo_get_expr()
andquo_get_env()
.The
quo_
prefixed predicates test the expression of a quosure,quo_is_missing()
,quo_is_symbol()
, etc.
All quo_
prefixed functions expect a quosure and will fail if
supplied another type of object. Make sure the input is a quosure
with is_quosure()
.
Usage
quo_is_missing(quo)
quo_is_symbol(quo, name = NULL)
quo_is_call(quo, name = NULL, n = NULL, ns = NULL)
quo_is_symbolic(quo)
quo_is_null(quo)
quo_get_expr(quo)
quo_get_env(quo)
quo_set_expr(quo, expr)
quo_set_env(quo, env)
Arguments
quo |
A quosure to test. |
name |
The name of the symbol or function call. If |
n |
An optional number of arguments that the call should match. |
ns |
The namespace of the call. If Can be a character vector of namespaces, in which case the call
has to match at least one of them, otherwise |
expr |
A new expression for the quosure. |
env |
A new environment for the quosure. |
Empty quosures and missing arguments
When missing arguments are captured as quosures, either through
enquo()
or quos()
, they are returned as an empty quosure. These
quosures contain the missing argument and typically
have the empty environment as enclosure.
Use quo_is_missing()
to test for a missing argument defused with
enquo()
.
See Also
-
quo()
for creating quosures by argument defusal. -
new_quosure()
andas_quosure()
for assembling quosures from components. -
What are quosures and when are they needed? for an overview.
Examples
quo <- quo(my_quosure)
quo
# Access and set the components of a quosure:
quo_get_expr(quo)
quo_get_env(quo)
quo <- quo_set_expr(quo, quote(baz))
quo <- quo_set_env(quo, empty_env())
quo
# Test whether an object is a quosure:
is_quosure(quo)
# If it is a quosure, you can use the specialised type predicates
# to check what is inside it:
quo_is_symbol(quo)
quo_is_call(quo)
quo_is_null(quo)
# quo_is_missing() checks for a special kind of quosure, the one
# that contains the missing argument:
quo()
quo_is_missing(quo())
fn <- function(arg) enquo(arg)
fn()
quo_is_missing(fn())
Serialize a raw vector to a string
Description
This function converts a raw vector to a hexadecimal string,
optionally adding a prefix and a suffix.
It is roughly equivalent to
paste0(prefix, paste(format(x), collapse = ""), suffix)
and much faster.
Usage
raw_deparse_str(x, prefix = NULL, suffix = NULL)
Arguments
x |
A raw vector. |
prefix , suffix |
Prefix and suffix strings, or NULL. |
Value
A string.
Examples
raw_deparse_str(raw())
raw_deparse_str(charToRaw("string"))
raw_deparse_str(raw(10), prefix = "'0x", suffix = "'")
Create vectors matching the length of a given vector
Description
These functions take the idea of seq_along()
and apply it to
repeating values.
Usage
rep_along(along, x)
rep_named(names, x)
Arguments
along |
Vector whose length determines how many times x is repeated. |
x |
Values to repeat. |
names |
Names for the new vector. The length of |
See Also
new-vector
Examples
x <- 0:5
rep_along(x, 1:2)
rep_along(x, 1)
# Create fresh vectors by repeating missing values:
rep_along(x, na_int)
rep_along(x, na_chr)
# rep_named() repeats a value along a vector of names:
rep_named(c("foo", "bar"), list(letters))
Jump to or from a frame
Description
While base::return()
can only return from the current local
frame, return_from()
will return from any frame on the
current evaluation stack, between the global and the currently
active context.
Usage
return_from(frame, value = NULL)
Arguments
frame |
An execution environment of a currently running function. |
value |
The return value. |
Examples
fn <- function() {
g(current_env())
"ignored"
}
g <- function(env) {
h(env)
"ignored"
}
h <- function(env) {
return_from(env, "early return")
"ignored"
}
fn()
Display backtrace on error
Description
rlang errors carry a backtrace that can be inspected by calling
last_error()
. You can also control the default display of the
backtrace by setting the option rlang_backtrace_on_error
to one
of the following values:
-
"none"
show nothing. -
"reminder"
, the default in interactive sessions, displays a reminder that you can see the backtrace with last_error()
. -
"branch"
displays a simplified backtrace. -
"full"
, the default in non-interactive sessions, displays the full tree.
rlang errors are normally thrown with abort()
. If you promote
base errors to rlang errors with global_entrace()
,
rlang_backtrace_on_error
applies to all errors.
Promote base errors to rlang errors
You can use options(error = rlang::entrace)
to promote base errors to
rlang errors. This does two things:
It saves the base error as an rlang object so you can call
last_error()
to print the backtrace or inspect its data.It prints the backtrace for the current error according to the
rlang_backtrace_on_error
option.
Warnings and errors in RMarkdown
The display of errors depends on whether they're expected (i.e.
chunk option error = TRUE
) or unexpected:
Expected errors are controlled by the global option
"rlang_backtrace_on_error_report"
(note the_report
suffix). The default is"none"
so that your expected errors don't include a reminder to runrlang::last_error()
. Customise this option if you want to demonstrate what the error backtrace will look like.You can also use
last_error()
to display the trace like you would in your session, but it currently only works in the next chunk.Unexpected errors are controlled by the global option
"rlang_backtrace_on_error"
. The default is"branch"
so you'll see a simplified backtrace in the knitr output to help you figure out what went wrong.
When knitr is running (as determined by the knitr.in.progress
global option), the default top environment for backtraces is set
to the chunk environment knitr::knit_global()
. This ensures that
the part of the call stack belonging to knitr does not end up in
backtraces. If needed, you can override this by setting the
rlang_trace_top_env
global option.
Similarly to rlang_backtrace_on_error_report
, you can set
rlang_backtrace_on_warning_report
inside RMarkdown documents to
tweak the display of warnings. This is useful in conjunction with
global_entrace()
. Because of technical limitations, there is
currently no corresponding rlang_backtrace_on_warning
option for
normal R sessions.
To get full entracing in an Rmd document, include this in a setup chunk before the first error or warning is signalled.
```{r setup} rlang::global_entrace() options(rlang_backtrace_on_warning_report = "full") options(rlang_backtrace_on_error_report = "full") ```
See Also
rlang_backtrace_on_warning
Examples
# Display a simplified backtrace on error for both base and rlang
# errors:
# options(
# rlang_backtrace_on_error = "branch",
# error = rlang::entrace
# )
# stop("foo")
Errors of class rlang_error
Description
abort()
and error_cnd()
create errors of class "rlang_error"
.
The differences with base errors are:
Implementing
conditionMessage()
methods for subclasses of"rlang_error"
is undefined behaviour. Instead, implement thecnd_header()
method (and possiblycnd_body()
andcnd_footer()
). These methods return character vectors which are assembled by rlang when needed: whenconditionMessage.rlang_error()
is called (e.g. viatry()
), when the error is displayed throughprint()
orformat()
, and of course when the error is displayed to the user byabort()
.-
cnd_header()
,cnd_body()
, andcnd_footer()
methods can be overridden by storing closures in theheader
,body
, andfooter
fields of the condition. This is useful to lazily generate messages based on state captured in the closure environment. -
The
use_cli_format
condition field instructs whether to use cli (or rlang's fallback method if cli is not installed) to format the error message at print time.In this case, the
message
field may be a character vector of header and bullets. These are formatted at the last moment to take the context into account (starting position on the screen and indentation).See
local_use_cli()
for automatically setting this field in errors thrown withabort()
within your package.
Backtrace specification
Description
Structure
An r-lib backtrace is a data frame that contains the following columns:
-
call
: List of calls. These may carrysrcref
objects. -
visible
: Logical vector. IfFALSE
, the corresponding call will be hidden from simplified backtraces. -
parent
: Integer vector of parent references (seesys.parents()
) as row numbers. 0 is global. -
namespace
: Character vector of namespaces.NA
for global or no namespace -
scope
: Character vector of strings taking values"::"
,":::"
,"global"
, or"local"
.
A backtrace data frame may contain extra columns. If you add
additional columns, make sure to prefix their names with the name
of your package or organisation to avoid potential conflicts with
future extensions of this spec, e.g. "mypkg_column"
.
Operations
-
Length. The length of the backtrace is the number of rows of the underlying data.
-
Concatenation. Performed by row-binding two backtraces. The
parent
column of the RHS is shifted bynrow(LHS)
so that the last call of the LHS takes place of the global frame of the RHS. -
Subsetting. Performed by slicing the backtrace. After the data frame is sliced, the
parent
column is adjusted to the new row indices. Anyparent
value that no longer exists in the sliced backtrace is set to 0 (the global frame).
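A small sketch of inspecting such a backtrace, using trace_back() (extra columns may be present in addition to the ones listed above):
f <- function() g()
g <- function() trace_back()
trace <- f()
class(trace)
names(trace)
nrow(trace)   # the length of the backtrace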
Scalar type predicates
Description
These predicates check for a given type and whether the vector is "scalar", that is, of length 1.
In addition to the length check, is_string()
and is_bool()
return FALSE
if their input is missing. This is useful for
type-checking arguments, when your function expects a single string
or a single TRUE
or FALSE
.
Usage
is_scalar_list(x)
is_scalar_atomic(x)
is_scalar_vector(x)
is_scalar_integer(x)
is_scalar_double(x)
is_scalar_complex(x)
is_scalar_character(x)
is_scalar_logical(x)
is_scalar_raw(x)
is_string(x, string = NULL)
is_scalar_bytes(x)
is_bool(x)
Arguments
x |
object to be tested. |
string |
A string to compare to |
See Also
type-predicates, bare-type-predicates
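Examples
# A small sketch of the length-1 checks:
is_scalar_integer(1L)
is_scalar_integer(1:2)
# is_string() and is_bool() additionally return FALSE for missing values:
is_string("foo")
is_string(NA_character_)
is_bool(TRUE)
is_bool(NA)
# `string` restricts the match:
is_string("foo", string = "bar")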
Deprecated scoped
functions
Description
These functions are deprecated as of rlang 0.3.0. Please use
is_attached()
instead.
Usage
scoped_env(nm)
is_scoped(nm)
Arguments
nm |
The name of an environment attached to the search
path. Call |
Deprecated scoped_
functions
Description
Deprecated as of rlang 0.4.2. Use local_interactive()
,
local_options()
, or local_bindings()
instead.
Usage
scoped_interactive(value = TRUE, frame = caller_env())
scoped_options(..., .frame = caller_env())
scoped_bindings(..., .env = .frame, .frame = caller_env())
Arguments
value |
A single |
frame , .frame |
The environment of a running function which defines the scope of the temporary options. When the function returns, the options are reset to their original values. |
... |
For |
.env |
An environment. |
Search path environments
Description
The search path is a chain of environments containing exported functions of attached packages.
The API includes:
-
base::search()
to get the names of environments attached to the search path. -
search_envs()
returns the environments on the search path as a list. -
pkg_env_name()
takes a bare package name and prefixes it with"package:"
. Attached package environments have search names of the formpackage:name
. -
pkg_env()
takes a bare package name and returns the scoped environment of packages if they are attached to the search path, and throws an error otherwise. It is a shortcut forsearch_env(pkg_env_name("pkgname"))
. -
global_env()
andbase_env()
(simple aliases forglobalenv()
andbaseenv()
). These are respectively the first and last environments of the search path. -
is_attached()
returnsTRUE
when its argument (a search name or a package environment) is attached to the search path.
Usage
search_envs()
search_env(name)
pkg_env(pkg)
pkg_env_name(pkg)
is_attached(x)
base_env()
global_env()
Arguments
name |
The name of an environment attached to the search
path. Call |
pkg |
The name of a package. |
x |
An environment or a search name. |
The search path
This chain of environments determines what objects are visible from the global workspace. It contains the following elements:
The chain always starts with
global_env()
and finishes withbase_env()
which inherits from the terminal environmentempty_env()
.Each
base::library()
call attaches a new package environment to the search path. Attached packages are associated with a search name.In addition, any list, data frame, or environment can be attached to the search path with
base::attach()
.
Examples
# List the search names of environments attached to the search path:
search()
# Get the corresponding environments:
search_envs()
# The global environment and the base package are always first and
# last in the chain, respectively:
envs <- search_envs()
envs[[1]]
envs[[length(envs)]]
# These two environments have their own shortcuts:
global_env()
base_env()
# Packages appear in the search path with a special name. Use
# pkg_env_name() to create that name:
pkg_env_name("rlang")
search_env(pkg_env_name("rlang"))
# Alternatively, get the scoped environment of a package with
# pkg_env():
pkg_env("utils")
Increasing sequence of integers in an interval
Description
These helpers take two endpoints and return the sequence of all
integers within that interval. For seq2_along()
, the upper
endpoint is taken from the length of a vector. Unlike
base::seq()
, they return an empty vector if the starting point is
a larger integer than the end point.
Usage
seq2(from, to)
seq2_along(from, x)
Arguments
from |
The starting point of the sequence. |
to |
The end point. |
x |
A vector whose length is the end point. |
Value
An integer vector containing a strictly increasing sequence.
Examples
seq2(2, 10)
seq2(10, 2)
seq(10, 2)
seq2_along(10, letters)
Add attributes to an object
Description
Usage
set_attrs(.x, ...)
Arguments
.x , ... |
Set and get an expression
Description
These helpers are useful to make your function work generically
with quosures and raw expressions. First call get_expr()
to
extract an expression. Once you're done processing the expression,
call set_expr()
on the original object to update the expression.
You can return the result of set_expr()
, either a formula or an
expression depending on the input type. Note that set_expr()
does
not change its input, it creates a new object.
Usage
set_expr(x, value)
get_expr(x, default = x)
Arguments
x |
An expression, closure, or one-sided formula. In addition,
|
value |
An updated expression. |
default |
A default expression to return when |
Value
The updated original input for set_expr()
. A raw
expression for get_expr()
.
See Also
quo_get_expr()
and quo_set_expr()
for versions of
get_expr()
and set_expr()
that only work on quosures.
Examples
f <- ~foo(bar)
e <- quote(foo(bar))
frame <- identity(identity(ctxt_frame()))
get_expr(f)
get_expr(e)
get_expr(frame)
set_expr(f, quote(baz))
set_expr(e, quote(baz))
Set names of a vector
Description
This is equivalent to stats::setNames()
, with more features and
stricter argument checking.
Usage
set_names(x, nm = x, ...)
Arguments
x |
Vector to name. |
nm , ... |
Vector of names, the same length as You can specify names in the following ways:
|
Life cycle
set_names()
is stable and exported in purrr.
Examples
set_names(1:4, c("a", "b", "c", "d"))
set_names(1:4, letters[1:4])
set_names(1:4, "a", "b", "c", "d")
# If the second argument is omitted, a vector is named with itself
set_names(letters[1:5])
# Alternatively you can supply a function
set_names(1:10, ~ letters[seq_along(.)])
set_names(head(mtcars), toupper)
# If the input vector is unnamed, it is first named after itself
# before the function is applied:
set_names(letters, toupper)
# `...` is passed to the function:
set_names(head(mtcars), paste0, "_foo")
# If length 1, the second argument is recycled to the length of the first:
set_names(1:3, "foo")
set_names(list(), "")
Splice values at dots collection time
Description
The splicing operator !!!
operates both in values contexts like
list2()
and dots_list()
, and in metaprogramming contexts like
expr()
, enquos()
, or inject()
. While the end result looks the
same, the implementation is different and much more efficient in
the value cases. This difference in implementation may cause
performance issues for instance when going from:
xs <- list(2, 3) list2(1, !!!xs, 4)
to:
inject(list2(1, !!!xs, 4))
In the former case, the performant value-splicing is used. In the latter case, the slow metaprogramming splicing is used.
A common practical case where this may occur is when code is
wrapped inside a tidyeval context like dplyr::mutate()
. In this
case, the metaprogramming operator !!!
will take over the
value-splicing operator, causing an unexpected slowdown.
To avoid this in performance-critical code, use splice()
instead
of !!!
:
# These both use the fast splicing: list2(1, splice(xs), 4) inject(list2(1, splice(xs), 4))
Usage
splice(x)
is_spliced(x)
is_spliced_bare(x)
Arguments
x |
A list or vector to splice non-eagerly. |
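A minimal sketch of how these helpers fit together (xs is a made-up list; not part of the original examples):
xs <- list(2, 3)
boxed <- splice(xs)
is_spliced(boxed)    # TRUE: `boxed` is marked for value-splicing
list2(1, boxed, 4)   # spliced at dots collection time, like list2(1, !!!xs, 4)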
Splice operator !!!
Description
The splice operator !!!
implemented in dynamic dots
injects a list of arguments into a function call. It belongs to the
family of injection operators and provides the same
functionality as do.call()
.
The two main cases for splice injection are:
Turning a list of inputs into distinct arguments. This is especially useful with functions that take data in
...
, such asbase::rbind()
.dfs <- list(mtcars, mtcars) inject(rbind(!!!dfs))
Injecting defused expressions like symbolised column names.
For tidyverse APIs, this second case is no longer as useful since dplyr 1.0 and the
across()
operator.
Where does !!!
work?
!!!
does not work everywhere; you can only use it within certain special functions:
Functions taking dynamic dots like
list2()
. Functions taking defused and data-masked arguments, which are dynamic by default.
Inside
inject()
.
Most tidyverse functions support !!!
out of the box. With base
functions you need to use inject()
to enable !!!
.
Using the operator out of context may lead to incorrect results; see What happens if I use injection operators out of context?
Splicing a list of arguments
Take a function like base::rbind()
that takes data in ...
. This sort of function takes a variable number of arguments.
df1 <- data.frame(x = 1) df2 <- data.frame(x = 2) rbind(df1, df2) #> x #> 1 1 #> 2 2
Passing individual arguments is only possible for a fixed number of
arguments. When the arguments are in a list whose length is
variable (and potentially very large), we need a programmatic
approach like the splicing syntax !!!
:
dfs <- list(df1, df2) inject(rbind(!!!dfs)) #> x #> 1 1 #> 2 2
Because rbind()
is a base function we used inject()
to
explicitly enable !!!
. However, many functions implement dynamic dots with !!!
implicitly enabled out of the box.
tidyr::expand_grid(x = 1:2, y = c("a", "b")) #> # A tibble: 4 x 2 #> x y #> <int> <chr> #> 1 1 a #> 2 1 b #> 3 2 a #> 4 2 b xs <- list(x = 1:2, y = c("a", "b")) tidyr::expand_grid(!!!xs) #> # A tibble: 4 x 2 #> x y #> <int> <chr> #> 1 1 a #> 2 1 b #> 3 2 a #> 4 2 b
Note how the expanded grid has the right column names. That's because we spliced a named list. Splicing causes each name of the list to become an argument name.
tidyr::expand_grid(!!!set_names(xs, toupper)) #> # A tibble: 4 x 2 #> X Y #> <int> <chr> #> 1 1 a #> 2 1 b #> 3 2 a #> 4 2 b
Splicing a list of expressions
Another usage for !!!
is to inject defused expressions into data-masked
dots. However this usage is no longer a common pattern for
programming with tidyverse functions and we recommend using other
patterns if possible.
First, instead of using the defuse-and-inject pattern with ...
, you can simply pass
them on as you normally would. These two expressions are completely
equivalent:
my_group_by <- function(.data, ...) { .data %>% dplyr::group_by(!!!enquos(...)) } # This equivalent syntax is preferred my_group_by <- function(.data, ...) { .data %>% dplyr::group_by(...) }
Second, more complex applications such as transformation patterns can be solved with the across()
operation introduced in dplyr 1.0. Say you want to take the
mean()
of all expressions in ...
. Before across()
, you had to
defuse the ...
expressions, wrap them in a call to mean()
, and
inject them in summarise()
.
my_mean <- function(.data, ...) { # Defuse dots and auto-name them exprs <- enquos(..., .named = TRUE) # Wrap the expressions in a call to `mean()` exprs <- purrr::map(exprs, ~ call("mean", .x, na.rm = TRUE)) # Inject them .data %>% dplyr::summarise(!!!exprs) }
It is much easier to use across()
instead:
my_mean <- function(.data, ...) { .data %>% dplyr::summarise(across(c(...), ~ mean(.x, na.rm = TRUE))) }
Performance of injected dots and dynamic dots
Take this dynamic dots function:
n_args <- function(...) { length(list2(...)) }
Because it takes dynamic dots you can splice with !!!
out of the
box.
n_args(1, 2) #> [1] 2 n_args(!!!mtcars) #> [1] 11
Equivalently you could enable !!!
explicitly with inject()
.
inject(n_args(!!!mtcars)) #> [1] 11
While the result is the same, what is going on under the hood is
completely different. list2()
is a dots collector that
special-cases !!!
arguments. On the other hand, inject()
operates on the language and creates a function call containing as
many arguments as there are elements in the spliced list. If you
supply a list of size 1e6, inject()
is creating one million
arguments before evaluation. This can be much slower.
xs <- rep(list(1), 1e6) system.time( n_args(!!!xs) ) #> user system elapsed #> 0.009 0.000 0.009 system.time( inject(n_args(!!!xs)) ) #> user system elapsed #> 0.445 0.012 0.457
The same issue occurs when functions taking dynamic dots are called
inside a data-masking function like dplyr::mutate()
. The
mechanism that enables !!!
injection in these arguments is the
same as in inject()
.
See Also
Get properties of the current or caller frame
Description
These accessors retrieve properties of frames on the call stack. The prefix indicates for which frame a property should be accessed:
From the current frame with
current_
accessors.From a calling frame with
caller_
accessors.From a matching frame with
frame_
accessors.
The suffix indicates which property to retrieve:
-
_fn
accessors return the function running in the frame. -
_call
accessors return the defused call with which the function running in the frame was invoked. -
_env
accessors return the execution environment of the function running in the frame.
Usage
current_call()
current_fn()
current_env()
caller_call(n = 1)
caller_fn(n = 1)
caller_env(n = 1)
frame_call(frame = caller_env())
frame_fn(frame = caller_env())
Arguments
n |
The number of callers to go back. |
frame |
A frame environment of a currently running function,
as returned by |
See Also
caller_env()
and current_env()
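The following sketch is not part of the original manual; outer_fn() and inner_fn() are made-up names used to show how the prefixes combine with the suffixes:
outer_fn <- function() inner_fn()
inner_fn <- function() {
  list(
    call = current_call(),     # the call `inner_fn()`
    env = current_env(),       # execution environment of `inner_fn()`
    caller = caller_call()     # the call `outer_fn()`
  )
}
outer_fn()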
Call stack information
Description
Usage
ctxt_frame(n = 1)
global_frame()
Arguments
n |
The number of frames to go back in the stack. |
Create a string
Description
These base-type constructors allow more control over the creation of strings in R. They take character vectors or string-like objects (integerish or raw vectors), and optionally set the encoding. The string version checks that the input contains a scalar string.
Usage
string(x, encoding = NULL)
Arguments
x |
A character vector or a vector or list of string-like objects. |
encoding |
If non-null, set an encoding mark. This is only declarative, no encoding conversion is performed. |
Examples
# As everywhere in R, you can specify a string with Unicode
# escapes. The characters corresponding to Unicode codepoints will
# be encoded in UTF-8, and the string will be marked as UTF-8
# automatically:
cafe <- string("caf\uE9")
Encoding(cafe)
charToRaw(cafe)
# In addition, string() provides useful conversions to let
# programmers control how the string is represented in memory. For
# encodings other than UTF-8, you'll need to supply the bytes in
# hexadecimal form. If it is a latin1 encoding, you can mark the
# string explicitly:
cafe_latin1 <- string(c(0x63, 0x61, 0x66, 0xE9), "latin1")
Encoding(cafe_latin1)
charToRaw(cafe_latin1)
Dispatch on base types
Description
switch_type()
is equivalent to
switch(type_of(x), ...)
, while
switch_class()
switchpatches based on class(x)
. The coerce_
versions are intended for type conversion and provide a standard
error message when conversion fails.
Usage
switch_type(.x, ...)
coerce_type(.x, .to, ...)
switch_class(.x, ...)
coerce_class(.x, .to, ...)
Arguments
.x |
An object from which to dispatch. |
... |
Named clauses. The names should be types as returned by
|
.to |
This is useful when you switchpatch within a coercing
function. If supplied, this should be a string indicating the
target type. A catch-all clause is then added to signal an error
stating the conversion failure. This type is prettified unless
|
Examples
switch_type(3L,
double = "foo",
integer = "bar",
"default"
)
# Use the coerce_ version to get standardised error handling when no
# type matches:
to_chr <- function(x) {
coerce_type(x, "a chr",
integer = as.character(x),
double = as.character(x)
)
}
to_chr(3L)
# Strings have their own type:
switch_type("str",
character = "foo",
string = "bar",
"default"
)
# Use a fallthrough clause if you need to dispatch on all character
# vectors, including strings:
switch_type("str",
string = ,
character = "foo",
"default"
)
# special and builtin functions are treated as primitive, since
# there is usually no reason to treat them differently:
switch_type(base::list,
primitive = "foo",
"default"
)
switch_type(base::`$`,
primitive = "foo",
"default"
)
# closures are not primitives:
switch_type(rlang::switch_type,
primitive = "foo",
"default"
)
Create a symbol or list of symbols
Description
Symbols are a kind of defused expression that represent objects in environments.
-
sym()
andsyms()
take strings as input and turn them into symbols. -
data_sym()
anddata_syms()
create calls of the form.data$foo
instead of symbols. Subsetting the.data
pronoun is more robust when you expect a data-variable. See The data mask ambiguity.
Only tidy eval APIs support the .data
pronoun. With base R
functions, use simple symbols created with sym()
or syms()
.
Usage
sym(x)
syms(x)
data_sym(x)
data_syms(x)
Arguments
x |
For |
Value
For sym()
and syms()
, a symbol or list of symbols. For
data_sym()
and data_syms()
, calls of the form .data$foo
.
See Also
Examples
# Create a symbol
sym("cyl")
# Create a list of symbols
syms(c("cyl", "am"))
# Symbolised names refer to variables
eval(sym("cyl"), mtcars)
# Beware of scoping issues
Cyl <- "wrong"
eval(sym("Cyl"), mtcars)
# Data symbols are explicitly scoped in the data mask
try(eval_tidy(data_sym("Cyl"), mtcars))
# These can only be used with tidy eval functions
try(eval(data_sym("Cyl"), mtcars))
# The empty string returns the missing argument:
sym("")
# This way sym() and as_string() are inverse of each other:
as_string(missing_arg())
sym(as_string(missing_arg()))
Customising condition messages
Description
Various aspects of the condition messages displayed by abort()
, warn()
, and inform()
can be customised using options from the cli package.
Turning off unicode bullets
By default, bulleted lists are prefixed with unicode symbols:
rlang::abort(c( "The error message.", "*" = "Regular bullet.", "i" = "Informative bullet.", "x" = "Cross bullet.", "v" = "Victory bullet.", ">" = "Arrow bullet." )) #> Error: #> ! The error message. #> • Regular bullet. #> ℹ Informative bullet. #> ✖ Cross bullet. #> ✔ Victory bullet. #> → Arrow bullet.
Set this option to use simple letters instead:
options(cli.condition_unicode_bullets = FALSE) rlang::abort(c( "The error message.", "*" = "Regular bullet.", "i" = "Informative bullet.", "x" = "Cross bullet.", "v" = "Victory bullet.", ">" = "Arrow bullet." )) #> Error: #> ! The error message. #> * Regular bullet. #> i Informative bullet. #> x Cross bullet. #> v Victory bullet. #> > Arrow bullet.
Changing the bullet symbols
You can specify what symbol to use for each type of bullet through your cli user theme. For instance, here is how to uniformly use *
for all bullet kinds:
options(cli.user_theme = list( ".cli_rlang .bullet-*" = list(before = "* "), ".cli_rlang .bullet-i" = list(before = "* "), ".cli_rlang .bullet-x" = list(before = "* "), ".cli_rlang .bullet-v" = list(before = "* "), ".cli_rlang .bullet->" = list(before = "* ") )) rlang::abort(c( "The error message.", "*" = "Regular bullet.", "i" = "Informative bullet.", "x" = "Cross bullet.", "v" = "Victory bullet.", ">" = "Arrow bullet." )) #> Error: #> ! The error message. #> * Regular bullet. #> * Informative bullet. #> * Cross bullet. #> * Victory bullet. #> * Arrow bullet.
If you want all the bullets to be the same, including the leading bullet, you can achieve this using the bullet
class:
options(cli.user_theme = list( ".cli_rlang .bullet" = list(before = "* ") )) rlang::abort(c( "The error message.", "*" = "Regular bullet.", "i" = "Informative bullet.", "x" = "Cross bullet.", "v" = "Victory bullet.", ">" = "Arrow bullet." )) #> Error: #> * The error message. #> * Regular bullet. #> * Informative bullet. #> * Cross bullet. #> * Victory bullet. #> * Arrow bullet.
Changing the foreground and background colour of error calls
When called inside a function, abort()
displays the function call to help contextualise the error:
splash <- function() { abort("Can't splash without water.") } splash() #> Error in `splash()`: #> ! Can't splash without water.
The call is formatted with cli as a code
element. This is not visible in the manual, but code text is formatted with a highlighted background colour by default. When this can be reliably detected, that background colour is different depending on whether you're using a light or dark theme.
You can override the colour of code elements in your cli theme. Here is my personal configuration that fits well with the colour theme I currently use in my IDE:
options(cli.user_theme = list( span.code = list( "background-color" = "#3B4252", color = "#E5E9F0" ) ))
Formatting messages with cli
Description
Condition formatting is a set of operations applied to raw inputs for error messages that includes:
Transforming a character vector of lines to a width-wrapped list of error bullets. This makes it easy to write messages in a list format where each bullet conveys a single important point.
abort(c( "The error header", "*" = "An error bullet", "i" = "An info bullet", "x" = "A cross bullet" )) #> Error: #> ! The error header #> * An error bullet #> i An info bullet #> x A cross bullet
See the tidyverse error style guide for more about this style of error messaging.
Applying style (emphasis, boldness, ...) and colours to message elements.
While the rlang package embeds rudimentary formatting routines, the main formatting engine is implemented in the cli package.
Formatting messages with cli
By default, rlang uses an internal mechanism to format bullets. It is preferable to delegate formatting to the cli package by using cli::cli_abort()
, cli::cli_warn()
, and cli::cli_inform()
instead of the rlang versions. These wrappers enable cli formatting with sophisticated paragraph wrapping and bullet indenting that make long lines easier to read. In the following example, a long !
bullet is broken with an indented newline:
rlang::global_entrace(class = "errorr") #> Error in `rlang::global_entrace()`: #> ! `class` must be one of "error", "warning", or "message", #> not "errorr". #> i Did you mean "error"?
The cli wrappers also add many features such as interpolation, semantic formatting of text elements, and pluralisation:
inform_marbles <- function(n_marbles) { cli::cli_inform(c( "i" = "I have {n_marbles} shiny marble{?s} in my bag.", "v" = "Way to go {.code cli::cli_inform()}!" )) } inform_marbles(1) #> i I have 1 shiny marble in my bag. #> v Way to go `cli::cli_inform()`! inform_marbles(2) #> i I have 2 shiny marbles in my bag. #> v Way to go `cli::cli_inform()`!
Transitioning from abort()
to cli_abort()
If you plan to mass-rename calls from abort()
to cli::cli_abort()
, be careful if you assemble error messages from user inputs. If these individual pieces contain cli or glue syntax, this will result in hard-to-debug errors and possibly unexpected behaviour.
user_input <- "{base::stop('Wrong message.', call. = FALSE)}" cli::cli_abort(sprintf("Can't handle input `%s`.", user_input)) #> Error: #> ! ! Could not evaluate cli `{}` expression: `base::stop('Wrong...`. #> Caused by error: #> ! Wrong message.
To avoid this, protect your error messages by using cli to assemble the pieces:
user_input <- "{base::stop('Wrong message.', call. = FALSE)}" cli::cli_abort("Can't handle input {.code {user_input}}.") #> Error: #> ! Can't handle input `{base::stop('Wrong message.', call. = FALSE)}`.
Enabling cli formatting globally
To enable cli formatting for all abort()
calls in your namespace, call local_use_cli()
in the onLoad
hook of your package. Using on_load()
(make sure to call run_on_load()
in your hook):
on_load(local_use_cli())
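As a rough sketch, a package might combine these helpers in its zzz.R as follows (libname and pkgname are the usual .onLoad() parameters; this layout is an illustration, not a requirement):
# zzz.R (sketch)
on_load(local_use_cli())

.onLoad <- function(libname, pkgname) {
  run_on_load()
}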
Enabling cli formatting in abort()
is useful for:
Transitioning from
abort()
tocli::cli_abort()
progressively.Using
abort()
when you'd like to disable interpolation syntax.Creating error conditions with
error_cnd()
. These condition messages will be automatically formatted with cli as well.
What is data-masking and why do I need {{
?
Description
Data-masking is a distinctive feature of R whereby programming is performed directly on a data set, with columns defined as normal objects.
# Unmasked programming mean(mtcars$cyl + mtcars$am) #> [1] 6.59375 # Referring to columns is an error - Where is the data? mean(cyl + am) #> Error: #> ! object 'cyl' not found # Data-masking with(mtcars, mean(cyl + am)) #> [1] 6.59375
While data-masking makes it easy to program interactively with data frames, it makes it harder to create functions. Passing data-masked arguments to functions requires injection with the embracing operator {{
or, in more complex cases, the injection operator !!
.
Why does data-masking require embracing and injection?
Injection (also known as quasiquotation) is a metaprogramming feature that allows you to modify parts of a program. This is needed because under the hood data-masking works by defusing R code to prevent its immediate evaluation. The defused code is resumed later on in a context where data frame columns are defined.
Let's see what happens when we pass arguments to a data-masking function like summarise()
in the normal way:
my_mean <- function(data, var1, var2) { dplyr::summarise(data, mean(var1 + var2)) } my_mean(mtcars, cyl, am) #> Error in `dplyr::summarise()`: #> i In argument: `mean(var1 + var2)`. #> Caused by error: #> ! object 'cyl' not found
The problem here is that summarise()
defuses the R code it was supplied, i.e. mean(var1 + var2)
. Instead we want it to see mean(cyl + am)
. This is why we need injection: we need to modify that piece of code by injecting the code supplied to the function in place of var1
and var2
.
To inject a function argument in data-masked context, just embrace it with {{
:
my_mean <- function(data, var1, var2) { dplyr::summarise(data, mean({{ var1 }} + {{ var2 }})) } my_mean(mtcars, cyl, am) #> # A tibble: 1 x 1 #> `mean(cyl + am)` #> <dbl> #> 1 6.59
See Data mask programming patterns to learn more about creating functions around data-masking functions.
What does "masking" mean?
In normal R programming objects are defined in the current environment, for instance in the global environment or the environment of a function.
factor <- 1000 # Can now use `factor` in computations mean(mtcars$cyl * factor) #> [1] 6187.5
This environment also contains all functions currently in scope. In a script this includes the functions attached with library()
calls; in a package, the functions imported from other packages. If evaluation was performed only in the data frame, we'd lose track of these objects and functions necessary to perform computations.
To keep these objects and functions in scope, the data frame is inserted at the bottom of the current chain of environments. It comes first and has precedence over the user environment. In other words, it masks the user environment.
Since masking blends the data and the user environment by giving priority to the former, R can sometimes use a data frame column when you really intended to use a local object.
# Defining an env-variable cyl <- 1000 # Referring to a data-variable dplyr::summarise(mtcars, mean(cyl)) #> # A tibble: 1 x 1 #> `mean(cyl)` #> <dbl> #> 1 6.19
The tidy eval framework provides pronouns to help disambiguate between the mask and user contexts. It is often a good idea to use these pronouns in production code.
cyl <- 1000 mtcars %>% dplyr::summarise( mean_data = mean(.data$cyl), mean_env = mean(.env$cyl) ) #> # A tibble: 1 x 2 #> mean_data mean_env #> <dbl> <dbl> #> 1 6.19 1000
Read more about this in The data mask ambiguity.
How does data-masking work?
Data-masking relies on three language features:
-
Argument defusal with
substitute()
(base R) orenquo()
,enquos()
, and{{
(rlang). R code is defused so it can be evaluated later on in a special environment enriched with data. First class environments. Environments are a special type of list-like object in which defused R code can be evaluated. The named elements in an environment define objects. Lists and data frames can be transformed to environments:
as.environment(mtcars) #> <environment: 0x7febb17e3468>
Explicit evaluation with
eval()
(base) oreval_tidy()
(rlang). When R code is defused, evaluation is interrupted. It can be resumed later on witheval()
:expr(1 + 1) #> 1 + 1 eval(expr(1 + 1)) #> [1] 2
By default
eval()
andeval_tidy()
evaluate in the current environment.code <- expr(mean(cyl + am)) eval(code) #> Error: #> ! object 'am' not found
You can supply an optional list or data frame that will be converted to an environment.
eval(code, mtcars) #> [1] 6.59375
Evaluation of defused code then occurs in the context of a data mask.
History
The tidyverse embraced the data-masking approach in packages like ggplot2 and dplyr and eventually developed its own programming framework in the rlang package. None of this would have been possible without the following landmark developments from S and R authors.
The S language introduced data scopes with
attach()
(Becker, Chambers and Wilks, The New S Language, 1988). The S language introduced data-masked formulas in modelling functions (Chambers and Hastie, 1993).
Peter Dalgaard (R team) wrote the frametools package in 1997. It was later included in R as
base::transform()
and base::subset()
. This API is an important source of inspiration for the dplyr package. It was also the first appearance of selections, a variant of data-masking extended and codified later on in the tidyselect package. In 2000 Luke Tierney (R team) changed formulas to keep track of their original environments. This change, published in R 1.1.0, was a crucial step towards hygienic data masking, i.e. the proper resolution of symbols in their original environments. Quosures were inspired by the environment-tracking mechanism of formulas.
Luke introduced
base::with()
in 2001. In 2006 the data.table package included data-masking and selections in the
i
and j
arguments of the [
method of a data frame. The dplyr package was published in 2014.
The rlang package developed tidy eval in 2017 as the data-masking framework of the tidyverse. It introduced the notions of quosure, implicit injection with
!!
and !!!
, and data pronouns. In 2019, injection with
{{
was introduced in rlang 0.4.0 to simplify the defuse-and-inject pattern. This operator allows R programmers to transport data-masked arguments across functions more intuitively and with minimal boilerplate.
See also
The data mask ambiguity
Description
Data masking is an R feature that blends programming variables that live inside environments (env-variables) with statistical variables stored in data frames (data-variables). This mixture makes it easy to refer to data frame columns as well as objects defined in the current environment.
x <- 100 mtcars %>% dplyr::summarise(mean(disp / x)) #> # A tibble: 1 x 1 #> `mean(disp/x)` #> <dbl> #> 1 2.31
However this convenience introduces an ambiguity between data-variables and env-variables which might cause collisions.
Column collisions
In the following snippet, are we referring to the env-variable x
or to the data-variable of the same name?
df <- data.frame(x = NA, y = 2) x <- 100 df %>% dplyr::mutate(y = y / x) #> x y #> 1 NA NA
A column collision occurs when you want to use an object defined outside of the data frame, but a column of the same name happens to exist.
Object collisions
The opposite problem occurs when there is a typo in a data-variable name and an env-variable of the same name exists:
df <- data.frame(foo = "right") ffo <- "wrong" df %>% dplyr::mutate(foo = toupper(ffo)) #> foo #> 1 WRONG
Instead of a typo, it might also be that you were expecting a column in the data frame which is unexpectedly missing. In both cases, if a variable can't be found in the data mask, R looks for variables in the surrounding environment. This isn't what we intended here and it would have been better to fail early with a "Column not found" error.
Preventing collisions
In casual scripts or interactive programming, data mask ambiguity is not a huge deal compared to the payoff of iterating quickly while developing your analysis. However in production code and in package functions, the ambiguity might cause collision bugs in the long run.
Fortunately it is easy to be explicit about the scoping of variables with a little more verbose code. This topic lists the solutions and workarounds that have been created to solve ambiguity issues in data masks.
The .data
and .env
pronouns
The simplest solution is to use the .data
and .env
pronouns to disambiguate between data-variables and env-variables.
df <- data.frame(x = 1, y = 2) x <- 100 df %>% dplyr::mutate(y = .data$y / .env$x) #> x y #> 1 1 0.02
This is especially useful in functions because the data frame is not known in advance and may contain masking columns for any of the env-variables in scope in the function:
my_rescale <- function(data, var, factor = 10) { data %>% dplyr::mutate("{{ var }}" := {{ var }} / factor) } # This works data.frame(value = 1) %>% my_rescale(value) #> value #> 1 0.1 # Oh no! data.frame(factor = 0, value = 1) %>% my_rescale(value) #> factor value #> 1 0 Inf
Subsetting function arguments with .env
ensures we never hit a masking column:
my_rescale <- function(data, var, factor = 10) { data %>% dplyr::mutate("{{ var }}" := {{ var }} / .env$factor) } # Yay! data.frame(factor = 0, value = 1) %>% my_rescale(value) #> factor value #> 1 0 0.1
Subsetting .data
with env-variables
The .data
pronoun may be used as a name-to-data-mask pattern (see Data mask programming patterns):
var <- "cyl" mtcars %>% dplyr::summarise(mean = mean(.data[[var]])) #> # A tibble: 1 x 1 #> mean #> <dbl> #> 1 6.19
In this example, the env-variable var
is used inside the data mask to subset the .data
pronoun. Does this mean that var
is at risk of a column collision if the input data frame contains a column of the same name? Fortunately not:
var <- "cyl" mtcars2 <- mtcars mtcars2$var <- "wrong" mtcars2 %>% dplyr::summarise(mean = mean(.data[[var]])) #> # A tibble: 1 x 1 #> mean #> <dbl> #> 1 6.19
The evaluation of .data[[var]]
is set up in such a way that there is no ambiguity. The .data
pronoun can only be subsetted with env-variables, not data-variables. Technically, this is because [[
behaves like an injection operator when applied to .data
. It is evaluated very early before the data mask is even created. See the !!
section below.
Injecting env-variables with !!
Injection operators such as !!
have interesting properties regarding the ambiguity problem. They modify a piece of code early on by injecting objects or other expressions before any data-masking logic comes into play. If you inject the value of a variable, it becomes inlined in the expression. R no longer needs to look up any variable to find the value.
Taking the earlier division example, let's use !!
to inject the value of the env-variable x
inside the division expression:
df <- data.frame(x = NA, y = 2) x <- 100 df %>% dplyr::mutate(y = y / !!x) #> x y #> 1 NA 0.02
While injection solves issues of ambiguity, it is a bit heavy handed compared to using the .env
pronoun. Big objects inlined in expressions might cause issues in unexpected places, for instance they might make the calls in a traceback()
less readable.
No ambiguity in tidy selections
Tidy selection is a dialect of R that optimises column selection in tidyverse packages. Examples of functions that use tidy selections are dplyr::select()
and tidyr::pivot_longer()
.
Unlike data masking, tidy selections do not suffer from ambiguity. The selection language is designed in such a way that evaluation of expressions is either scoped in the data mask only, or in the environment only. Take this example:
mtcars %>% dplyr::select(gear:ncol(mtcars))
gear
is a symbol supplied to a selection operator :
and thus scoped in the data mask only. Any other kind of expression, such as ncol(mtcars)
, is evaluated as normal R code outside of any data context. This is why there is no column collision here:
data <- data.frame(x = 1, data = 1:3) data %>% dplyr::select(data:ncol(data)) #> data #> 1 1 #> 2 2 #> 3 3
It is useful to introduce two new terms. Tidy selections distinguish data-expressions and env-expressions:
-
data
is a data-expression that refers to the data-variable. -
ncol(data)
is an env-expression that refers to the env-variable.
To learn more about the difference between the two kinds of expressions, see the technical description of the tidy selection syntax.
Names pattern with all_of()
all_of()
is often used in functions as a programming pattern that connects column names to a data mask, similarly to the .data
pronoun. A simple example is:
my_group_by <- function(data, vars) { data %>% dplyr::group_by(across(all_of(vars))) }
If tidy selections were affected by the data mask ambiguity, this function would be at risk of a column collision. It would break as soon as the user supplies a data frame containing a vars
column. However, all_of()
is an env-expression that is evaluated outside of the data mask, so there is no possibility of collisions.
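As a small sketch of why this is safe (assuming dplyr and the my_group_by() helper defined above; the df object is made up), a column that happens to be named vars plays no role:
df <- data.frame(vars = "not me", x = 1)
df %>% my_group_by("x")   # groups by the `x` column; the `vars` column is ignored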
Data mask programming patterns
Description
Data-masking functions require special programming patterns when used inside other functions. In this topic we'll review and compare the different patterns that can be used to solve specific problems.
If you are a beginner, you might want to start with one of these tutorials:
If you'd like to go further and learn about defusing and injecting expressions, read the metaprogramming patterns topic.
Choosing a pattern
Two main considerations determine which programming pattern you need to wrap a data-masking function:
What behaviour does the wrapped function implement?
What behaviour should your function implement?
Depending on the answers to these questions, you can choose between these approaches:
The forwarding patterns with which your function inherits the behaviour of the function it interfaces with.
The name patterns with which your function takes strings or character vectors of column names.
The bridge patterns with which you change the behaviour of an argument instead of inheriting it.
You will also need to use different solutions for single named arguments than for multiple arguments in ...
.
Argument behaviours
In a regular function, arguments can be defined in terms of a type of objects that they accept. An argument might accept a character vector, a data frame, a single logical value, etc. Data-masked arguments are more complex. Not only do they generally accept a specific type of objects (for instance dplyr::mutate()
accepts vectors), they exhibit special computational behaviours.
Data-masked expressions (base): E.g.
transform()
,with()
. Expressions may refer to the columns of the supplied data frame.Data-masked expressions (tidy eval): E.g.
dplyr::mutate()
,ggplot2::aes()
. Same as base data-masking but with tidy eval features enabled. This includes injection operators such as{{
and!!
and the.data
and.env
pronouns.Data-masked symbols: Same as data-masked arguments but the supplied expressions must be simple column names. This often simplifies things, for instance this is an easy way of avoiding issues of double evaluation.
-
Tidy selections: E.g.
dplyr::select()
,tidyr::pivot_longer()
. This is an alternative to data masking that supports selection helpers likestarts_with()
orall_of()
, and implements special behaviour for operators likec()
,|
and&
.Unlike data masking, tidy selection is an interpreted dialect. There is in fact no masking at all. Expressions are either interpreted in the context of the data frame (e.g.
c(cyl, am)
which stands for the union of the columnscyl
andam
), or evaluated in the user environment (e.g.all_of()
,starts_with()
, and any other expressions). This has implications for inheritance of argument behaviour as we will see below. -
Dynamic dots: These may be data-masked arguments, tidy selections, or just regular arguments. Dynamic dots support injection of multiple arguments with the
!!!
operator as well as name injection with glue operators.
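For instance, a minimal sketch of glue-style name injection in dynamic dots (nm is a made-up variable for the illustration):
nm <- "foo"
list2("{nm}" := 1)
#> $foo
#> [1] 1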
To let users know about the capabilities of your function arguments, document them with the following tags, depending on which set of semantics they inherit from:
@param foo <[`data-masked`][dplyr::dplyr_data_masking]> What `foo` does. @param bar <[`tidy-select`][dplyr::dplyr_tidy_select]> What `bar` does. @param ... <[`dynamic-dots`][rlang::dyn-dots]> What these dots do.
Forwarding patterns
With the forwarding patterns, arguments inherit the behaviour of the data-masked arguments to which they are passed.
Embrace with {{
The embrace operator {{
is a forwarding syntax for single arguments. You can forward an argument in data-masked context:
my_summarise <- function(data, var) { data %>% dplyr::summarise({{ var }}) }
Or in tidy selections:
my_pivot_longer <- function(data, var) { data %>% tidyr::pivot_longer(cols = {{ var }}) }
The function automatically inherits the behaviour of the surrounding context. For instance arguments forwarded to a data-masked context may refer to columns or use the .data
pronoun:
mtcars %>% my_summarise(mean(cyl)) x <- "cyl" mtcars %>% my_summarise(mean(.data[[x]]))
And arguments forwarded to a tidy selection may use all tidyselect features:
mtcars %>% my_pivot_longer(cyl) mtcars %>% my_pivot_longer(vs:gear) mtcars %>% my_pivot_longer(starts_with("c")) x <- c("cyl", "am") mtcars %>% my_pivot_longer(all_of(x))
Forward ...
Simple forwarding of ...
arguments does not require any special syntax since dots are already a forwarding syntax. Just pass them to another function like you normally would. This works with data-masked arguments:
my_group_by <- function(.data, ...) { .data %>% dplyr::group_by(...) } mtcars %>% my_group_by(cyl = cyl * 100, am)
As well as tidy selections:
my_select <- function(.data, ...) { .data %>% dplyr::select(...) } mtcars %>% my_select(starts_with("c"), vs:carb)
Some functions take a tidy selection in a single named argument. In that case, pass the ...
inside c()
:
my_pivot_longer <- function(.data, ...) { .data %>% tidyr::pivot_longer(c(...)) } mtcars %>% my_pivot_longer(starts_with("c"), vs:carb)
Inside a tidy selection, c()
is not a vector concatenator but a selection combinator. This makes it handy to interface between functions that take ...
and functions that take a single argument.
Names patterns
With the names patterns you refer to columns by name with strings or character vectors stored in env-variables. Whereas the forwarding patterns are exclusively used within a function to pass arguments, the names patterns can be used anywhere.
In a script, you can loop over a character vector with
for
or lapply()
and use the.data
pattern to connect a name to its data-variable. A vector can also be supplied all at once to the tidy select helperall_of()
. In a function, using the names patterns on function arguments lets users supply regular data-variable names without any of the complications that come with data-masking.
Subsetting the .data
pronoun
The .data
pronoun is a tidy eval feature that is enabled in all data-masked arguments, just like {{
. The pronoun represents the data mask and can be subsetted with [[
and $
. These three statements are equivalent:
mtcars %>% dplyr::summarise(mean = mean(cyl)) mtcars %>% dplyr::summarise(mean = mean(.data$cyl)) var <- "cyl" mtcars %>% dplyr::summarise(mean = mean(.data[[var]]))
The .data
pronoun can be subsetted in loops:
vars <- c("cyl", "am") for (var in vars) print(dplyr::summarise(mtcars, mean = mean(.data[[var]]))) #> # A tibble: 1 x 1 #> mean #> <dbl> #> 1 6.19 #> # A tibble: 1 x 1 #> mean #> <dbl> #> 1 0.406 purrr::map(vars, ~ dplyr::summarise(mtcars, mean = mean(.data[[.x]]))) #> [[1]] #> # A tibble: 1 x 1 #> mean #> <dbl> #> 1 6.19 #> #> [[2]] #> # A tibble: 1 x 1 #> mean #> <dbl> #> 1 0.406
And it can be used to connect function arguments to a data-variable:
my_mean <- function(data, var) { data %>% dplyr::summarise(mean = mean(.data[[var]])) } my_mean(mtcars, "cyl") #> # A tibble: 1 x 1 #> mean #> <dbl> #> 1 6.19
With this implementation, my_mean()
is completely insulated from data-masking behaviour and is called like an ordinary function.
# No masking am <- "cyl" my_mean(mtcars, am) #> # A tibble: 1 x 1 #> mean #> <dbl> #> 1 6.19 # Programmable my_mean(mtcars, tolower("CYL")) #> # A tibble: 1 x 1 #> mean #> <dbl> #> 1 6.19
Character vector of names
The .data
pronoun can only be subsetted with single column names. It doesn't support single-bracket indexing:
mtcars %>% dplyr::summarise(.data[c("cyl", "am")]) #> Error in `dplyr::summarise()`: #> i In argument: `.data[c("cyl", "am")]`. #> Caused by error in `.data[c("cyl", "am")]`: #> ! `[` is not supported by the `.data` pronoun, use `[[` or $ instead.
There is no plural variant of .data
built into tidy eval. Instead, we'll use the all_of()
operator available in tidy selections to supply character vectors. This is straightforward in functions that take tidy selections, like tidyr::pivot_longer()
:
vars <- c("cyl", "am") mtcars %>% tidyr::pivot_longer(all_of(vars)) #> # A tibble: 64 x 11 #> mpg disp hp drat wt qsec vs gear carb name value #> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <chr> <dbl> #> 1 21 160 110 3.9 2.62 16.5 0 4 4 cyl 6 #> 2 21 160 110 3.9 2.62 16.5 0 4 4 am 1 #> 3 21 160 110 3.9 2.88 17.0 0 4 4 cyl 6 #> 4 21 160 110 3.9 2.88 17.0 0 4 4 am 1 #> # i 60 more rows
If the function does not take a tidy selection, it might be possible to use a bridge pattern. This option is presented in the bridge section below. If a bridge is impossible or inconvenient, a little metaprogramming with the symbolise-and-inject pattern can help.
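A rough sketch of symbolise-and-inject (assuming dplyr; my_group_by() is a made-up wrapper): turn the names into symbols with syms(), then splice them into the data-masked verb with !!!:
my_group_by <- function(data, vars) {
  data %>% dplyr::group_by(!!!syms(vars))
}
mtcars %>% my_group_by(c("cyl", "am"))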
Bridge patterns
Sometimes the function you are calling does not implement the behaviour you would like to give to the arguments of your function. Working around this may require a little thought, since there is no systematic way of turning one behaviour into another. The general technique consists of forwarding the arguments inside a context that implements the behaviour you want, then finding a way to bridge the result to the target verb or function.
across()
as a selection to data-mask bridge
dplyr 1.0 added support for tidy selections in all verbs via across()
. This function is normally used for mapping over columns but can also be used to perform a simple selection. For instance, if you'd like to pass an argument to group_by()
with a tidy-selection interface instead of a data-masked one, use across()
as a bridge:
my_group_by <- function(data, var) { data %>% dplyr::group_by(across({{ var }})) } mtcars %>% my_group_by(starts_with("c"))
Since across()
takes selections in a single argument (unlike select()
which takes multiple arguments), you can't directly pass ...
. Instead, take them within c()
, which is the tidyselect way of supplying multiple selections within a single argument:
my_group_by <- function(.data, ...) { .data %>% dplyr::group_by(across(c(...))) } mtcars %>% my_group_by(starts_with("c"), vs:gear)
across(all_of())
as a names to data mask bridge
If instead of forwarding variables in across()
you pass them to all_of()
, you create a names to data mask bridge.
my_group_by <- function(data, vars) { data %>% dplyr::group_by(across(all_of(vars))) } mtcars %>% my_group_by(c("cyl", "am"))
Use this bridge technique to connect vectors of names to a data-masked context.
transmute()
as a data-mask to selection bridge
Passing data-masked arguments to a tidy selection is a little trickier and requires a three-step process.
my_pivot_longer <- function(data, ...) { # Forward `...` in data-mask context with `transmute()` # and save the inputs names inputs <- dplyr::transmute(data, ...) names <- names(inputs) # Update the data with the inputs data <- dplyr::mutate(data, !!!inputs) # Select the inputs by name with `all_of()` tidyr::pivot_longer(data, cols = all_of(names)) } mtcars %>% my_pivot_longer(cyl, am = am * 100)
In a first step we pass the
...
expressions totransmute()
. Unlikemutate()
, it creates a new data frame from the user inputs. The only goal of this step is to inspect the names in...
, including the default names created for unnamed arguments.Once we have the names, we inject the arguments into
mutate()
to update the data frame.Finally, we pass the names to the tidy selection via
all_of()
.
Transformation patterns
Named inputs versus ...
In the case of a named argument, transformation is easy. We simply surround the embraced input in R code. For instance, the my_summarise()
function is not exactly useful compared to just calling summarise()
:
my_summarise <- function(data, var) { data %>% dplyr::summarise({{ var }}) }
We can make it more useful by adding code around the variable:
my_mean <- function(data, var) { data %>% dplyr::summarise(mean = mean({{ var }}, na.rm = TRUE)) }
For inputs in ...
however, this technique does not work. We would need some kind of templating syntax for dots that lets us specify R code with a placeholder for the dots elements. This isn't built into tidy eval, but you can use operators like dplyr::across()
, dplyr::if_all()
, or dplyr::if_any()
. When that isn't possible, you can template the expression manually.
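As a sketch of manual templating (assuming dplyr and purrr; my_mean() is a made-up wrapper mirroring the defuse-and-inject example shown earlier): defuse the dots, wrap each expression in a call to mean(), and splice the result back in:
my_mean <- function(.data, ...) {
  # Defuse and auto-name the dots
  exprs <- enquos(..., .named = TRUE)
  # Wrap each defused input in a call to `mean()`
  exprs <- purrr::map(exprs, ~ call("mean", .x, na.rm = TRUE))
  # Splice the templated expressions into summarise()
  .data %>% dplyr::summarise(!!!exprs)
}
mtcars %>% my_mean(cyl, carb)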
Transforming inputs with across()
The across()
operation in dplyr is a convenient way of mapping an expression across a set of inputs. We will create a variant of my_mean()
that computes the mean()
of all arguments supplied in ...
. The easiest way is to forward the dots to across()
(which causes ...
to inherit its tidy selection behaviour):
my_mean <- function(data, ...) { data %>% dplyr::summarise(across(c(...), ~ mean(.x, na.rm = TRUE))) } mtcars %>% my_mean(cyl, carb) #> # A tibble: 1 x 2 #> cyl carb #> <dbl> <dbl> #> 1 6.19 2.81 mtcars %>% my_mean(foo = cyl, bar = carb) #> # A tibble: 1 x 2 #> foo bar #> <dbl> <dbl> #> 1 6.19 2.81 mtcars %>% my_mean(starts_with("c"), mpg:disp) #> # A tibble: 1 x 4 #> cyl carb mpg disp #> <dbl> <dbl> <dbl> <dbl> #> 1 6.19 2.81 20.1 231.
Transforming inputs with if_all()
and if_any()
dplyr::filter()
requires a different operation than across()
because it needs to combine the logical expressions with &
or |
. To solve this problem dplyr introduced the if_all()
and if_any()
variants of across()
.
In the following example, we filter all rows for which a set of variables are not equal to their minimum value:
filter_non_baseline <- function(.data, ...) { .data %>% dplyr::filter(if_all(c(...), ~ .x != min(.x, na.rm = TRUE))) } mtcars %>% filter_non_baseline(vs, am, gear)
Defusing R expressions
Description
When a piece of R code is defused, R doesn't return its value like it normally would. Instead it returns the expression in a special tree-like object that describes how to compute a value. These defused expressions can be thought of as blueprints or recipes for computing values.
Using expr()
we can observe the difference between computing an expression and defusing it:
# Return the result of `1 + 1` 1 + 1 #> [1] 2 # Return the expression `1 + 1` expr(1 + 1) #> 1 + 1
Evaluation of a defused expression can be resumed at any time with eval()
(see also eval_tidy()
).
# Return the expression `1 + 1` e <- expr(1 + 1) # Return the result of `1 + 1` eval(e) #> [1] 2
The most common use case for defusing an expression is to resume its evaluation in a data mask. This makes it possible for the expression to refer to columns of a data frame as if they were regular objects.
e <- expr(mean(cyl)) eval(e, mtcars) #> [1] 6.1875
Do I need to know about defused expressions?
As a tidyverse user you will rarely need to defuse expressions manually with expr()
, and even more rarely need to resume evaluation with eval()
or eval_tidy()
. Instead, you call data-masking functions which take care of defusing your arguments and resuming them in the context of a data mask.
mtcars %>% dplyr::summarise( mean(cyl) # This is defused and data-masked ) #> # A tibble: 1 x 1 #> `mean(cyl)` #> <dbl> #> 1 6.19
It is important to know that a function defuses its arguments because it requires slightly different methods when called from a function. The main thing is that arguments must be transported with the embrace operator {{
. It allows the data-masking function to defuse the correct expression.
my_mean <- function(data, var) { dplyr::summarise(data, mean = mean({{ var }})) }
Read more about this in:
The booby trap analogy
The term "defusing" comes from an analogy to the evaluation model in R. As you may know, R uses lazy evaluation, which means that arguments are only evaluated when they are needed for a computation. Let's take two functions, ignore()
which doesn't do anything with its argument, and force()
which returns it:
ignore <- function(arg) NULL force <- function(arg) arg ignore(warning("boom")) #> NULL force(warning("boom")) #> Warning in force(warning("boom")): boom
A warning is only emitted when the function actually triggers evaluation of its argument. Evaluation of arguments can be chained by passing them to other functions. If one of the functions ignores its argument, it breaks the chain of evaluation.
f <- function(x) g(x) g <- function(y) h(y) h <- function(z) ignore(z) f(warning("boom")) #> NULL
In a way, arguments are like booby traps which explode (evaluate) when touched. Defusing an argument can be seen as defusing the booby trap.
expr(force(warning("boom"))) #> force(warning("boom"))
Types of defused expressions
-
Calls, like
f(1, 2, 3)
or1 + 1
represent the action of calling a function to compute a new value, such as a vector. -
Symbols, like
x
ordf
, represent named objects. When the object pointed to by the symbol was defined in a function or in the global environment, we call it an environment-variable. When the object is a column in a data frame, we call it a data-variable. -
Constants, like
1
orNULL
.
You can create new call or symbol objects by using the defusing function expr()
:
# Create a symbol representing objects called `foo` expr(foo) #> foo # Create a call representing the computation of the mean of `foo` expr(mean(foo, na.rm = TRUE)) #> mean(foo, na.rm = TRUE) # Return a constant expr(1) #> [1] 1 expr(NULL) #> NULL
Defusing is not the only way to create defused expressions. You can also assemble them from data:
# Assemble a symbol from a string var <- "foo" sym(var) # Assemble a call from strings, symbols, and constants call("mean", sym(var), na.rm = TRUE)
Local expressions versus function arguments
There are two main ways to defuse expressions, which correspond to two rlang functions, expr()
and enquo()
:
You can defuse your own R expressions with
expr()
.You can defuse the expressions supplied by the user of your function with the
en
-prefixed operators, such asenquo()
andenquos()
. These operators defuse function arguments.
Defuse and inject
One purpose for defusing evaluation of an expression is to interface with data-masking functions by injecting the expression back into another function with !!
. This is the defuse-and-inject pattern.
my_summarise <- function(data, arg) { # Defuse the user expression in `arg` arg <- enquo(arg) # Inject the expression contained in `arg` # inside a `summarise()` argument data |> dplyr::summarise(mean = mean(!!arg, na.rm = TRUE)) }
Defuse-and-inject is usually performed in a single step with the embrace operator {{
.
my_summarise <- function(data, arg) { # Defuse and inject in a single step with the embracing operator data |> dplyr::summarise(mean = mean({{ arg }}, na.rm = TRUE)) }
Using enquo()
and !!
separately is useful in more complex cases where you need access to the defused expression instead of just passing it on.
Defused arguments and quosures
If you inspect the return values of expr()
and enquo()
, you'll notice that the latter doesn't return a raw expression like the former. Instead it returns a quosure, a wrapper containing an expression and an environment.
expr(1 + 1) #> 1 + 1 my_function <- function(arg) enquo(arg) my_function(1 + 1) #> <quosure> #> expr: ^1 + 1 #> env: global
R needs information about the environment to properly evaluate argument expressions because they come from a different context than the current function. For instance when a function in your package calls dplyr::mutate()
, the quosure environment indicates where all the private functions of your package are defined.
Read more about the role of quosures in What are quosures and when are they needed?.
Comparison with base R
Defusing is known as quoting in other frameworks.
The equivalent of
expr()
isbase::bquote()
.The equivalent of
enquo()
isbase::substitute()
. The latter returns a naked expression instead of a quosure.There is no equivalent for
enquos(...)
but you can defuse dots as a list of naked expressions witheval(substitute(alist(...)))
.
The double evaluation problem
Description
One inherent risk to metaprogramming is to evaluate multiple times a piece of code that appears to be evaluated only once. Take this data-masking function which takes a single input and produces two summaries:
summarise_stats <- function(data, var) { data %>% dplyr::summarise( mean = mean({{ var }}), sd = sd({{ var }}) ) } summarise_stats(mtcars, cyl) #> # A tibble: 1 x 2 #> mean sd #> <dbl> <dbl> #> 1 6.19 1.79
This function is perfectly fine if the user supplies simple column names. However, data-masked arguments may also include computations.
summarise_stats(mtcars, cyl * 100) #> # A tibble: 1 x 2 #> mean sd #> <dbl> <dbl> #> 1 619. 179.
Computations may be slow and may produce side effects. For these reasons, they should only be performed as many times as they appear in the code (unless explicitly documented, e.g. once per group with grouped data frames). Let's try again with a more complex computation:
times100 <- function(x) { message("Takes a long time...") Sys.sleep(0.1) message("And causes side effects such as messages!") x * 100 } summarise_stats(mtcars, times100(cyl)) #> Takes a long time... #> And causes side effects such as messages! #> Takes a long time... #> And causes side effects such as messages! #> # A tibble: 1 x 2 #> mean sd #> <dbl> <dbl> #> 1 619. 179.
Because of the side effects and the long running time, it is clear that summarise_stats()
evaluates its input twice. This is because we've injected a defused expression in two different places. The data-masked expression created down the line looks like this (with caret signs representing quosure boundaries):
dplyr::summarise( mean = ^mean(^times100(cyl)), sd = ^sd(^times100(cyl)) )
The times100(cyl)
expression is evaluated twice, even though it only appears once in the code. We have a double evaluation bug.
One simple way to fix it is to assign the defused input to a constant. You can then refer to that constant in the remainder of the code.
summarise_stats <- function(data, var) { data %>% dplyr::transmute( var = {{ var }}, ) %>% dplyr::summarise( mean = mean(var), sd = sd(var) ) }
The defused input is now evaluated only once because it is injected only once:
summarise_stats(mtcars, times100(cyl)) #> Takes a long time... #> And causes side effects such as messages! #> # A tibble: 1 x 2 #> mean sd #> <dbl> <dbl> #> 1 619. 179.
What about glue strings?
{{
embracing in glue strings doesn't suffer from the double evaluation problem:
summarise_stats <- function(data, var) { data %>% dplyr::transmute( var = {{ var }}, ) %>% dplyr::summarise( "mean_{{ var }}" := mean(var), "sd_{{ var }}" := sd(var) ) } summarise_stats(mtcars, times100(cyl)) #> Takes a long time... #> And causes side effects such as messages! #> # A tibble: 1 x 2 #> `mean_times100(cyl)` `sd_times100(cyl)` #> <dbl> <dbl> #> 1 619. 179.
Since a glue string doesn't need the result of an expression, only the original code converted (deparsed) to a string, it doesn't evaluate injected expressions.
Why are strings and other constants enquosed in the empty environment?
Description
Function arguments are defused into quosures that keep track of the environment of the defused expression.
quo(1 + 1) #> <quosure> #> expr: ^1 + 1 #> env: global
You might have noticed that when constants are supplied, the quosure tracks the empty environment instead of the current environment.
quos("foo", 1, NULL) #> <list_of<quosure>> #> #> [[1]] #> <quosure> #> expr: ^"foo" #> env: empty #> #> [[2]] #> <quosure> #> expr: ^1 #> env: empty #> #> [[3]] #> <quosure> #> expr: ^NULL #> env: empty
The reason for this has to do with the compilation of R code, which makes it impossible to consistently capture the environments of constants from function arguments. Argument defusing relies on the promise mechanism of R for lazy evaluation of arguments. When functions are compiled and R notices that an argument is constant, it avoids creating a promise since promises slow down function evaluation. Instead, the function is directly supplied the naked constant rather than a constant wrapped in a promise.
Concrete case of promise unwrapping by compilation
We can observe this optimisation by calling into the C-level findVar()
function to capture promises.
# Return the object bound to `arg` without triggering evaluation of # promises f <- function(arg) { rlang:::find_var(current_env(), sym("arg")) } # Call `f()` with a symbol or with a constant g <- function(symbolic) { if (symbolic) { f(letters) } else { f("foo") } } # Make sure these small functions are compiled f <- compiler::cmpfun(f) g <- compiler::cmpfun(g)
When f()
is called with a symbolic argument, we get the promise object created by R.
g(symbolic = TRUE) #> <promise: 0x7ffd79bac130>
However, supplying a constant to "f"
returns the constant directly.
g(symbolic = FALSE) #> [1] "foo"
Without a promise, there is no way to figure out the original environment of an argument.
Do we need environments for constants?
Data-masking APIs in the tidyverse are intentionally designed so that they don't need an environment for constants.
Data-masking APIs should be able to interpret constants. These can arise from normal argument passing as we have seen, or by injection with
!!
. There should be no difference between dplyr::mutate(mtcars, var = cyl)
and dplyr::mutate(mtcars, var = !!mtcars$cyl)
. Data-masking is an evaluation idiom, not an introspective one. The behaviour of a data-masking function should not depend on the calling environment when a constant (or a symbol evaluating to a given value) is supplied.
Does {{
work on regular objects?
Description
The embrace operator {{
should be used exclusively with function arguments:
fn <- function(arg) { quo(foo({{ arg }})) } fn(1 + 1) #> <quosure> #> expr: ^foo(^1 + 1) #> env: 0x7ffd89aac518
However you may have noticed that it also works on regular objects:
fn <- function(arg) { arg <- force(arg) quo(foo({{ arg }})) } fn(1 + 1) #> <quosure> #> expr: ^foo(^2) #> env: 0x7ffd8a633398
In that case, {{
captures the value of the expression instead of a defused expression. That's because only function arguments can be defused.
Note that this issue also applies to enquo()
(on which {{
is based).
Why is this not an error?
Ideally we would have made {{
on regular objects an error. However this is not possible because in compiled R code it is not always possible to distinguish a regular variable from a function argument. See Why are strings and other constants enquosed in the empty environment? for more about this.
Including function calls in error messages
Description
Starting with rlang 1.0, abort()
includes the erroring function in the message by default:
my_function <- function() { abort("Can't do that.") } my_function() #> Error in `my_function()`: #> ! Can't do that.
This works well when abort()
is called directly within the failing function. However, when the abort()
call is exported to another function (which we call an "error helper"), we need to be explicit about which function abort()
is throwing an error for.
Passing the user context
There are two main kinds of error helpers:
Simple
abort()
wrappers. These often aim at adding classes and attributes to an error condition in a structured way:stop_my_class <- function(message) { abort(message, class = "my_class") }
Input checking functions. An input checker is typically passed an input and an argument name. It throws an error if the input doesn't conform to expectations:
check_string <- function(x, arg = "x") { if (!is_string(x)) { cli::cli_abort("{.arg {arg}} must be a string.") } }
In both cases, the default error call is not very helpful to the end user because it reflects an internal function rather than a user function:
my_function <- function(x) { check_string(x) stop_my_class("Unimplemented") }
my_function(NA) #> Error in `check_string()`: #> ! `x` must be a string.
my_function("foo") #> Error in `stop_my_class()`: #> ! Unimplemented
To fix this, let abort()
know about the function that it is throwing the error for by passing the corresponding function environment as the call
argument:
stop_my_class <- function(message, call = caller_env()) {
  abort(message, class = "my_class", call = call)
}

check_string <- function(x, arg = "x", call = caller_env()) {
  if (!is_string(x)) {
    cli::cli_abort("{.arg {arg}} must be a string.", call = call)
  }
}
my_function(NA) #> Error in `my_function()`: #> ! `x` must be a string.
my_function("foo") #> Error in `my_function()`: #> ! Unimplemented
Input checkers and caller_arg()
The caller_arg()
helper is useful in input checkers which check an input on the behalf of another function. Instead of hard-coding arg = "x"
, and forcing the callers to supply it if "x"
is not the name of the argument being checked, use caller_arg()
.
check_string <- function(x, arg = caller_arg(x), call = caller_env()) { if (!is_string(x)) { cli::cli_abort("{.arg {arg}} must be a string.", call = call) } }
It is a combination of substitute()
and rlang::as_label()
which provides a more generally applicable default:
my_function <- function(my_arg) { check_string(my_arg) } my_function(NA) #> Error in `my_function()`: #> ! `my_arg` must be a string.
Side benefit: backtrace trimming
Another benefit of passing caller_env()
as call
is that it allows abort()
to automatically hide the error helpers in backtraces:
my_function <- function() {
  their_function()
}
their_function <- function() {
  error_helper1()
}

error_helper1 <- function(call = caller_env()) {
  error_helper2(call = call)
}
error_helper2 <- function(call = caller_env()) {
  if (use_call) {
    abort("Can't do this", call = call)
  } else {
    abort("Can't do this")
  }
}
use_call <- FALSE their_function() #> Error in `error_helper2()`: #> ! Can't do this
rlang::last_error() #> <error/rlang_error> #> Error in `error_helper2()`: #> ! Can't do this #> --- #> Backtrace: #> x #> 1. \-rlang (local) their_function() #> 2. \-rlang (local) error_helper1() #> 3. \-rlang (local) error_helper2(call = call) #> Run rlang::last_trace(drop = FALSE) to see 1 hidden frame.
With the correct call
, the backtrace is much simpler and lets the user focus on the part of the stack that is relevant to them:
use_call <- TRUE their_function() #> Error in `their_function()`: #> ! Can't do this
rlang::last_error() #> <error/rlang_error> #> Error in `their_function()`: #> ! Can't do this #> --- #> Backtrace: #> x #> 1. \-rlang (local) their_function() #> Run rlang::last_trace(drop = FALSE) to see 3 hidden frames.
testthat workflow
Error snapshots are the main way of checking that the correct error call is included in an error message. However you'll need to opt into a new testthat display for warning and error snapshots. With the new display, these are printed by rlang, including the call
field. This makes it easy to monitor the full appearance of warning and error messages as they are displayed to users.
This display is not applied to all packages yet. With testthat 3.1.2, depend explicitly on rlang >= 1.0.0 to opt in. Starting from testthat 3.1.3, depending on rlang, no matter the version, is sufficient to opt in. In the future, the new display will be enabled for all packages.
Once enabled, create error snapshots with:
expect_snapshot(error = TRUE, { my_function() })
You'll have to make sure that the snapshot coverage for error messages is sufficient for your package.
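For instance, a snapshot test might look like the following sketch (the test description and my_function() are placeholders):
test_that("my_function() fails informatively", {
  expect_snapshot(error = TRUE, my_function(NA))
})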
Including contextual information with error chains
Description
Error chaining is a mechanism for providing contextual information when an error occurs. There are multiple situations in which you might be able to provide context that is helpful to quickly understand the cause or origin of an error:
Mentioning the high-level context in which a low-level error arose, e.g. chaining a low-level HTTP error to a high-level download error.
Mentioning the pipeline step in which a user error occurred. This is a major use case for NSE interfaces in the tidyverse, e.g. in dplyr, tidymodels or ggplot2.
Mentioning the iteration context in which a user error occurred. For instance, the input file when processing documents, or the iteration number or key when running user code in a loop.
Here is an example of a chained error from dplyr that shows the pipeline step (mutate()
) and the iteration context (group ID) in which a function called by the user failed:
add <- function(x, y) x + y mtcars |> dplyr::group_by(cyl) |> dplyr::mutate(new = add(disp, "foo")) #> Error in `dplyr::mutate()`: #> i In argument: `new = add(disp, "foo")`. #> i In group 1: `cyl = 4`. #> Caused by error in `x + y`: #> ! non-numeric argument to binary operator
In all these cases, there are two errors in play, chained together:
The causal error, which interrupted the current course of action.
The contextual error, which expresses higher-level information when something goes wrong.
There may be more than one contextual error in an error chain, but there is always only one causal error.
Rethrowing errors
To create an error chain, you must first capture causal errors when they occur. We recommend using try_fetch()
instead of tryCatch()
or withCallingHandlers()
.
Compared to
tryCatch()
,try_fetch()
fully preserves the context of the error. This is important for debugging because it ensures complete backtraces are reported to users (e.g. vialast_error()
) and allowsoptions(error = recover)
to reach into the deepest error context.Compared to
withCallingHandlers()
, which also preserves the error context,try_fetch()
is able to catch stack overflow errors on R versions >= 4.2.0.
In practice, try_fetch()
works just like tryCatch()
. It takes pairs of error class names and handling functions. To chain an error, simply rethrow it from an error handler by passing it as parent
argument.
In this example, we'll create a with_
function. That is, a function that sets up some configuration (in this case, chained errors) before executing code supplied as input:
with_chained_errors <- function(expr) {
  try_fetch(
    expr,
    error = function(cnd) {
      abort("Problem during step.", parent = cnd)
    }
  )
}

with_chained_errors(1 + "")
#> Error in `with_chained_errors()`:
#> ! Problem during step.
#> Caused by error in `1 + ""`:
#> ! non-numeric argument to binary operator
Typically, you'll use this error helper from another user-facing function.
my_verb <- function(expr) { with_chained_errors(expr) } my_verb(add(1, "")) #> Error in `with_chained_errors()`: #> ! Problem during step. #> Caused by error in `x + y`: #> ! non-numeric argument to binary operator
Although we have created a chained error, the error call of the contextual error is not quite right. It mentions the name of the error helper instead of the name of the user-facing function.
If you've read Including function calls in error messages, you may suspect that we need to pass a call
argument to abort()
. That's exactly what needs to happen to fix the call and backtrace issues:
with_chained_errors <- function(expr, call = caller_env()) { try_fetch( expr, error = function(cnd) { abort("Problem during step.", parent = cnd, call = call) } ) }
Now that we've passed the caller environment as call
argument, abort()
automatically picks up the corresponding function call from the execution frame:
my_verb(add(1, "")) #> Error in `my_verb()`: #> ! Problem during step. #> Caused by error in `x + y`: #> ! non-numeric argument to binary operator
Side note about missing arguments
my_verb()
is implemented with a lazy evaluation pattern. The user input is kept unevaluated until the error chain context is set up. A downside of this arrangement is that missing argument errors are reported in the wrong context:
my_verb() #> Error in `my_verb()`: #> ! Problem during step. #> Caused by error in `my_verb()`: #> ! argument "expr" is missing, with no default
To fix this, simply require these arguments before setting up the chained error context, for instance with the check_required()
input checker exported from rlang:
my_verb <- function(expr) { check_required(expr) with_chained_errors(expr) } my_verb() #> Error in `my_verb()`: #> ! `expr` is absent but must be supplied.
Taking full ownership of a causal error
It is also possible to completely take ownership of a causal error and rethrow it with a more user-friendly error message. In this case, the original error is completely hidden from the end user. Opting for this approach instead of chaining should be carefully considered because hiding the causal error may deprive users of precious debugging information.
In general, hiding user errors (e.g. dplyr inputs) in this way is likely a bad idea.
It may be appropriate to hide low-level errors, e.g. replacing HTTP errors by a high-level download error. Similarly, tidyverse packages like dplyr are replacing low-level vctrs errors with higher level errors of their own crafting.
Hiding causal errors indiscriminately should likely be avoided because it may suppress information about unexpected errors. In general, rethrowing unchained errors should only be done for specific error classes.
To rethrow an error without chaining it, and completely take over the causal error from the user's point of view, fetch it with try_fetch()
and throw a new error. The only difference with throwing a chained error is that the parent
argument is set to NA
. You could also omit the parent
argument entirely, but passing NA
lets abort()
know it is rethrowing an error from a handler and that it should hide the corresponding error helpers in the backtrace.
with_own_scalar_errors <- function(expr, call = caller_env()) {
  try_fetch(
    expr,
    vctrs_error_scalar_type = function(cnd) {
      abort(
        "Must supply a vector.",
        parent = NA,
        error = cnd,
        call = call
      )
    }
  )
}

my_verb <- function(expr) {
  check_required(expr)
  with_own_scalar_errors(
    vctrs::vec_assert(expr)
  )
}

my_verb(env())
#> Error in `my_verb()`:
#> ! Must supply a vector.
When a low-level error is taken over like this, it is good practice to store it in the high-level error object so that it can be inspected for debugging purposes. In the snippet above, we stored it in the error
field. Here is one way of accessing the original error by subsetting the object returned by last_error()
:
rlang::last_error()$error #> <error/vctrs_error_scalar_type> #> Error in `my_verb()`: #> ! `expr` must be a vector, not an environment. #> --- #> Backtrace: #> x #> 1. \-rlang (local) my_verb(env())
Case study: Mapping with chained errors
One good use case for chained errors is adding information about the iteration state when looping over a set of inputs. To illustrate this, we'll implement a version of map()
/ lapply()
that chains an iteration error to any captured user error.
Here is a minimal implementation of map()
:
my_map <- function(.xs, .fn, ...) { out <- new_list(length(.xs)) for (i in seq_along(.xs)) { out[[i]] <- .fn(.xs[[i]], ...) } out } list(1, 2) |> my_map(add, 100) #> [[1]] #> [1] 101 #> #> [[2]] #> [1] 102
With this implementation, the user has no idea which iteration failed when an error occurs:
list(1, "foo") |> my_map(add, 100) #> Error in `x + y`: #> ! non-numeric argument to binary operator
Rethrowing with iteration information
To improve on this we'll wrap the loop in a try_fetch()
call that rethrows errors with iteration information. Make sure to call try_fetch()
on the outside of the loop to avoid a massive performance hit:
my_map <- function(.xs, .fn, ...) {
  out <- new_list(length(.xs))
  i <- 0L

  try_fetch(
    for (i in seq_along(.xs)) {
      out[[i]] <- .fn(.xs[[i]], ...)
    },
    error = function(cnd) {
      abort(
        sprintf("Problem while mapping element %d.", i),
        parent = cnd
      )
    }
  )

  out
}
And that's it, the error chain created by the rethrowing handler now provides users with the number of the failing iteration:
list(1, "foo") |> my_map(add, 100) #> Error in `my_map()`: #> ! Problem while mapping element 2. #> Caused by error in `x + y`: #> ! non-numeric argument to binary operator
Dealing with errors thrown from the mapped function
One problem though, is that the user error call is not very informative when the error occurs immediately in the function supplied to my_map()
:
my_function <- function(x) { if (!is_string(x)) { abort("`x` must be a string.") } } list(1, "foo") |> my_map(my_function) #> Error in `my_map()`: #> ! Problem while mapping element 1. #> Caused by error in `.fn()`: #> ! `x` must be a string.
Functions have no names by themselves. Only the variable that refers to the function has a name. In this case, the mapped function is bound to the argument .fn
. So, when an error happens, this is the name that is reported to users.
One approach to fix this is to inspect the call
field of the error. When we detect a .fn
call, we replace it by the defused code supplied as .fn
argument:
my_map <- function(.xs, .fn, ...) {
  # Capture the defused code supplied as `.fn`
  fn_code <- substitute(.fn)

  out <- new_list(length(.xs))

  for (i in seq_along(.xs)) {
    try_fetch(
      out[[i]] <- .fn(.xs[[i]], ...),
      error = function(cnd) {
        # Inspect the `call` field to detect `.fn` calls
        if (is_call(cnd$call, ".fn")) {
          # Replace `.fn` by the defused code.
          # Keep existing arguments.
          cnd$call[[1]] <- fn_code
        }

        abort(
          sprintf("Problem while mapping element %s.", i),
          parent = cnd
        )
      }
    )
  }

  out
}
And voilà!
list(1, "foo") |> my_map(my_function) #> Error in `my_map()`: #> ! Problem while mapping element 1. #> Caused by error in `my_function()`: #> ! `x` must be a string.
Injecting with !!
, !!!
, and glue syntax
Description
The injection operators are extensions of R implemented by rlang to modify a piece of code before R processes it. There are two main families:
The dynamic dots operators,
!!!
and"{"
.The metaprogramming operators
!!
,{{
, and"{{"
. Splicing with!!!
can also be done in metaprogramming context.
Dots injection
Unlike regular ...
, dynamic dots are programmable with injection operators.
Splicing with !!!
For instance, take a function like rbind()
which takes data in ...
. To bind rows, you supply them as separate arguments:
rbind(a = 1:2, b = 3:4) #> [,1] [,2] #> a 1 2 #> b 3 4
But how do you bind a variable number of rows stored in a list? The base R solution is to invoke rbind()
with do.call()
:
rows <- list(a = 1:2, b = 3:4) do.call("rbind", rows) #> [,1] [,2] #> a 1 2 #> b 3 4
Functions that implement dynamic dots include a built-in way of folding a list of arguments in ...
. To illustrate this, we'll create a variant of rbind()
that takes dynamic dots by collecting ...
with list2()
:
rbind2 <- function(...) { do.call("rbind", list2(...)) }
It can be used just like rbind()
:
rbind2(a = 1:2, b = 3:4) #> [,1] [,2] #> a 1 2 #> b 3 4
And a list of arguments can be supplied by splicing the list with !!!
:
rbind2(!!!rows, c = 5:6) #> [,1] [,2] #> a 1 2 #> b 3 4 #> c 5 6
Injecting names with "{"
A related problem comes up when an argument name is stored in a variable. With dynamic dots, you can inject the name using glue syntax with "{"
:
name <- "foo" rbind2("{name}" := 1:2, bar = 3:4) #> [,1] [,2] #> foo 1 2 #> bar 3 4 rbind2("prefix_{name}" := 1:2, bar = 3:4) #> [,1] [,2] #> prefix_foo 1 2 #> bar 3 4
Metaprogramming injection
Data-masked arguments support the following injection operators. They can also be explicitly enabled with inject()
.
Embracing with {{
The embracing operator {{
is made specially for function arguments. It defuses the expression supplied as argument and immediately injects it in place. The injected argument is then evaluated in another context such as a data mask.
# Inject function arguments that might contain
# data-variables by embracing them with {{ }}
mean_by <- function(data, by, var) {
  data %>%
    dplyr::group_by({{ by }}) %>%
    dplyr::summarise(avg = mean({{ var }}, na.rm = TRUE))
}

# The data-variables `cyl` and `disp` inside the
# env-variables `by` and `var` are injected inside `group_by()`
# and `summarise()`
mtcars %>% mean_by(by = cyl, var = disp)
#> # A tibble: 3 x 2
#>     cyl   avg
#>   <dbl> <dbl>
#> 1     4  105.
#> 2     6  183.
#> 3     8  353.
Learn more about this pattern in Data mask programming patterns.
Injecting with !!
Unlike !!!
which injects a list of arguments, the injection operator !!
(pronounced "bang-bang") injects a single object. One use case for !!
is to substitute an environment-variable (created with <-
) with a data-variable (inside a data frame).
# The env-variable `var` contains a data-symbol object, in this
# case a reference to the data-variable `disp`
var <- data_sym("disp")

# We inject the data-variable contained in `var` inside `summarise()`
mtcars %>% dplyr::summarise(avg = mean(!!var, na.rm = TRUE))
#> # A tibble: 1 x 1
#>     avg
#>   <dbl>
#> 1  231.
Another use case is to inject a variable by value to avoid name collisions.
df <- data.frame(x = 1)

# This name conflicts with a column in `df`
x <- 100

# Inject the env-variable
df %>% dplyr::mutate(x = x / !!x)
#>      x
#> 1 0.01
Note that in most cases you don't need injection with !!
. For instance, the .data
and .env
pronouns provide more intuitive alternatives to injecting a column name and injecting a value.
Splicing with !!!
The splice operator !!!
of dynamic dots can also be used in metaprogramming context (inside data-masked arguments and inside inject()
). For instance, we could reimplement the rbind2()
function presented above using inject()
instead of do.call()
:
rbind2 <- function(...) { inject(rbind(!!!list2(...))) }
There are two things going on here. We collect ...
with list2()
so that the callers of rbind2()
may use !!!
. And we use inject()
so that rbind2()
itself may use !!!
to splice the list of arguments passed to rbind2()
.
Injection in other languages
Injection is known as quasiquotation in other programming languages and in computer science. expr()
is similar to a quasiquotation operator and !!
is the unquote operator. These terms have a rich history in Lisp languages, and live on in modern languages like Julia and Racket. In base R, quasiquotation is performed with bquote()
.
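For a rough comparison, here is a minimal sketch of base R quasiquotation with bquote(), where .() plays the role of !!:
x <- 10
bquote(f(.(x) + y))
#> f(10 + y)
expr(f(!!x + y))
#> f(10 + y)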
The main difference between rlang and other languages is that quasiquotation is often implicit instead of explicit. You can use injection operators in any defusing / quoting function (unless that function defuses its argument with a special operator like enquo0()
). This is not the case in lisp languages for example where injection / unquoting is explicit and only enabled within a backquote.
See also
What happens if I use injection operators out of context?
Description
The injection operators {{
, !!
, and !!!
are an extension of the R syntax developed for tidyverse packages. Because they are not part of base R, they suffer from some limitations. In particular no specific error is thrown when they are used in unexpected places.
Using {{
out of context
The embrace operator {{
is a feature available in data-masked arguments powered by tidy eval. If you use it elsewhere, it is interpreted as a double {
wrapping.
In the R language, {
is like (
but takes multiple expressions instead of one:
{
  1 # Discarded
  2
}
#> [1] 2

list(
  { message("foo"); 2 }
)
#> foo
#> [[1]]
#> [1] 2
Just like you can wrap an expression in as many parentheses as you'd like, you can wrap multiple times with braces:
((1)) #> [1] 1 {{ 2 }} #> [1] 2
So nothing prevents you from embracing a function argument in a context where this operation is not implemented. R will just treat the braces like a set of parentheses and silently return the result:
f <- function(arg) list({{ arg }}) f(1) #> [[1]] #> [1] 1
This sort of no-effect embracing should be avoided in real code because it falsely suggests that the function supports the tidy eval operator and that something special is happening.
However, in many cases embracing is done to implement data masking. It is likely that the function will be called with data-variable references which R won't be able to resolve properly:
my_mean <- function(data, var) { with(data, mean({{ var }})) } my_mean(mtcars, cyl) #> Error: #> ! object 'cyl' not found
Since with()
is a base data-masking function that doesn't support tidy eval operators, the embrace operator does not work and we get an object not found error.
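If you control my_mean(), one possible fix (a hedged sketch, not the only approach) is to defuse the argument and evaluate it with eval_tidy(), which does understand quosures:
my_mean <- function(data, var) {
  var <- enquo(var)
  eval_tidy(expr(mean(!!var)), data)
}
my_mean(mtcars, cyl)
#> [1] 6.1875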
Using !!
and !!!
out of context
The injection operators !!
and !!!
are implemented in data-masked arguments, dynamic dots, and within inject()
. When used in other contexts, they are interpreted by R as double and triple negations.
Double negation can be used in ordinary code to convert an input to logical:
!!10 #> [1] TRUE !!0 #> [1] FALSE
Triple negation is essentially the same as simple negation:
!10 #> [1] FALSE !!!10 #> [1] FALSE
This means that when injection operators are used in the wrong place, they will be interpreted as negation. In the best case scenario you will get a type error:
!"foo" #> Error in `!"foo"`: #> ! invalid argument type !quote(foo) #> Error in `!quote(foo)`: #> ! invalid argument type !quote(foo()) #> Error in `!quote(foo())`: #> ! invalid argument type
In the worst case, R will silently convert the input to logical. Unfortunately there is no systematic way of checking for these errors.
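For example, double negation applied to a numeric vector outside of any injection context silently yields a logical vector:
!!c(0, 1, 2)
#> [1] FALSE  TRUE  TRUE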
Metaprogramming patterns
Description
The patterns covered in this article rely on metaprogramming, the ability to defuse, create, expand, and inject R expressions. A good place to start if you're new to programming on the language is the Metaprogramming chapter of the Advanced R book.
If you haven't already, read Data mask programming patterns which covers simpler patterns that do not require as much theory to get up to speed. It covers concepts like argument behaviours and the various patterns you can add to your toolbox (forwarding, names, bridge, and transformative patterns).
Forwarding patterns
Defuse and inject
{{
and ...
are sufficient for most purposes. Sometimes, however, it is necessary to decompose the forwarding action into its two constituent steps: defusing and injecting.
{{
is the combination of enquo()
and !!
. These functions are completely equivalent:
my_summarise <- function(data, var) { data %>% dplyr::summarise({{ var }}) } my_summarise <- function(data, var) { data %>% dplyr::summarise(!!enquo(var)) }
Passing ...
is equivalent to the combination of enquos()
and !!!
:
my_group_by <- function(.data, ...) { .data %>% dplyr::group_by(...) } my_group_by <- function(.data, ...) { .data %>% dplyr::group_by(!!!enquos(...)) }
The advantage of decomposing the steps is that you gain access to the defused expressions. Once defused, you can inspect or modify the expressions before injecting them in their target context.
Inspecting input labels
For instance, here is how to create an automatic name for a defused argument using as_label()
:
f <- function(var) { var <- enquo(var) as_label(var) } f(cyl) #> [1] "cyl" f(1 + 1) #> [1] "1 + 1"
This is essentially equivalent to formatting an argument using englue()
:
f2 <- function(var) { englue("{{ var }}") } f2(1 + 1) #> [1] "1 + 1"
With multiple arguments, use the plural variant enquos()
. Set .named
to TRUE
to automatically call as_label()
on the inputs for which the user has not provided a name (the same behaviour as in most dplyr verbs):
g <- function(...) { vars <- enquos(..., .named = TRUE) names(vars) } g(cyl, 1 + 1) #> [1] "cyl" "1 + 1"
Just like with dplyr::mutate()
, the user can override automatic names by supplying explicit names:
g(foo = cyl, bar = 1 + 1) #> [1] "foo" "bar"
Defuse-and-inject patterns are most useful for transforming inputs. Some applications are explored in the Transformation patterns section.
Names patterns
Symbolise and inject
The symbolise-and-inject pattern is a names pattern that you can use when across(all_of())
is not supported. It consists in creating defused expressions that refer to the data-variables represented in the names vector. These are then injected in the data mask context.
Symbolise a single string with sym()
or data_sym()
:
var <- "cyl" sym(var) #> cyl data_sym(var) #> .data$cyl
Symbolise a character vector with syms()
or data_syms()
.
vars <- c("cyl", "am") syms(vars) #> [[1]] #> cyl #> #> [[2]] #> am data_syms(vars) #> [[1]] #> .data$cyl #> #> [[2]] #> .data$am
Simple symbols returned by sym()
and syms()
work in a wider variety of cases (with base functions in particular) but we'll mostly use data_sym()
and data_syms()
because they are more robust (see The data mask ambiguity). Note that these do not return symbols per se, instead they create calls to $
that subset the .data
pronoun.
Since the .data
pronoun is a tidy eval feature, you can't use it in base functions. As a rule, prefer the data_
-prefixed variants when you're injecting in tidy eval functions and the unprefixed functions for base functions.
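As a hedged illustration of this rule of thumb (the dplyr call is only an example):
var <- "cyl"

# Base function: plain symbol, injection enabled with `inject()`
inject(with(mtcars, mean(!!sym(var))))
#> [1] 6.1875

# Tidy eval function: `.data`-based symbol created by `data_sym()`
mtcars %>% dplyr::summarise(mean = mean(!!data_sym(var)))
#>     mean
#> 1 6.1875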
A list of symbols can be injected in data-masked dots with the splice operator !!!
, which injects each element of the list as a separate argument. For instance, to implement a group_by()
variant that takes a character vector of column names, you might write:
my_group_by <- function(data, vars) {
  data %>% dplyr::group_by(!!!data_syms(vars))
}

my_group_by(mtcars, vars)
In more complex cases, you might want to add R code around the symbols. This requires transformation patterns, see the section below.
Bridge patterns
mutate()
as a data-mask to selection bridge
This is a variant of the transmute()
bridge pattern described in Data mask programming patterns that does not materialise ...
in the intermediate step. Instead, the ...
expressions are defused and inspected. Then the expressions, rather than the columns, are spliced in mutate()
.
my_pivot_longer <- function(data, ...) {
  # Defuse the dots and inspect the names
  dots <- enquos(..., .named = TRUE)
  names <- names(dots)

  # Pass the inputs to `mutate()`
  data <- data %>% dplyr::mutate(!!!dots)

  # Select `...` inputs by name with `all_of()`
  data %>% tidyr::pivot_longer(cols = all_of(names))
}

mtcars %>% my_pivot_longer(cyl, am = am * 100)
Defuse the
...
expressions. The.named
argument ensures unnamed inputs get a default name, just like they would if passed tomutate()
. Take the names of the list of inputs.Once we have the names, inject the argument expressions into
mutate()
to update the data frame.Finally, pass the names to the tidy selection via
all_of()
.
Transformation patterns
Transforming inputs manually
If across()
and variants are not available, you will need to transform the inputs yourself using metaprogramming techniques. To illustrate the technique, we'll reimplement my_mean()
without using across()
. The pattern consists in defusing the input expressions, building larger calls around them, and finally injecting the modified expressions inside the data-masking functions.
We'll start with a single named argument for simplicity:
my_mean <- function(data, var) {
  # Defuse the expression
  var <- enquo(var)

  # Wrap it in a call to `mean()`
  var <- expr(mean(!!var, na.rm = TRUE))

  # Inject the expanded expression
  data %>% dplyr::summarise(mean = !!var)
}

mtcars %>% my_mean(cyl)
#> # A tibble: 1 x 1
#>    mean
#>   <dbl>
#> 1  6.19
With ...
the technique is similar, though a little more involved. We'll use the plural variants enquos()
and !!!
. We'll also loop over the variable number of inputs using purrr::map()
. But the pattern is otherwise basically the same:
my_mean <- function(.data, ...) {
  # Defuse the dots. Make sure they are automatically named.
  vars <- enquos(..., .named = TRUE)

  # Map over each defused expression and wrap it in a call to `mean()`
  vars <- purrr::map(vars, ~ expr(mean(!!.x, na.rm = TRUE)))

  # Inject the expressions
  .data %>% dplyr::summarise(!!!vars)
}

mtcars %>% my_mean(cyl)
#> # A tibble: 1 x 1
#>     cyl
#>   <dbl>
#> 1  6.19
Note that we are inheriting the data-masking behaviour of summarise()
because we have effectively forwarded ...
inside that verb. This is different than transformation patterns based on across()
which inherit tidy selection behaviour. In practice, this means the function doesn't support selection helpers and syntax. Instead, it gains the ability to create new vectors on the fly:
mtcars %>% my_mean(cyl = cyl * 100) #> # A tibble: 1 x 1 #> cyl #> <dbl> #> 1 619.
Base patterns
In this section, we review patterns for programming with base data-masking functions. They essentially consist in building and evaluating expressions in the data mask. We review these patterns and compare them to rlang idioms.
Data-masked get()
In the simplest version of this pattern, get()
is called with a variable name to retrieve objects from the data mask:
var <- "cyl" with(mtcars, mean(get(var))) #> [1] 6.1875
This sort of pattern is susceptible to names collisions. For instance, the input data frame might contain a variable called var
:
df <- data.frame(var = "wrong") with(df, mean(get(var))) #> Error in `get()`: #> ! object 'wrong' not found
In general, prefer symbol injection over get()
to prevent this sort of collision. With base functions, you will need to enable injection operators explicitly using inject()
:
inject( with(mtcars, mean(!!sym(var))) ) #> [1] 6.1875
See The data mask ambiguity for more information about names collisions.
Data-masked parse()
and eval()
A more involved pattern consists in building R code in a string and evaluating it in the mask:
var1 <- "am" var2 <- "vs" code <- paste(var1, "==", var2) with(mtcars, mean(eval(parse(text = code)))) #> [1] 0.59375
As before, the code
variable is vulnerable to names collisions. More importantly, if var1
and var2
are user inputs, they could contain adversarial code. Evaluating code assembled from strings is always a risky business:
var1 <- "(function() { Sys.sleep(Inf) # Could be a coin mining routine })()" var2 <- "vs" code <- paste(var1, "==", var2) with(mtcars, mean(eval(parse(text = code))))
This is not a big deal if your code is only used internally. However, this code could be part of a public Shiny app which Internet users could exploit. But even internally, parsing is a source of bugs when variable names contain syntactic symbols like -
or :
.
var1 <- ":var:" var2 <- "vs" code <- paste(var1, "==", var2) with(mtcars, mean(eval(parse(text = code)))) #> Error in `parse()`: #> ! <text>:1:1: unexpected ':' #> 1: : #> ^
For these reasons, always prefer to build code instead of parsing code. Building variable names with sym()
is a way of sanitising inputs.
var1 <- "(function() { Sys.sleep(Inf) # Could be a coin mining routine })()" var2 <- "vs" code <- call("==", sym(var1), sym(var2)) code #> `(function() {\n Sys.sleep(Inf) # Could be a coin mining routine\n})()` == #> vs
The adversarial input now produces an error:
with(mtcars, mean(eval(code))) #> Error: #> ! object '(function() {\n Sys.sleep(Inf) # Could be a coin mining routine\n})()' not found
Finally, it is recommended to inject the code instead of evaluating it to avoid names collisions:
var1 <- "am" var2 <- "vs" code <- call("==", sym(var1), sym(var2)) inject( with(mtcars, mean(!!code)) ) #> [1] 0.59375
Taking multiple columns without ...
Description
In this guide we compare ways of taking multiple columns in a single function argument.
As a refresher (see the programming patterns article), there are two common ways of passing arguments to data-masking functions. For single arguments, embrace with {{
:
my_group_by <- function(data, var) { data %>% dplyr::group_by({{ var }}) } my_pivot_longer <- function(data, var) { data %>% tidyr::pivot_longer({{ var }}) }
For multiple arguments in ...
, pass them on to functions that also take ...
like group_by()
, or pass them within c()
for functions taking tidy selection in a single argument like pivot_longer()
:
# Pass dots through my_group_by <- function(.data, ...) { .data %>% dplyr::group_by(...) } my_pivot_longer <- function(.data, ...) { .data %>% tidyr::pivot_longer(c(...)) }
But what if you want to take multiple columns in a single named argument rather than in ...
?
Using tidy selections
The idiomatic tidyverse way of taking multiple columns in a single argument is to take a tidy selection (see the Argument behaviours section). In tidy selections, the syntax for passing multiple columns in a single argument is c()
:
mtcars %>% tidyr::pivot_longer(c(am, cyl, vs))
Since {{
inherits behaviour, this implementation of my_pivot_longer()
automatically allows multiple columns passing:
my_pivot_longer <- function(data, var) { data %>% tidyr::pivot_longer({{ var }}) } mtcars %>% my_pivot_longer(c(am, cyl, vs))
For group_by()
, which takes data-masked arguments, we'll use across()
as a bridge (see Bridge patterns).
my_group_by <- function(data, var) { data %>% dplyr::group_by(across({{ var }})) } mtcars %>% my_group_by(c(am, cyl, vs))
When embracing in tidyselect context or using across()
is not possible, you might have to implement tidyselect behaviour manually with tidyselect::eval_select()
.
Using external defusal
To implement an argument with tidyselect behaviour, it is necessary to defuse the argument. However defusing an argument which had historically behaved like a regular argument is a rather disruptive breaking change. This is why we could not implement tidy selections in ggplot2 facetting functions like facet_grid()
and facet_wrap()
.
An alternative is to use external defusal of arguments. This is what formula interfaces do for instance. A modelling function takes a formula in a regular argument and the formula defuses the user code:
my_lm <- function(data, f, ...) { lm(f, data, ...) } mtcars %>% my_lm(disp ~ drat)
Once created, the defused expressions contained in the formula are passed around like a normal argument. A similar approach was taken to update facet_
functions to tidy eval. The vars()
function (a simple alias to quos()
) is provided so that users can defuse their arguments externally.
ggplot2::facet_grid( ggplot2::vars(cyl), ggplot2::vars(am, vs) )
You can implement this approach by simply taking a list of defused expressions as argument. This list can be passed the usual way to other functions taking such lists:
my_facet_grid <- function(rows, cols, ...) { ggplot2::facet_grid(rows, cols, ...) }
Or it can be spliced with !!!
:
my_group_by <- function(data, vars) { stopifnot(is_quosures(vars)) data %>% dplyr::group_by(!!!vars) } mtcars %>% my_group_by(dplyr::vars(cyl, am))
A non-approach: Parsing lists
Intuitively, many programmers who want to take a list of expressions in a single argument try to defuse an argument and parse it. The user is expected to supply multiple arguments within a list()
expression. When such a call is detected, the arguments are retrieved and spliced with !!!
. Otherwise, the user is assumed to have supplied a single argument which is injected with !!
. An implementation along these lines might look like this:
my_group_by <- function(data, vars) {
  vars <- enquo(vars)

  if (quo_is_call(vars, "list")) {
    expr <- quo_get_expr(vars)
    env <- quo_get_env(vars)
    args <- as_quosures(call_args(expr), env = env)
    data %>% dplyr::group_by(!!!args)
  } else {
    data %>% dplyr::group_by(!!vars)
  }
}
This does work in simple cases:
mtcars %>% my_group_by(cyl) %>% dplyr::group_vars() #> [1] "cyl" mtcars %>% my_group_by(list(cyl, am)) %>% dplyr::group_vars() #> [1] "cyl" "am"
However this parsing approach quickly shows limits:
mtcars %>% my_group_by(list2(cyl, am)) #> Error in `group_by()`: Can't add columns. #> i `..1 = list2(cyl, am)`. #> i `..1` must be size 32 or 1, not 2.
Also, it would be better for overall consistency of interfaces to use the tidyselect syntax c()
for passing multiple columns. In general, we recommend using either the tidyselect or the external defusal approach.
What are quosures and when are they needed?
Description
A quosure is a special type of defused expression that keeps track of the original context the expression was written in. The tracking capabilities of quosures are important when interfacing data-masking functions together because the functions might come from two unrelated environments, like two different packages.
Blending environments
Let's take an example where the R user calls the function summarise_bmi()
from the foo package to summarise a data frame with statistics of a BMI value. Because the height
variable of their data frame is not in metres, they use a custom function div100()
to rescale the column.
# Global environment of user
div100 <- function(x) {
  x / 100
}

dplyr::starwars %>%
  foo::summarise_bmi(mass, div100(height))
The summarise_bmi()
function is a data-masking function defined in the namespace of the foo package which looks like this:
# Namespace of package foo
bmi <- function(mass, height) {
  mass / height^2
}

summarise_bmi <- function(data, mass, height) {
  data %>%
    bar::summarise_stats(bmi({{ mass }}, {{ height }}))
}
The foo package uses the custom function bmi()
to perform a computation on two vectors. It interfaces with summarise_stats()
defined in bar, another package whose namespace looks like this:
# Namespace of package bar
check_numeric <- function(x) {
  stopifnot(is.numeric(x))
  x
}

summarise_stats <- function(data, var) {
  data %>%
    dplyr::transmute(
      var = check_numeric({{ var }})
    ) %>%
    dplyr::summarise(
      mean = mean(var, na.rm = TRUE),
      sd = sd(var, na.rm = TRUE)
    )
}
Again the package bar uses a custom function, check_numeric()
, to validate its input. It also interfaces with data-masking functions from dplyr (using the define-a-constant trick to avoid issues of double evaluation).
There are three data-masking functions simultaneously interfacing in this snippet:
At the bottom,
dplyr::transmute()
takes a data-masked input, and creates a data frame of a single column namedvar
.Before this,
bar::summarise_stats()
takes a data-masked input insidedplyr::transmute()
and checks it is numeric.And first of all,
foo::summarise_bmi()
takes two data-masked inputs insidebar::summarise_stats()
and transforms them to a single BMI value.
There is a fourth context, the global environment where summarise_bmi()
is called with two columns defined in a data frame, one of which is transformed on the fly with the user function div100()
.
All of these contexts (except to some extent the global environment) contain functions that are private and invisible to foreign functions. Yet, the final expanded data-masked expression that is evaluated down the line looks like this (with caret characters indicating the quosure boundaries):
dplyr::transmute( var = ^check_numeric(^bmi(^mass, ^div100(height))) )
The role of quosures is to let R know that check_numeric()
should be found in the bar package, bmi()
in the foo package, and div100()
in the global environment.
When should I create quosures?
As a tidyverse user you generally don't need to worry about quosures because {{
and ...
will create them for you. Introductory texts like Programming with dplyr or the standard data-mask programming patterns don't even mention the term. In more complex cases you might need to create quosures with enquo()
or enquos()
(even though you generally don't need to know or care that these functions return quosures). In this section, we explore when quosures are necessary in these more advanced applications.
Foreign and local expressions
As a rule of thumb, quosures are only needed for arguments defused with enquo()
or enquos()
(or with {{
which calls enquo()
implicitly):
my_function <- function(var) {
  var <- enquo(var)
  their_function(!!var)
}

# Equivalently
my_function <- function(var) {
  their_function({{ var }})
}
Wrapping defused arguments in quosures is needed because expressions supplied as arguments come from a different environment, the environment of your user. For local expressions created in your function, you generally don't need to create quosures:
my_mean <- function(data, var) {
  # `expr()` is sufficient, no need for `quo()`
  expr <- expr(mean({{ var }}))
  dplyr::summarise(data, !!expr)
}

my_mean(mtcars, cyl)
#> # A tibble: 1 x 1
#>   `mean(cyl)`
#>         <dbl>
#> 1        6.19
Using quo()
instead of expr()
would have worked too but it is superfluous because dplyr::summarise()
, which uses enquos()
, is already in charge of wrapping your expression within a quosure scoped in your environment.
The same applies if you evaluate manually. By default, eval()
and eval_tidy()
capture your environment:
my_mean <- function(data, var) { expr <- expr(mean({{ var }})) eval_tidy(expr, data) } my_mean(mtcars, cyl) #> [1] 6.1875
External defusing
An exception to this rule of thumb (wrap foreign expressions in quosures, not your own expressions) arises when your function takes multiple expressions in a list instead of ...
. The preferred approach in that case is to take a tidy selection so that users can combine multiple columns using c()
. If that is not possible, you can take a list of externally defused expressions:
my_group_by <- function(data, vars) { stopifnot(is_quosures(vars)) data %>% dplyr::group_by(!!!vars) } mtcars %>% my_group_by(dplyr::vars(cyl, am))
In this pattern, dplyr::vars()
defuses expressions externally. It creates a list of quosures because the expressions are passed around from function to function like regular arguments. In fact, dplyr::vars()
and ggplot2::vars()
are simple aliases of quos()
.
dplyr::vars(cyl, am) #> <list_of<quosure>> #> #> [[1]] #> <quosure> #> expr: ^cyl #> env: global #> #> [[2]] #> <quosure> #> expr: ^am #> env: global
For more information about external defusing, see Taking multiple columns without ....
Technical description of quosures
A quosure carries two things:
An expression (get it with
quo_get_expr()
).An environment (get it with
quo_get_env()
).
And implements these behaviours:
It is callable. Evaluation produces a result.
For historical reasons,
base::eval()
doesn't support quosure evaluation. Quosures currently requireeval_tidy()
. We would like to fix this limitation in the future.It is hygienic. It evaluates in the tracked environment.
It is maskable. If evaluated in a data mask (currently only masks created with
eval_tidy()
ornew_data_mask()
), the mask comes first in scope before the quosure environment.Conceptually, a quosure inherits from two chains of environments, the data mask and the user environment. In practice rlang implements this special scoping by rechaining the top of the data mask to the quosure environment currently under evaluation.
There are similarities between promises (the ones R uses to implement lazy evaluation, not the async expressions from the promises package) and quosures. One important difference is that promises are only evaluated once and cache the result for subsequent evaluation. Quosures behave more like calls and can be evaluated repeatedly, potentially in a different data mask. This property is useful to implement split-apply-combine evaluations.
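For instance, here is a minimal sketch of evaluating the same quosure repeatedly in two different data masks:
q <- quo(mean(disp))

eval_tidy(q, mtcars[mtcars$cyl == 4, ])
#> [1] 105.1364
eval_tidy(q, mtcars[mtcars$cyl == 6, ])
#> [1] 183.3143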
See also
-
enquo()
andenquos()
to defuse function arguments as quosures. This is the main way quosures are created. -
quo()
which is likeexpr()
but wraps in a quosure. Usually it is not needed to wrap local expressions yourself. -
quo_get_expr()
andquo_get_env()
to access quosure components. -
new_quosure()
andas_quosure()
to assemble a quosure from components.
Capture a backtrace
Description
A backtrace captures the sequence of calls that lead to the current
function (sometimes called the call stack). Because of lazy
evaluation, the call stack in R is actually a tree, which the
print()
method for this object will reveal.
Users rarely need to call trace_back()
manually. Instead,
signalling an error with abort()
or setting up global_entrace()
is the most common way to create backtraces when an error is
thrown. Inspect the backtrace created for the most recent error
with last_error()
.
trace_length()
returns the number of frames in a backtrace.
Usage
trace_back(top = NULL, bottom = NULL)
trace_length(trace)
Arguments
top |
The first frame environment to be included in the backtrace. This becomes the top of the backtrace tree and represents the oldest call in the backtrace. This is needed in particular when you call trace_back() indirectly, e.g. from knitted documents or sourced code, to trim off the surrounding evaluation context (see the Examples section). If not supplied, the rlang_trace_top_env global option is consulted. |
bottom |
The last frame environment to be included in the backtrace. This becomes the rightmost leaf of the backtrace tree and represents the youngest call in the backtrace. Set this when you would like to capture a backtrace without the capture context. Can also be an integer that will be passed to caller_env(). |
trace |
A backtrace created by trace_back(). |
Examples
# Trim backtraces automatically (this improves the generated
# documentation for the rlang website and the same trick can be
# useful within knitr documents):
options(rlang_trace_top_env = current_env())
f <- function() g()
g <- function() h()
h <- function() trace_back()
# When no lazy evaluation is involved the backtrace is linear
# (i.e. every call has only one child)
f()
# Lazy evaluation introduces a tree like structure
identity(identity(f()))
identity(try(f()))
try(identity(f()))
# When printing, you can request to simplify this tree to only show
# the direct sequence of calls that lead to `trace_back()`
x <- try(identity(f()))
x
print(x, simplify = "branch")
# With a little cunning you can also use it to capture the
# tree from within a base NSE function
x <- NULL
with(mtcars, {x <<- f(); 10})
x
# Restore default top env for next example
options(rlang_trace_top_env = NULL)
# When code is executed indirectly, i.e. via source or within an
# RMarkdown document, you'll tend to get a lot of guff at the beginning
# related to the execution environment:
conn <- textConnection("summary(f())")
source(conn, echo = TRUE, local = TRUE)
close(conn)
# To automatically strip this off, specify which frame should be
# the top of the backtrace. This will automatically trim off calls
# prior to that frame:
top <- current_env()
h <- function() trace_back(top)
conn <- textConnection("summary(f())")
source(conn, echo = TRUE, local = TRUE)
close(conn)
Try an expression with condition handlers
Description
try_fetch()
establishes handlers for conditions of a given class
("error"
, "warning"
, "message"
, ...). Handlers are functions
that take a condition object as argument and are called when the
corresponding condition class has been signalled.
A condition handler can:
-
Recover from conditions with a value. In this case the computation of
expr
is aborted and the recovery value is returned fromtry_fetch()
. Error recovery is useful when you don't want errors to abruptly interrupt your program but resume at the catching site instead.# Recover with the value 0 try_fetch(1 + "", error = function(cnd) 0)
-
Rethrow conditions, e.g. using
abort(msg, parent = cnd)
. See theparent
argument ofabort()
. This is typically done to add information to low-level errors about the high-level context in which they occurred.try_fetch(1 + "", error = function(cnd) abort("Failed.", parent = cnd))
-
Inspect conditions, for instance to log data about warnings or errors. In this case, the handler must return the
zap()
sentinel to instructtry_fetch()
to ignore (or zap) that particular handler. The next matching handler is called if any, and errors bubble up to the user if no handler remains.log <- NULL try_fetch(1 + "", error = function(cnd) { log <<- cnd zap() })
Whereas tryCatch()
catches conditions (discarding any running
code along the way) and then calls the handler, try_fetch()
first
calls the handler with the condition on top of the currently
running code (fetches it where it stands) and then catches the
return value. This is a subtle difference that has implications
for the debuggability of your functions. See the comparison with
tryCatch()
section below.
Another difference between try_fetch()
and the base equivalent is
that errors are matched across chains, see the parent
argument of
abort()
. This is a useful property that makes try_fetch()
insensitive to changes of implementation or context of evaluation
that cause a classed error to suddenly get chained to a contextual
error. Note that some chained conditions are not inherited, see the
.inherit
argument of abort()
or warn()
. In particular,
downgraded conditions (e.g. from error to warning or from warning
to message) are not matched across parents.
Usage
try_fetch(expr, ...)
Arguments
expr |
An R expression. |
... |
<dynamic-dots> Named condition handlers. The names specify the condition classes for which the handlers will be called. |
Stack overflows
A stack overflow occurs when a program keeps adding to itself until the stack memory (whose size is very limited unlike heap memory) is exhausted.
# A function that calls itself indefinitely causes stack overflows f <- function() f() f() #> Error: C stack usage 9525680 is too close to the limit
Because memory is very limited when these errors happen, it is not
possible to call the handlers on the existing program stack.
Instead, error conditions are first caught by try_fetch()
and only
then error handlers are called. Catching the error interrupts the
program up to the try_fetch()
context, which allows R to reclaim
stack memory.
The practical implication is that error handlers should never
assume that the whole call stack is preserved. For instance a
trace_back()
capture might miss frames.
Note that error handlers are only run for stack overflows on R >=
4.2. On older versions of R the handlers are simply not run. This
is because these errors do not inherit from the class
stackOverflowError
before R 4.2. Consider using tryCatch()
instead with critical error handlers that need to capture all
errors on old versions of R.
Comparison with tryCatch()
try_fetch()
generalises tryCatch()
and withCallingHandlers()
in a single function. It reproduces the behaviour of both calling
and exiting handlers depending on the return value of the handler.
If the handler returns the zap()
sentinel, it is taken as a
calling handler that declines to recover from a condition.
Otherwise, it is taken as an exiting handler which returns a value
from the catching site.
The important difference between tryCatch()
and try_fetch()
is
that the program in expr
is still fully running when an error
handler is called. Because the call stack is preserved, this makes
it possible to capture a full backtrace from within the handler,
e.g. when rethrowing the error with abort(parent = cnd)
.
Technically, try_fetch()
is more similar to (and implemented on
top of) base::withCallingHandlers()
than tryCatch().
Base type of an object
Description
This is equivalent to base::typeof()
with a few differences that
make dispatching easier:
The type of one-sided formulas is "quote".
The type of character vectors of length 1 is "string".
The type of special and builtin functions is "primitive".
Usage
type_of(x)
Arguments
x |
An R object. |
Examples
type_of(10L)
# Quosures are treated as a new base type but not formulas:
type_of(quo(10L))
type_of(~10L)
# Compare to base::typeof():
typeof(quo(10L))
# Strings are treated as a new base type:
type_of(letters)
type_of(letters[[1]])
# This is a bit inconsistent with the core language tenet that data
# types are vectors. However, treating strings as a different
# scalar type is quite helpful for switching on function inputs
# since so many arguments expect strings:
switch_type("foo", character = abort("vector!"), string = "result")
# Special and builtin primitives are both treated as primitives.
# That's because it is often irrelevant which type of primitive an
# input is:
typeof(list)
typeof(`$`)
type_of(list)
type_of(`$`)
Type predicates
Description
These type predicates aim to make type testing in R more
consistent. They are wrappers around base::typeof()
, so operate
at a level beneath S3/S4 etc.
Usage
is_list(x, n = NULL)
is_atomic(x, n = NULL)
is_vector(x, n = NULL)
is_integer(x, n = NULL)
is_double(x, n = NULL, finite = NULL)
is_complex(x, n = NULL, finite = NULL)
is_character(x, n = NULL)
is_logical(x, n = NULL)
is_raw(x, n = NULL)
is_bytes(x, n = NULL)
is_null(x)
Arguments
x |
Object to be tested. |
n |
Expected length of a vector. |
finite |
Whether all values of the vector are finite. The
non-finite values are NA, Inf, -Inf and NaN. |
Details
Compared to base R functions:
The predicates for vectors include the
n
argument for pattern-matching on the vector length.Unlike
is.atomic()
in R < 4.4.0,is_atomic()
does not returnTRUE
forNULL
. Starting in R 4.4.0is.atomic(NULL)
returns FALSE.Unlike
is.vector()
,is_vector()
tests if an object is an atomic vector or a list.is.vector
checks for the presence of attributes (other than names), as illustrated below.
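A few quick illustrations of these differences (a minimal sketch; myclass is an arbitrary class):
is_atomic(NULL)
#> [1] FALSE

x <- structure(list(1, 2), class = "myclass")
is_vector(x)
#> [1] TRUE
is.vector(x)
#> [1] FALSE

is_integer(1:3, n = 3)
#> [1] TRUE
is_integer(1:3, n = 2)
#> [1] FALSE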
See Also
bare-type-predicates scalar-type-predicates
Deprecated UQ()
and UQS()
operators
Description
These operators are deprecated in favour of
!!
and !!!
.
Usage
UQ(x)
UQS(x)
Poke values into a vector
Description
These tools are for R experts only. They copy elements from y
into x
by mutation. You should only do this if you own x
,
i.e. if you have created it or if you are certain that it doesn't
exist in any other context. Otherwise you might create unintended
side effects that have undefined consequences.
Usage
vec_poke_n(x, start, y, from = 1L, n = length(y))
vec_poke_range(x, start, y, from = 1L, to = length(y) - from + 1L)
Arguments
x |
The destination vector. |
start |
The index indicating where to start modifying x. |
y |
The source vector. |
from |
The index indicating where to start copying from y. |
n |
How many elements should be copied from y to x. |
to |
The index indicating the end of the range to copy from y. |
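A minimal usage sketch under the semantics described above (hypothetical values): copy the two elements of y into x, starting at position 3 of x.
x <- c(1, 2, 3, 4, 5)
y <- c(10, 20)

# Mutates `x` in place
vec_poke_n(x, 3L, y)

x
#> [1]  1  2 10 20  5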
Coerce an object to a base type
Description
These are equivalent to the base functions (e.g. as.logical()
,
as.list()
, etc), but perform coercion rather than conversion.
This means they are not generic and will not call S3 conversion
methods. They only attempt to coerce the base type of their
input. In addition, they have stricter implicit coercion rules and
will never attempt any kind of parsing. E.g. they will not try to
figure out if a character vector represents integers or booleans.
Finally, they treat attributes consistently, unlike the base R
functions: all attributes except names are removed.
Usage
as_logical(x)
as_integer(x)
as_double(x)
as_complex(x)
as_character(x, encoding = NULL)
as_list(x)
Arguments
x |
An object to coerce to a base type. |
encoding |
If non-null, set an encoding mark. This is only declarative, no encoding conversion is performed. |
Lifecycle
These functions are deprecated in favour of vctrs::vec_cast()
.
Coercion to logical and numeric atomic vectors
To logical vectors: Integer and integerish double vectors. See
is_integerish()
.To integer vectors: Logical and integerish double vectors.
To double vectors: Logical and integer vectors.
To complex vectors: Logical, integer and double vectors.
Coercion to character vectors
as_character()
and as_string()
have an optional encoding
argument to specify the encoding. R uses this information for
internal handling of strings and character vectors. Note that this
is only declarative, no encoding conversion is attempted.
Note that only as_string()
can coerce symbols to a scalar
character vector. This makes the code more explicit and adds an
extra type check.
Coercion to lists
as_list()
only coerces vector and dictionary types (environments
are an example of dictionary type). Unlike base::as.list()
,
as_list()
removes all attributes except names.
Effects of removing attributes
A technical side-effect of removing the attributes of the input is
that the underlying object has to be copied. This has no
performance implications in the case of lists because this is a
shallow copy: only the list structure is copied, not the contents
(see duplicate()
). However, be aware that atomic vectors
containing large amounts of data will have to be copied.
In general, any attribute modification creates a copy, which is why it is better to avoid using attributes with heavy atomic vectors. Uncopyable objects like environments and symbols are an exception to this rule: in this case, attributes modification happens in place and has side-effects.
Examples
# Coercing atomic vectors removes attributes with both base R and rlang:
x <- structure(TRUE, class = "foo", bar = "baz")
as.logical(x)
# But coercing lists preserves attributes in base R but not rlang:
l <- structure(list(TRUE), class = "foo", bar = "baz")
as.list(l)
as_list(l)
# Implicit conversions are performed in base R but not rlang:
as.logical(l)
## Not run:
as_logical(l)
## End(Not run)
# Conversion methods are bypassed, making the result of the
# coercion more predictable:
as.list.foo <- function(x) "wrong"
as.list(l)
as_list(l)
# The input is never parsed. E.g. character vectors of numbers are
# not converted to numeric types:
as.integer("33")
## Not run:
as_integer("33")
## End(Not run)
# With base R tools there is no way to convert an environment to a
# list without either triggering method dispatch, or changing the
# original environment. as_list() makes it easy:
x <- structure(as_environment(mtcars[1:2]), class = "foobar")
as.list.foobar <- function(x) abort("dont call me")
as_list(x)
Create vectors
Description
The atomic vector constructors are equivalent to c() but:
They allow you to be more explicit about the output type. Implicit coercions (e.g. from integer to logical) follow the rules described in vector-coercion.
They use dynamic dots.
Usage
lgl(...)
int(...)
dbl(...)
cpl(...)
chr(...)
bytes(...)
Arguments
... | Components of the new vector. Bare lists and explicitly spliced lists are spliced. |
Life cycle
All the abbreviated constructors such as lgl() will probably be moved to the vctrs package at some point. This is why they are marked as questioning.
Automatic splicing is soft-deprecated and will trigger a warning in a future version. Please splice explicitly with !!!.
Examples
# These constructors are like a typed version of c():
c(TRUE, FALSE)
lgl(TRUE, FALSE)
# They follow a restricted set of coercion rules:
int(TRUE, FALSE, 20)
# Lists can be spliced:
dbl(10, !!! list(1, 2L), TRUE)
# They splice names a bit differently than c(). The latter
# automatically composes inner and outer names:
c(a = c(A = 10), b = c(B = 20, C = 30))
# On the other hand, rlang's constructors use the inner names and issue a
# warning to inform the user that the outer names are ignored:
dbl(a = c(A = 10), b = c(B = 20, C = 30))
dbl(a = c(1, 2))
# As an exception, it is allowed to provide an outer name when the
# inner vector is an unnamed scalar atomic:
dbl(a = 1)
# Spliced lists behave the same way:
dbl(!!! list(a = 1))
dbl(!!! list(a = c(A = 1)))
# bytes() accepts integerish inputs
bytes(1:10)
bytes(0x01, 0xff, c(0x03, 0x05), list(10, 20, 30L))
Evaluate an expression within a given environment
Description
These functions evaluate expr within a given environment (env for with_env(), or the child of the current environment for locally()). They rely on eval_bare(), which features a lighter evaluation mechanism than base R base::eval() and which also has some subtle implications when evaluating stack-sensitive functions (see help for eval_bare()).
locally() is equivalent to the base function base::local() but it produces a much cleaner evaluation stack and has stack-consistent semantics. It is thus more suited for experimenting with the R language.
Usage
with_env(env, expr)
locally(expr)
Arguments
env | An environment within which to evaluate expr. |
expr | An expression to evaluate. |
Examples
# with_env() is handy to create formulas with a given environment:
env <- child_env("rlang")
f <- with_env(env, ~new_formula())
identical(f_env(f), env)
# Or functions with a given enclosure:
fn <- with_env(env, function() NULL)
identical(get_env(fn), env)
# Unlike eval() it doesn't create duplicates on the evaluation
# stack. You can thus use it e.g. to create non-local returns:
fn <- function() {
g(current_env())
"normal return"
}
g <- function(env) {
with_env(env, return("early return"))
}
fn()
# Since env is passed to as_environment(), it can be any object with an
# as_environment() method. For strings, the pkg_env() is returned:
with_env("base", ~mtcars)
# This can be handy to put dictionaries in scope:
with_env(mtcars, cyl)
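A short complementary sketch (not from the original examples) of locally(), assuming the local()-like semantics described above:
# locally() evaluates in a child of the current environment, like local():
z <- locally({
  tmp <- 10
  tmp * 2
})
z              # 20
exists("tmp")  # FALSE: tmp only existed in the local scope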
Establish handlers on the stack
Description
As of rlang 1.0.0, with_handlers() is deprecated. Use the base functions or the experimental try_fetch() function instead.
Usage
with_handlers(.expr, ...)
calling(handler)
exiting(handler)
Arguments
.expr, ..., handler |
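A hedged migration sketch (not part of the original page); the correspondence with with_handlers() is approximate:
# Exiting handler, roughly with_handlers(expr, error = exiting(fn)):
try_fetch(
  abort("oops"),
  error = function(cnd) "handled"
)
# Calling handler, roughly with_handlers(expr, message = calling(fn)):
withCallingHandlers(
  message("hello"),
  message = function(cnd) invokeRestart("muffleMessage")
)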
Get key/value from a weak reference object
Description
Get key/value from a weak reference object
Usage
wref_key(x)
wref_value(x)
Arguments
x | A weak reference object. |
See Also
is_weakref() and new_weakref().
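A small usage sketch (not in the original page), using new_weakref() to build the object first:
# Retrieve the key and value stored in a weak reference:
e <- env()
w <- new_weakref(e, value = list(data = 1:3))
identical(wref_key(w), e)  # TRUE while `e` is still reachable
wref_value(w)              # list(data = 1:3)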
Create zap objects
Description
zap() creates a sentinel object that indicates that an object should be removed. For instance, named zaps instruct env_bind() and call_modify() to remove those objects from the environment or the call.
The advantage of zap objects is that they unambiguously signal the intent of removing an object. Sentinels like NULL or missing_arg() are ambiguous because they represent valid R objects.
Usage
zap()
is_zap(x)
Arguments
x | An object to test. |
Examples
# Create one zap object:
zap()
# Create a list of zaps:
rep(list(zap()), 3)
rep_named(c("foo", "bar"), list(zap()))
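A hedged sketch (not from the original examples) of consuming named zaps with call_modify() and env_bind(), as described above:
# Remove an argument from a call:
call <- quote(mean(x, na.rm = TRUE, trim = 0.1))
call_modify(call, na.rm = zap())  # mean(x, trim = 0.1)
# Remove a binding from an environment:
e <- env(a = 1, b = 2)
env_bind(e, a = zap())
env_names(e)  # "b"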
Zap source references
Description
There are a number of situations where R creates source references:
Reading R code from a file with source() and parse() might save source references inside calls to function and {.
sys.call() includes a source reference if possible.
Creating a closure stores the source reference from the call to function, if any.
These source references take up space and might cause a number of
issues. zap_srcref()
recursively walks through expressions and
functions to remove all source references.
Usage
zap_srcref(x)
Arguments
x | An R object. Functions and calls are walked recursively. |
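A minimal sketch (not part of the original page) showing the effect on a closure whose source reference was kept at parse time:
# Keep source references when parsing, then strip them:
fn <- eval(parse(text = "function() NULL", keep.source = TRUE))
attr(fn, "srcref")              # a srcref object
attr(zap_srcref(fn), "srcref")  # NULL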