Type: | Package |
Encoding: | UTF-8 |
Title: | Labelled Data Utility Functions |
Version: | 1.2.0 |
Maintainer: | Daniel Lüdecke <d.luedecke@uke.de> |
Description: | Collection of functions dealing with labelled data, like reading and writing data between R and other statistical software packages like 'SPSS', 'SAS' or 'Stata', and working with labelled data. This includes easy ways to get, set or change value and variable label attributes, to convert labelled vectors into factors or numeric (and vice versa), or to deal with multiple declared missing values. |
License: | GPL-3 |
Depends: | R (≥ 3.4) |
Imports: | insight, datawizard, stats, tools, utils |
Suggests: | dplyr, haven (≥ 1.1.2), magrittr, sjmisc, sjPlot, knitr, rlang, rmarkdown, snakecase, testthat |
URL: | https://strengejacke.github.io/sjlabelled/ |
BugReports: | https://github.com/strengejacke/sjlabelled/issues |
RoxygenNote: | 7.1.2 |
VignetteBuilder: | knitr |
NeedsCompilation: | no |
Packaged: | 2022-04-10 09:09:11 UTC; DL |
Author: | Daniel Lüdecke |
Repository: | CRAN |
Date/Publication: | 2022-04-10 09:30:02 UTC |
Labelled Data Utility Functions
Description
Purpose of this package
Collection of miscellaneous utility functions (especially intended for people coming from other statistical software packages like 'SPSS', and/or who are new to R), supporting following common tasks when working with labelled data:
Reading and writing data between R and other statistical software packages like 'SPSS', 'SAS' or 'Stata'
Easy ways to get, set and change value and variable label attributes, to convert labelled vectors into factors (and vice versa), or to deal with multiple declared missing values etc.
Author(s)
Daniel Lüdecke d.luedecke@uke.de
Add, replace or remove value labels of variables
Description
These functions add, replace or remove value labels to or from variables.
Usage
add_labels(x, ..., labels)
replace_labels(x, ..., labels)
remove_labels(x, ..., labels)
Arguments
x |
A vector or data frame. |
... |
Optional, unquoted names of variables that should be selected for
further processing. Required, if |
labels |
|
Details
add_labels()
adds labels
to the existing value
labels of x
, however, unlike set_labels
, it
does not remove labels that were not specified in
labels
. add_labels()
also replaces existing
value labels, but preserves the remaining labels.
remove_labels()
is the counterpart to add_labels()
.
It removes labels from a label attribute of x
.
replace_labels()
is an alias for add_labels()
.
Value
x
with additional or removed value labels. If x
is a data frame, the complete data frame x
will be returned,
with removed or added to variables specified in ...
;
if ...
is not specified, applies to all variables in the
data frame.
See Also
set_label
to manually set variable labels or
get_label
to get variable labels; set_labels
to
add value labels, replacing the existing ones (and removing non-specified
value labels).
Examples
# add_labels()
data(efc)
get_labels(efc$e42dep)
x <- add_labels(efc$e42dep, labels = c(`nothing` = 5))
get_labels(x)
if (require("dplyr")) {
x <- efc %>%
# select three variables
dplyr::select(e42dep, c172code, c161sex) %>%
# only add new label to two of those
add_labels(e42dep, c172code, labels = c(`nothing` = 5))
# see data frame, with selected variables having new labels
get_labels(x)
}
x <- add_labels(efc$e42dep, labels = c(`nothing` = 5, `zero value` = 0))
get_labels(x, values = "p")
# replace old value labels
x <- add_labels(
efc$e42dep,
labels = c(`not so dependent` = 4, `lorem ipsum` = 5)
)
get_labels(x, values = "p")
# replace specific missing value (tagged NA)
if (require("haven")) {
x <- labelled(c(1:3, tagged_na("a", "c", "z"), 4:1),
c("Agreement" = 1, "Disagreement" = 4, "First" = tagged_na("c"),
"Refused" = tagged_na("a"), "Not home" = tagged_na("z")))
# get current NA values
x
# tagged NA(c) has currently the value label "First", will be
# replaced by "Second" now.
replace_labels(x, labels = c("Second" = tagged_na("c")))
}
# remove_labels()
x <- remove_labels(efc$e42dep, labels = 2)
get_labels(x, values = "p")
x <- remove_labels(efc$e42dep, labels = "independent")
get_labels(x, values = "p")
if (require("haven")) {
x <- labelled(c(1:3, tagged_na("a", "c", "z"), 4:1),
c("Agreement" = 1, "Disagreement" = 4, "First" = tagged_na("c"),
"Refused" = tagged_na("a"), "Not home" = tagged_na("z")))
# get current NA values
get_na(x)
get_na(remove_labels(x, labels = tagged_na("c")))
}
Convert variable into factor with associated value labels
Description
as_label()
converts (replaces) values of a variable (also of factors
or character vectors) with their associated value labels. Might
be helpful for factor variables.
For instance, if you have a Gender variable with 0/1 value, and associated
labels are male/female, this function would convert all 0 to male and
all 1 to female and returns the new variable as factor.
as_character()
does the same as as_label()
, but returns
a character vector.
Usage
as_character(x, ...)
to_character(x, ...)
## S3 method for class 'data.frame'
as_character(
x,
...,
add.non.labelled = FALSE,
prefix = FALSE,
var.label = NULL,
drop.na = TRUE,
drop.levels = FALSE,
keep.labels = FALSE
)
as_label(x, ...)
to_label(x, ...)
## S3 method for class 'data.frame'
as_label(
x,
...,
add.non.labelled = FALSE,
prefix = FALSE,
var.label = NULL,
drop.na = TRUE,
drop.levels = FALSE,
keep.labels = FALSE
)
Arguments
x |
A vector or data frame. |
... |
Optional, unquoted names of variables that should be selected for
further processing. Required, if |
add.non.labelled |
Logical, if |
prefix |
Logical, if |
var.label |
Optional string, to set variable label attribute for the
returned variable (see vignette Labelled Data and the sjlabelled-Package).
If |
drop.na |
Logical, if |
drop.levels |
Logical, if |
keep.labels |
Logical, if |
Details
See 'Details' in get_na
.
Value
A factor with the associated value labels as factor levels. If x
is a data frame, the complete data frame x
will be returned,
where variables specified in ...
are coerced to factors;
if ...
is not specified, applies to all variables in the
data frame. as_character()
returns a character vector.
Note
Value label attributes (see get_labels
)
will be removed when converting variables to factors.
Examples
data(efc)
print(get_labels(efc)['c161sex'])
head(efc$c161sex)
head(as_label(efc$c161sex))
print(get_labels(efc)['e42dep'])
table(efc$e42dep)
table(as_label(efc$e42dep))
head(efc$e42dep)
head(as_label(efc$e42dep))
# structure of numeric values won't be changed
# by this function, it only applies to labelled vectors
# (typically categorical or factor variables)
str(efc$e17age)
str(as_label(efc$e17age))
# factor with non-numeric levels
as_label(factor(c("a", "b", "c")))
# factor with non-numeric levels, prefixed
x <- factor(c("a", "b", "c"))
x <- set_labels(x, labels = c("ape", "bear", "cat"))
as_label(x, prefix = TRUE)
# create vector
x <- c(1, 2, 3, 2, 4, NA)
# add less labels than values
x <- set_labels(
x,
labels = c("yes", "maybe", "no"),
force.labels = FALSE,
force.values = FALSE
)
# convert to label w/o non-labelled values
as_label(x)
# convert to label, including non-labelled values
as_label(x, add.non.labelled = TRUE)
# create labelled integer, with missing flag
if (require("haven")) {
x <- labelled(
c(1:3, tagged_na("a", "c", "z"), 4:1, 2:3),
c("Agreement" = 1, "Disagreement" = 4, "First" = tagged_na("c"),
"Refused" = tagged_na("a"), "Not home" = tagged_na("z"))
)
# to labelled factor, with missing labels
as_label(x, drop.na = FALSE)
# to labelled factor, missings removed
as_label(x, drop.na = TRUE)
# keep missings, and use non-labelled values as well
as_label(x, add.non.labelled = TRUE, drop.na = FALSE)
}
# convert labelled character to factor
dummy <- c("M", "F", "F", "X")
dummy <- set_labels(
dummy,
labels = c(`M` = "Male", `F` = "Female", `X` = "Refused")
)
get_labels(dummy,, "p")
as_label(dummy)
# drop unused factor levels, but preserve variable label
x <- factor(c("a", "b", "c"), levels = c("a", "b", "c", "d"))
x <- set_labels(x, labels = c("ape", "bear", "cat"))
set_label(x) <- "A factor!"
x
as_label(x, drop.levels = TRUE)
# change variable label
as_label(x, var.label = "New variable label!", drop.levels = TRUE)
# convert to numeric and back again, preserving label attributes
# *and* values in numeric vector
x <- c(0, 1, 0, 4)
x <- set_labels(x, labels = c(`null` = 0, `one` = 1, `four` = 4))
# to factor
as_label(x)
# to factor, back to numeric - values are 1, 2 and 3,
# instead of original 0, 1 and 4
as_numeric(as_label(x))
# preserve label-attributes when converting to factor, use these attributes
# to restore original numeric values when converting back to numeric
as_numeric(as_label(x, keep.labels = TRUE), use.labels = TRUE)
# easily coerce specific variables in a data frame to factor
# and keep other variables, with their class preserved
as_label(efc, e42dep, e16sex, c172code)
Convert variable into factor and keep value labels
Description
This function converts a variable into a factor, but preserves variable and value label attributes.
Usage
as_factor(x, ...)
to_factor(x, ...)
## S3 method for class 'data.frame'
as_factor(x, ..., add.non.labelled = FALSE)
Arguments
x |
A vector or data frame. |
... |
Optional, unquoted names of variables that should be selected for
further processing. Required, if |
add.non.labelled |
Logical, if |
Details
as_factor
converts numeric values into a factor with numeric
levels. as_label
, however, converts a vector into
a factor and uses value labels as factor levels.
Value
A factor, including variable and value labels. If x
is a data frame, the complete data frame x
will be returned,
where variables specified in ...
are coerced
to factors (including variable and value labels);
if ...
is not specified, applies to all variables in the
data frame.
Note
This function is intended for use with vectors that have value and variable
label attributes. Unlike as.factor
, as_factor
converts
a variable into a factor and preserves the value and variable label attributes.
Adding label attributes is automatically done by importing data sets
with one of the read_*
-functions, like read_spss
.
Else, value and variable labels can be manually added to vectors
with set_labels
and set_label
.
Examples
if (require("sjmisc") && require("magrittr")) {
data(efc)
# normal factor conversion, loses value attributes
x <- as.factor(efc$e42dep)
frq(x)
# factor conversion, which keeps value attributes
x <- as_factor(efc$e42dep)
frq(x)
# create partially labelled vector
x <- set_labels(
efc$e42dep,
labels = c(
`1` = "independent",
`4` = "severe dependency",
`9` = "missing value"
))
# only copy existing value labels
as_factor(x) %>% head()
get_labels(as_factor(x), values = "p")
# also add labels to non-labelled values
as_factor(x, add.non.labelled = TRUE) %>% head()
get_labels(as_factor(x, add.non.labelled = TRUE), values = "p")
# easily coerce specific variables in a data frame to factor
# and keep other variables, with their class preserved
as_factor(efc, e42dep, e16sex, c172code) %>% head()
# use select-helpers from dplyr-package
if (require("dplyr")) {
as_factor(efc, contains("cop"), c161sex:c175empl) %>% head()
}
}
Convert vector to labelled class
Description
Converts a (labelled) vector of any class into a labelled
class vector, resp. adds a labelled
class-attribute.
Usage
as_labelled(
x,
add.labels = FALSE,
add.class = FALSE,
skip.strings = FALSE,
tag.na = FALSE
)
Arguments
x |
Variable (vector), |
add.labels |
Logical, if |
add.class |
Logical, if |
skip.strings |
Logical, if |
tag.na |
Logical, if |
Value
x
, as labelled
-class object.
Examples
data(efc)
str(efc$e42dep)
x <- as_labelled(efc$e42dep)
str(x)
x <- as_labelled(efc$e42dep, add.class = TRUE)
str(x)
a <- c(1, 2, 4)
x <- as_labelled(a, add.class = TRUE)
str(x)
data(efc)
x <- set_labels(efc$e42dep,
labels = c(`1` = "independent", `4` = "severe dependency"))
x1 <- as_labelled(x, add.labels = FALSE)
x2 <- as_labelled(x, add.labels = TRUE)
str(x1)
str(x2)
get_values(x1)
get_values(x2)
Convert factors to numeric variables
Description
This function converts (replaces) factor levels with the related factor level index number, thus the factor is converted to a numeric variable.
Usage
as_numeric(x, ...)
to_numeric(x, ...)
## S3 method for class 'data.frame'
as_numeric(x, ..., start.at = NULL, keep.labels = TRUE, use.labels = FALSE)
Arguments
x |
A vector or data frame. |
... |
Optional, unquoted names of variables that should be selected for
further processing. Required, if |
start.at |
Starting index, i.e. the lowest numeric value of the variable's
value range. By default, this argument is |
keep.labels |
Logical, if |
use.labels |
Logical, if |
Value
A numeric variable with values ranging either from start.at
to
start.at
+ length of factor levels, or to the corresponding
factor levels (if these were numeric). If x
is a data frame,
the complete data frame x
will be returned, where variables
specified in ...
are coerced to numeric; if ...
is
not specified, applies to all variables in the data frame.
Examples
data(efc)
test <- as_label(efc$e42dep)
table(test)
table(as_numeric(test))
hist(as_numeric(test, start.at = 0))
# set lowest value of new variable to "5".
table(as_numeric(test, start.at = 5))
# numeric factor keeps values
dummy <- factor(c("3", "4", "6"))
table(as_numeric(dummy))
# do not drop unused factor levels
dummy <- ordered(c(rep("No", 5), rep("Maybe", 3)),
levels = c("Yes", "No", "Maybe"))
as_numeric(dummy)
# non-numeric factor is converted to numeric
# starting at 1
dummy <- factor(c("D", "F", "H"))
table(as_numeric(dummy))
# for numeric factor levels, value labels will be used, if present
dummy1 <- factor(c("3", "4", "6"))
dummy1 <- set_labels(dummy1, labels = c("first", "2nd", "3rd"))
dummy1
as_numeric(dummy1)
# for non-numeric factor levels, these will be used.
# value labels will be ignored
dummy2 <- factor(c("D", "F", "H"))
dummy2 <- set_labels(dummy2, labels = c("first", "2nd", "3rd"))
dummy2
as_numeric(dummy2)
# easily coerce specific variables in a data frame to numeric
# and keep other variables, with their class preserved
data(efc)
efc$e42dep <- as.factor(efc$e42dep)
efc$e16sex <- as.factor(efc$e16sex)
efc$e17age <- as.factor(efc$e17age)
# convert back "sex" and "age" into numeric
head(as_numeric(efc, e16sex, e17age))
x <- factor(c("None", "Little", "Some", "Lots"))
x <- set_labels(x,
labels = c(None = "0.5", Little = "1.3", Some = "1.8", Lots = ".2")
)
x
as_numeric(x)
as_numeric(x, use.labels = TRUE)
as_numeric(x, use.labels = TRUE, keep.labels = FALSE)
Generic case conversion for labels
Description
This function wraps to_any_case()
from the snakecase
package with certain defaults for the sep_in
and
sep_out
arguments, used for instance to convert cases in
term_labels
.
Usage
convert_case(lab, case = NULL, verbose = FALSE, ...)
Arguments
lab |
Character vector that should be case converted. |
case |
Desired target case. Labels will automatically converted into the
specified character case. See |
verbose |
Toggle warnings and messages on or off. |
... |
Further arguments passed down to |
Details
When calling to_any_case()
from snakecase, the
sep_in
argument is set to "(?<!\\d)\\."
, and the
sep_out
to " "
. This gives feasible results from variable
labels for plot annotations.
Value
lab
, with converted case.
Examples
data(iris)
convert_case(colnames(iris))
convert_case(colnames(iris), case = "snake")
Copy value and variable labels to (subsetted) data frames
Description
Subsetting-functions usually drop value and variable labels from
subsetted data frames (if the original data frame has value and variable
label attributes). This function copies these value and variable
labels back to subsetted data frames that have been subsetted, for instance,
with subset
.
Usage
copy_labels(df_new, df_origin = NULL, ...)
Arguments
df_new |
The new, subsetted data frame. |
df_origin |
The original data frame where the subset ( |
... |
Optional, unquoted names of variables that should be selected for
further processing. Required, if |
Value
Returns df_new
with either removed value and variable label attributes
(if df_origin = NULL
) or with copied value and variable label
attributes (if df_origin
was the original subsetted data frame).
Note
In case df_origin = NULL
, all possible label attributes
from df_new
are removed.
Examples
data(efc)
# create subset - drops label attributes
efc.sub <- subset(efc, subset = e16sex == 1, select = c(4:8))
str(efc.sub)
# copy back attributes from original dataframe
efc.sub <- copy_labels(efc.sub, efc)
str(efc.sub)
# remove all labels
efc.sub <- copy_labels(efc.sub)
str(efc.sub)
# create subset - drops label attributes
efc.sub <- subset(efc, subset = e16sex == 1, select = c(4:8))
if (require("dplyr")) {
# create subset with dplyr's select - attributes are preserved
efc.sub2 <- select(efc, c160age, e42dep, neg_c_7, c82cop1, c84cop3)
# copy labels from those columns that are available
copy_labels(efc.sub, efc.sub2) %>% str()
}
# copy labels from only some columns
str(copy_labels(efc.sub, efc, e42dep))
str(copy_labels(efc.sub, efc, -e17age))
Drop, add or convert (non-)labelled values
Description
For (partially) labelled vectors, zap_labels()
will replace
all values that have a value label attribute with NA
;
zap_unlabelled()
, as counterpart, will replace all values
that don't have a value label attribute with NA
.
drop_labels()
drops all value labels for unused values,
i.e. values that are not present in a vector. fill_labels()
is the
counterpart to drop_labels()
and adds value labels to
a partially labelled vector, i.e. if not all values are
labelled, non-labelled values get labels.
Usage
drop_labels(x, ..., drop.na = TRUE)
fill_labels(x, ...)
zap_labels(x, ...)
zap_unlabelled(x, ...)
Arguments
x |
(partially) |
... |
Optional, unquoted names of variables that should be selected for
further processing. Required, if |
drop.na |
Logical, whether existing value labels of tagged NA values
(see |
Value
For
zap_labels()
,x
, where all labelled values are converted toNA
.For
zap_unlabelled()
,x
, where all non-labelled values are converted toNA
.For
drop_labels()
,x
, where value labels for non-existing values are removed.For
fill_labels()
,x
, where labels for non-labelled values are added.
If x
is a data frame, the complete data frame x
will be
returned, with variables specified in ...
being converted;
if ...
is not specified, applies to all variables in the
data frame.
Examples
if (require("sjmisc") && require("dplyr")) {
# zap_labels() ----
data(efc)
str(efc$e42dep)
x <- set_labels(
efc$e42dep,
labels = c("independent" = 1, "severe dependency" = 4)
)
table(x)
get_values(x)
str(x)
# zap all labelled values
table(zap_labels(x))
get_values(zap_labels(x))
str(zap_labels(x))
# zap all unlabelled values
table(zap_unlabelled(x))
get_values(zap_unlabelled(x))
str(zap_unlabelled(x))
# in a pipe-workflow
efc %>%
select(c172code, e42dep) %>%
set_labels(
e42dep,
labels = c("independent" = 1, "severe dependency" = 4)
) %>%
zap_labels()
# drop_labels() ----
rp <- rec_pattern(1, 100)
rp
# sample data
data(efc)
# recode carers age into groups of width 5
x <- rec(efc$c160age, rec = rp$pattern)
# add value labels to new vector
x <- set_labels(x, labels = rp$labels)
# watch result. due to recode-pattern, we have age groups with
# no observations (zero-counts)
frq(x)
# now, let's drop zero's
frq(drop_labels(x))
# drop labels, also drop NA value labels, then also zap tagged NA
if (require("haven")) {
x <- labelled(c(1:3, tagged_na("z"), 4:1),
c("Agreement" = 1, "Disagreement" = 4, "Unused" = 5,
"Not home" = tagged_na("z")))
x
drop_labels(x, drop.na = FALSE)
drop_labels(x)
zap_na_tags(drop_labels(x))
# fill_labels() ----
# create labelled integer, with tagged missings
x <- labelled(
c(1:3, tagged_na("a", "c", "z"), 4:1),
c("Agreement" = 1, "Disagreement" = 4, "First" = tagged_na("c"),
"Refused" = tagged_na("a"), "Not home" = tagged_na("z"))
)
# get current values and labels
x
get_labels(x)
fill_labels(x)
get_labels(fill_labels(x))
# same as
get_labels(x, non.labelled = TRUE)
}
}
Sample dataset from the EUROFAMCARE project
Description
A SPSS sample data set, imported with the read_spss
function.
Examples
# Attach EFC-data
data(efc)
# Show structure
str(efc)
# show first rows
head(efc)
# show variables
## Not run:
library(sjPlot)
view_df(efc)
# show variable labels
get_label(efc)
# plot efc-data frame summary
sjt.df(efc, altr.row.col = TRUE)
## End(Not run)
Retrieve variable label(s) of labelled data
Description
This function returns the variable labels of labelled data.
Usage
get_label(x, ..., def.value = NULL, case = NULL)
Arguments
x |
A data frame with variables that have label attributes (e.g.
from an imported SPSS, SAS or STATA data set, via |
... |
Optional, names of variables, where labels should be retrieved.
Required, if either data is a data frame and no vector, or if only
selected variables from |
def.value |
Optional, a character string which will be returned as label
if |
case |
Desired target case. Labels will automatically converted into the
specified character case. See |
Value
A named character vector with all variable labels from the data frame or list;
or a simple character vector (of length 1) with the variable label, if x
is a variable.
If x
is a single vector and has no label attribute, the value
of def.value
will be returned (which is by default NULL
).
Note
var_labels
is an alternative way to set variable labels,
which follows the philosophy of tidyvers API design (data as first argument,
dots as value pairs indicating variables)
See Also
See vignette Labelled Data and the sjlabelled-Package
for more details; set_label
to manually set variable labels or get_labels
to get value labels; var_labels
to set multiple variable
labels at once.
Examples
# import SPSS data set
# mydat <- read_spss("my_spss_data.sav", enc="UTF-8")
# retrieve variable labels
# mydat.var <- get_label(mydat)
# retrieve value labels
# mydat.val <- get_labels(mydat)
data(efc)
# get variable lable
get_label(efc$e42dep)
# alternative way
get_label(efc)["e42dep"]
# 'get_label()' also works within pipe-chains
library(magrittr)
efc %>% get_label(e42dep, e16sex)
# set default values
get_label(mtcars, mpg, cyl, def.value = "no var labels")
# simple barplot
barplot(table(efc$e42dep))
# get value labels to annotate barplot
barplot(table(efc$e42dep),
names.arg = get_labels(efc$e42dep),
main = get_label(efc$e42dep))
# get labels from multiple variables
get_label(list(efc$e42dep, efc$e16sex, efc$e15relat))
# use case conversion for human-readable labels
data(iris)
get_label(iris, def.value = colnames(iris))
get_label(iris, def.value = colnames(iris), case = "parsed")
Retrieve value labels of labelled data
Description
This function returns the value labels of labelled data.
Usage
get_labels(
x,
attr.only = FALSE,
values = NULL,
non.labelled = FALSE,
drop.na = TRUE,
drop.unused = FALSE
)
Arguments
x |
A data frame with variables that have value label attributes (e.g.
from an imported SPSS, SAS or STATA data set, via |
attr.only |
Logical, if |
values |
String, indicating whether the values associated with the
value labels are returned as well. If |
non.labelled |
Logical, if |
drop.na |
Logical, whether labels of tagged NA values (see |
drop.unused |
Logical, if |
Value
Either a list with all value labels from all variables if x
is a data.frame
or list
; a string with the value
labels, if x
is a variable;
or NULL
if no value label attribute was found.
See Also
See vignette Labelled Data and the sjlabelled-Package
for more details; set_labels
to manually set value
labels, get_label
to get variable labels and
get_values
to retrieve the values associated
with value labels.
Examples
# import SPSS data set
# mydat <- read_spss("my_spss_data.sav")
# retrieve variable labels
# mydat.var <- get_label(mydat)
# retrieve value labels
# mydat.val <- get_labels(mydat)
data(efc)
get_labels(efc$e42dep)
# simple barplot
barplot(table(efc$e42dep))
# get value labels to annotate barplot
barplot(table(efc$e42dep),
names.arg = get_labels(efc$e42dep),
main = get_label(efc$e42dep))
# include associated values
get_labels(efc$e42dep, values = "as.name")
# include associated values
get_labels(efc$e42dep, values = "as.prefix")
# get labels from multiple variables
get_labels(list(efc$e42dep, efc$e16sex, efc$e15relat))
# create a dummy factor
f1 <- factor(c("hi", "low", "mid"))
# search for label attributes only
get_labels(f1, attr.only = TRUE)
# search for factor levels as well
get_labels(f1)
# same for character vectors
c1 <- c("higher", "lower", "mid")
# search for label attributes only
get_labels(c1, attr.only = TRUE)
# search for string values as well
get_labels(c1)
# create vector
x <- c(1, 2, 3, 2, 4, NA)
# add less labels than values
x <- set_labels(x, labels = c("yes", "maybe", "no"), force.values = FALSE)
# get labels for labelled values only
get_labels(x)
# get labels for all values
get_labels(x, non.labelled = TRUE)
# get labels, including tagged NA values
library(haven)
x <- labelled(c(1:3, tagged_na("a", "c", "z"), 4:1),
c("Agreement" = 1, "Disagreement" = 4, "First" = tagged_na("c"),
"Refused" = tagged_na("a"), "Not home" = tagged_na("z")))
# get current NA values
x
get_labels(x, values = "n", drop.na = FALSE)
# create vector with unused labels
data(efc)
efc$e42dep <- set_labels(
efc$e42dep,
labels = c("independent" = 1, "dependent" = 4, "not used" = 5)
)
get_labels(efc$e42dep)
get_labels(efc$e42dep, drop.unused = TRUE)
get_labels(efc$e42dep, non.labelled = TRUE, drop.unused = TRUE)
Retrieve tagged NA values of labelled variables
Description
This function retrieves tagged NA values and their associated value labels from a labelled vector.
Usage
get_na(x, as.tag = FALSE)
Arguments
x |
Variable (vector) with value label attributes, including
tagged missing values (see |
as.tag |
Logical, if |
Details
Other statistical software packages (like 'SPSS' or 'SAS') allow to define
multiple missing values, e.g. not applicable, refused answer
or "real" missing. These missing types may be assigned with
different values, so it is possible to distinguish between these
missing types. In R, multiple declared missings cannot be represented
in a similar way with the regular missing values. However,
tagged_na()
values can do this.
Tagged NA
s work exactly like regular R missing values
except that they store one additional byte of information: a tag,
which is usually a letter ("a" to "z") or character number ("0" to "9").
This allows to indicate different missings.
Furthermore, see 'Details' in get_values
.
Value
The tagged missing values and their associated value labels from x
,
or NULL
if x
has no tagged missing values.
Examples
library(haven)
x <- labelled(c(1:3, tagged_na("a", "c", "z"), 4:1),
c("Agreement" = 1, "Disagreement" = 4, "First" = tagged_na("c"),
"Refused" = tagged_na("a"), "Not home" = tagged_na("z")))
# get current NA values
x
get_na(x)
# which NA has which tag?
get_na(x, as.tag = TRUE)
# replace only the NA, which is tagged as NA(c)
if (require("sjmisc")) {
replace_na(x, value = 2, tagged.na = "c")
get_na(replace_na(x, value = 2, tagged.na = "c"))
# data frame as input
y <- labelled(c(2:3, 3:1, tagged_na("y"), 4:1),
c("Agreement" = 1, "Disagreement" = 4, "Why" = tagged_na("y")))
get_na(data.frame(x, y))
}
Retrieve values of labelled variables
Description
This function retrieves the values associated with value labels
from labelled
vectors. Data is also labelled
when imported from SPSS, SAS or STATA via read_spss
,
read_sas
or read_stata
.
Usage
get_values(x, sort.val = TRUE, drop.na = FALSE)
Arguments
x |
Variable (vector) with value label attributes; or a data frame or list with such variables. |
sort.val |
Logical, if |
drop.na |
Logical, if |
Details
labelled
vectors are numeric by default (when imported with read-functions
like read_spss
) and have variable and value labels attributes.
The value labels are associated with the values from the labelled vector.
This function returns the values associated with the vector's value labels,
which may differ from actual values in the vector (e.g. if not all
values have a related label).
Value
The values associated with value labels from x
,
or NULL
if x
has no label attributes.
See Also
get_labels
for getting value labels and get_na
to get values for missing values.
Examples
data(efc)
str(efc$e42dep)
get_values(efc$e42dep)
get_labels(efc$e42dep)
library(haven)
x <- labelled(c(1:3, tagged_na("a", "c", "z"), 4:1),
c("Agreement" = 1, "Disagreement" = 4, "First" = tagged_na("c"),
"Refused" = tagged_na("a"), "Not home" = tagged_na("z")))
# get all values
get_values(x)
# drop NA
get_values(x, drop.na = TRUE)
# data frame as input
y <- labelled(c(2:3, 3:1, tagged_na("y"), 4:1),
c("Agreement" = 1, "Disagreement" = 4, "Why" = tagged_na("y")))
get_values(data.frame(x, y))
Check whether object is of class "labelled"
Description
This function checks whether x
is of class labelled
.
Usage
is_labelled(x)
Arguments
x |
An object. |
Value
Logical, TRUE
if x
inherits from class labelled
,
FALSE
otherwise.
Use variable labels as column names
Description
This function sets variable labels as column names, to use "labelled data" also for those functions that cannot cope with labelled data by default.
Usage
label_to_colnames(x, ...)
Arguments
x |
A data frame. |
... |
Optional, unquoted names of variables that should be selected for
further processing. Required, if |
Value
x
with variable labels as column names. For variables without
variable labels, the column name is left unchanged.
Examples
data(iris)
iris <- var_labels(
iris,
Petal.Length = "Petal length (cm)",
Petal.Width = "Petal width (cm)"
)
colnames(iris)
plot(iris)
colnames(label_to_colnames(iris))
plot(label_to_colnames(iris))
Import data from other statistical software packages
Description
Import data from SPSS, SAS or Stata, including NA's, value and variable labels.
Usage
read_spss(
path,
convert.factors = TRUE,
drop.labels = FALSE,
tag.na = FALSE,
encoding = NULL,
verbose = FALSE,
atomic.to.fac = convert.factors
)
read_sas(
path,
path.cat = NULL,
convert.factors = TRUE,
drop.labels = FALSE,
encoding = NULL,
verbose = FALSE,
atomic.to.fac = convert.factors
)
read_stata(
path,
convert.factors = TRUE,
drop.labels = FALSE,
encoding = NULL,
verbose = FALSE,
atomic.to.fac = convert.factors
)
read_data(
path,
convert.factors = TRUE,
drop.labels = FALSE,
encoding = NULL,
verbose = FALSE,
atomic.to.fac = convert.factors
)
Arguments
path |
File path to the data file. |
convert.factors |
Logical, if |
drop.labels |
Logical, if |
tag.na |
Logical, if |
encoding |
The character encoding used for the file. This defaults to the encoding specified in the file, or UTF-8. Use this argument to override the default encoding stored in the file. |
verbose |
Logical, if |
atomic.to.fac |
Deprecated, please use 'convert.factors' instead. |
path.cat |
Optional, the file path to the SAS catalog file. |
Details
These read-functions behave slightly differently from haven's read-functions:
The vectors in the returned data frame are of class
atomic
, not of classlabelled
. The labelled-class might cause issues with other packages.When importing SPSS data, variables with user defined missings won't be read into
labelled_spss
objects, but imported as tagged NA values.
The convert.factors
option only
converts those variables into factors that are of class atomic
and
which have value labels after import. Atomic vectors without value labels
are considered as continuous and not converted to factors.
Value
A data frame containing the imported, labelled data. Retrieve value labels with
get_labels
and variable labels with get_label
.
Note
These are wrapper functions for haven's read_*
-functions.
See Also
Vignette Labelled Data and the sjlabelled-Package.
Examples
## Not run:
# import SPSS data set. uses haven's read function
mydat <- read_spss("my_spss_data.sav")
# use haven's read function, convert atomic to factor
mydat <- read_spss("my_spss_data.sav", convert.factors = TRUE)
# retrieve variable labels
mydat.var <- get_label(mydat)
# retrieve value labels
mydat.val <- get_labels(mydat)
## End(Not run)
Remove value and variable labels from vector or data frame
Description
This function removes value and variable label attributes
from a vector or data frame. These attributes are typically
added to variables when importing foreign data (see
read_spss
) or manually adding label attributes
with set_labels
.
Usage
remove_all_labels(x)
Arguments
x |
Vector or |
Value
x
with removed value and variable label attributes.
See Also
See vignette Labelled Data and the sjlabelled-Package,
and copy_labels
for adding label attributes
(subsetted) data frames.
Examples
data(efc)
str(efc)
str(remove_all_labels(efc))
Remove variable labels from variables
Description
Remove variable labels from variables.
Usage
remove_label(x, ...)
Arguments
x |
A vector or data frame. |
... |
Optional, unquoted names of variables that should be selected for
further processing. Required, if |
Value
x
with removed variable labels
See Also
set_label
to manually set variable labels or
get_label
to get variable labels; set_labels
to
add value labels, replacing the existing ones (and removing non-specified
value labels).
Examples
data(efc)
x <- efc[, 1:5]
get_label(x)
str(x)
x <- remove_label(x)
get_label(x)
str(x)
Add variable label(s) to variables
Description
This function adds variable labels as attribute
(named "label"
) to the variable x
, resp. to a
set of variables in a data frame or a list-object. var_labels()
is intended for use within pipe-workflows and has a tidyverse-consistent
syntax, including support for quasi-quotation (see 'Examples').
Usage
set_label(x, label)
set_label(x) <- value
var_labels(x, ...)
Arguments
x |
Variable (vector), list of variables or a data frame where variables
labels should be added as attribute. For |
label |
If |
value |
See |
... |
Pairs of named vectors, where the name equals the variable name, which should be labelled, and the value is the new variable label. |
Value
x
, with variable label attribute(s), which contains the
variable name(s); or with removed label-attribute if
label = ""
.
See Also
See vignette Labelled Data and the sjlabelled-Package
for more details; set_labels
to manually set value labels or get_label
to get variable labels.
Examples
# manually set value and variable labels
dummy <- sample(1:4, 40, replace = TRUE)
dummy <- set_labels(dummy, labels = c("very low", "low", "mid", "hi"))
dummy <- set_label(dummy, label = "Dummy-variable")
# or use:
# set_label(dummy) <- "Dummy-variable"
# auto-detection of value labels by default, auto-detection of
# variable labels if argument "title" set to NULL.
## Not run:
library(sjPlot)
sjp.frq(dummy, title = NULL)
## End(Not run)
# Set variable labels for data frame
dummy <- data.frame(
a = sample(1:4, 10, replace = TRUE),
b = sample(1:4, 10, replace = TRUE),
c = sample(1:4, 10, replace = TRUE)
)
dummy <- set_label(dummy, c("Variable A", "Variable B", "Variable C"))
str(dummy)
# remove one variable label
dummy <- set_label(dummy, c("Variable A", "", "Variable C"))
str(dummy)
# setting same variable labels to multiple vectors
# create a set of dummy variables
dummy1 <- sample(1:4, 40, replace = TRUE)
dummy2 <- sample(1:4, 40, replace = TRUE)
dummy3 <- sample(1:4, 40, replace = TRUE)
# put them in list-object
dummies <- list(dummy1, dummy2, dummy3)
# and set variable labels for all three dummies
dummies <- set_label(dummies, c("First Dummy", "2nd Dummy", "Third dummy"))
# see result...
get_label(dummies)
# use 'var_labels()' to set labels within a pipe-workflow, and
# when you need "tidyverse-consistent" api.
# Set variable labels for data frame
dummy <- data.frame(
a = sample(1:4, 10, replace = TRUE),
b = sample(1:4, 10, replace = TRUE),
c = sample(1:4, 10, replace = TRUE)
)
library(magrittr)
dummy %>%
var_labels(a = "First variable", c = "third variable") %>%
get_label()
# with quasi-quotation
library(rlang)
v1 <- "First variable"
v2 <- "Third variable"
dummy %>%
var_labels(a = !!v1, c = !!v2) %>%
get_label()
x1 <- "a"
x2 <- "c"
dummy %>%
var_labels(!!x1 := !!v1, !!x2 := !!v2) %>%
get_label()
Add value labels to variables
Description
This function adds labels as attribute (named "labels"
)
to a variable or vector x
, resp. to a set of variables in a
data frame or a list-object. A use-case is, for instance, the
sjPlot-package, which supports labelled data and automatically
assigns labels to axes or legends in plots or to be used in tables.
val_labels()
is intended for use within pipe-workflows and has a
tidyverse-consistent syntax, including support for quasi-quotation
(see 'Examples').
Usage
set_labels(
x,
...,
labels,
force.labels = FALSE,
force.values = TRUE,
drop.na = TRUE
)
val_labels(x, ..., force.labels = FALSE, force.values = TRUE, drop.na = TRUE)
Arguments
x |
A vector or data frame. |
... |
For |
labels |
(Named) character vector of labels that will be added to
Use |
force.labels |
Logical; if |
force.values |
Logical, if |
drop.na |
Logical, whether existing value labels of tagged NA values
(see |
Value
x
with value label attributes; or with removed label-attributes if
labels = ""
. If x
is a data frame, the complete data
frame x
will be returned, with removed or added to variables
specified in ...
; if ...
is not specified, applies
to all variables in the data frame.
Note
if
labels
is a named vector,force.labels
andforce.values
will be ignored, and only values defined inlabels
will be labelled;if
x
has less unique values thanlabels
, redundant labels will be dropped, seeforce.labels
;if
x
has more unique values thanlabels
, only matching values will be labelled, other values remain unlabelled, seeforce.values
;
If you only want to change partial value labels, use add_labels
instead.
Furthermore, see 'Note' in get_labels
.
See Also
See vignette Labelled Data and the sjlabelled-Package
for more details; set_label
to manually set variable labels or
get_label
to get variable labels; add_labels
to
add additional value labels without replacing the existing ones.
Examples
if (require("sjmisc")) {
dummy <- sample(1:4, 40, replace = TRUE)
frq(dummy)
dummy <- set_labels(dummy, labels = c("very low", "low", "mid", "hi"))
frq(dummy)
# assign labels with named vector
dummy <- sample(1:4, 40, replace = TRUE)
dummy <- set_labels(dummy, labels = c("very low" = 1, "very high" = 4))
frq(dummy)
# force using all labels, even if not all labels
# have associated values in vector
x <- c(2, 2, 3, 3, 2)
# only two value labels
x <- set_labels(x, labels = c("1", "2", "3"))
x
frq(x)
# all three value labels
x <- set_labels(x, labels = c("1", "2", "3"), force.labels = TRUE)
x
frq(x)
# create vector
x <- c(1, 2, 3, 2, 4, NA)
# add less labels than values
x <- set_labels(x, labels = c("yes", "maybe", "no"), force.values = FALSE)
x
# add all necessary labels
x <- set_labels(x, labels = c("yes", "maybe", "no"), force.values = TRUE)
x
# set labels and missings
x <- c(1, 1, 1, 2, 2, -2, 3, 3, 3, 3, 3, 9)
x <- set_labels(x, labels = c("Refused", "One", "Two", "Three", "Missing"))
x
set_na(x, na = c(-2, 9))
}
if (require("haven") && require("sjmisc")) {
x <- labelled(
c(1:3, tagged_na("a", "c", "z"), 4:1),
c("Agreement" = 1, "Disagreement" = 4, "First" = tagged_na("c"),
"Refused" = tagged_na("a"), "Not home" = tagged_na("z"))
)
# get current NA values
x
get_na(x)
# lose value labels from tagged NA by default, if not specified
set_labels(x, labels = c("New Three" = 3))
# do not drop na
set_labels(x, labels = c("New Three" = 3), drop.na = FALSE)
# set labels via named vector,
# not using all possible values
data(efc)
get_labels(efc$e42dep)
x <- set_labels(
efc$e42dep,
labels = c(`independent` = 1,
`severe dependency` = 2,
`missing value` = 9)
)
get_labels(x, values = "p")
get_labels(x, values = "p", non.labelled = TRUE)
# labels can also be set for tagged NA value
# create numeric vector
x <- c(1, 2, 3, 4)
# set 2 and 3 as missing, which will automatically set as
# tagged NA by 'set_na()'
x <- set_na(x, na = c(2, 3))
x
# set label via named vector just for tagged NA(3)
set_labels(x, labels = c(`New Value` = tagged_na("3")))
# setting same value labels to multiple vectors
dummies <- data.frame(
dummy1 = sample(1:4, 40, replace = TRUE),
dummy2 = sample(1:4, 40, replace = TRUE),
dummy3 = sample(1:4, 40, replace = TRUE)
)
# and set same value labels for two of three variables
test <- set_labels(
dummies, dummy1, dummy2,
labels = c("very low", "low", "mid", "hi")
)
# see result...
get_labels(test)
}
# using quasi-quotation
if (require("rlang") && require("dplyr")) {
dummies <- data.frame(
dummy1 = sample(1:4, 40, replace = TRUE),
dummy2 = sample(1:4, 40, replace = TRUE),
dummy3 = sample(1:4, 40, replace = TRUE)
)
x1 <- "dummy1"
x2 <- c("so low", "rather low", "mid", "very hi")
dummies %>%
val_labels(
!!x1 := c("really low", "low", "a bit mid", "hi"),
dummy3 = !!x2
) %>%
get_labels()
# ... and named vectors to explicitly set value labels
x2 <- c("so low" = 4, "rather low" = 3, "mid" = 2, "very hi" = 1)
dummies %>%
val_labels(
!!x1 := c("really low" = 1, "low" = 3, "a bit mid" = 2, "hi" = 4),
dummy3 = !!x2
) %>% get_labels(values = "p")
}
Replace specific values in vector with NA
Description
This function replaces specific values of variables with NA
.
Usage
set_na(x, ..., na, drop.levels = TRUE, as.tag = FALSE)
Arguments
x |
A vector or data frame. |
... |
Optional, unquoted names of variables that should be selected for
further processing. Required, if |
na |
Numeric vector with values that should be replaced with NA values,
or a character vector if values of factors or character vectors should be
replaced. For labelled vectors, may also be the name of a value label. In
this case, the associated values for the value labels in each vector
will be replaced with |
drop.levels |
Logical, if |
as.tag |
Logical, if |
Details
set_na()
converts all values defined in na
with
a related NA
or tagged NA value (see tagged_na()
).
Tagged NA
s work exactly like regular R missing values
except that they store one additional byte of information: a tag,
which is usually a letter ("a" to "z") or character number ("0" to "9").
Different NA values for different variables
If na
is a named vector and as.tag = FALSE
, the names
indicate variable names, and the associated values indicate those values
that should be replaced by NA
in the related variable. For instance,
set_na(x, na = c(v1 = 4, v2 = 3))
would replace all 4 in v1
with NA
and all 3 in v2
with NA
.
If na
is a named list and as.tag = FALSE
, it is possible
to replace different multiple values by NA
for different variables
separately. For example, set_na(x, na = list(v1 = c(1, 4), v2 = 5:7))
would replace all 1 and 4 in v1
with NA
and all 5 to 7 in
v2
with NA
.
Furthermore, see also 'Details' in get_na
.
Value
x
, with all values in na
being replaced by NA
.
If x
is a data frame, the complete data frame x
will
be returned, with NA's set for variables specified in ...
;
if ...
is not specified, applies to all variables in the
data frame.
Note
Labels from values that are replaced with NA and no longer used will be
removed from x
, however, other value and variable label
attributes are preserved. For more details on labelled data,
see vignette Labelled Data and the sjlabelled-Package.
Examples
if (require("sjmisc") && require("dplyr") && require("haven")) {
# create random variable
dummy <- sample(1:8, 100, replace = TRUE)
# show value distribution
table(dummy)
# set value 1 and 8 as missings
dummy <- set_na(dummy, na = c(1, 8))
# show value distribution, including missings
table(dummy, useNA = "always")
# add named vector as further missing value
set_na(dummy, na = c("Refused" = 5), as.tag = TRUE)
# see different missing types
print_tagged_na(set_na(dummy, na = c("Refused" = 5), as.tag = TRUE))
# create sample data frame
dummy <- data.frame(var1 = sample(1:8, 100, replace = TRUE),
var2 = sample(1:10, 100, replace = TRUE),
var3 = sample(1:6, 100, replace = TRUE))
# set value 2 and 4 as missings
dummy %>% set_na(na = c(2, 4)) %>% head()
dummy %>% set_na(na = c(2, 4), as.tag = TRUE) %>% get_na()
dummy %>% set_na(na = c(2, 4), as.tag = TRUE) %>% get_values()
data(efc)
dummy <- data.frame(
var1 = efc$c82cop1,
var2 = efc$c83cop2,
var3 = efc$c84cop3
)
# check original distribution of categories
lapply(dummy, table, useNA = "always")
# set 3 to NA for two variables
lapply(set_na(dummy, var1, var3, na = 3), table, useNA = "always")
# if 'na' is a named vector *and* 'as.tag = FALSE', different NA-values
# can be specified for each variable
set.seed(1)
dummy <- data.frame(
var1 = sample(1:8, 10, replace = TRUE),
var2 = sample(1:10, 10, replace = TRUE),
var3 = sample(1:6, 10, replace = TRUE)
)
dummy
# Replace "3" in var1 with NA, "5" in var2 and "6" in var3
set_na(dummy, na = c(var1 = 3, var2 = 5, var3 = 6))
# if 'na' is a named list *and* 'as.tag = FALSE', for each
# variable different multiple NA-values can be specified
set_na(dummy, na = list(var1 = 1:3, var2 = c(7, 8), var3 = 6))
# drop unused factor levels when being set to NA
x <- factor(c("a", "b", "c"))
x
set_na(x, na = "b", as.tag = TRUE)
set_na(x, na = "b", drop.levels = FALSE, as.tag = TRUE)
# set_na() can also remove a missing by defining the value label
# of the value that should be replaced with NA. This is in particular
# helpful if a certain category should be set as NA, however, this category
# is assigned with different values accross variables
x1 <- sample(1:4, 20, replace = TRUE)
x2 <- sample(1:7, 20, replace = TRUE)
x1 <- set_labels(x1, labels = c("Refused" = 3, "No answer" = 4))
x2 <- set_labels(x2, labels = c("Refused" = 6, "No answer" = 7))
tmp <- data.frame(x1, x2)
get_labels(tmp)
table(tmp, useNA = "always")
get_labels(set_na(tmp, na = "No answer"))
table(set_na(tmp, na = "No answer"), useNA = "always")
# show values
tmp
set_na(tmp, na = c("Refused", "No answer"))
}
Retrieve labels of model terms from regression models
Description
This function retrieves variable labels from model terms. In case of categorical variables, where one variable has multiple dummies, variable name and category value is returned.
Usage
term_labels(
models,
mark.cat = FALSE,
case = NULL,
prefix = c("none", "varname", "label"),
...
)
get_term_labels(
models,
mark.cat = FALSE,
case = NULL,
prefix = c("none", "varname", "label"),
...
)
response_labels(models, case = NULL, multi.resp = FALSE, mv = FALSE, ...)
get_dv_labels(models, case = NULL, multi.resp = FALSE, mv = FALSE, ...)
Arguments
models |
One or more fitted regression models. May also be glm's or mixed models. |
mark.cat |
Logical, if |
case |
Desired target case. Labels will automatically converted into the
specified character case. See |
prefix |
Indicates whether the value labels of categorical variables should be prefixed, e.g. with the variable name or variable label. May be abbreviated. See 'Examples', |
... |
Further arguments passed down to |
mv , multi.resp |
Logical, if |
Details
Typically, the variable labels from model terms are returned. However,
for categorical terms that have estimates for each category, the
value labels are returned as well. As the return value is a named
vector, you can easily use it with ggplot2's scale_*()
functions to annotate plots.
Value
For term_labels()
, a (named) character vector with
variable labels of all model terms, which can be used, for instance,
as axis labels to annotate plots.
For response_labels()
,
a character vector with variable labels from all dependent variables
of models
.
Examples
# use data set with labelled data
data(efc)
fit <- lm(barthtot ~ c160age + c12hour + c161sex + c172code, data = efc)
term_labels(fit)
# make "education" categorical
if (require("sjmisc")) {
efc$c172code <- to_factor(efc$c172code)
fit <- lm(barthtot ~ c160age + c12hour + c161sex + c172code, data = efc)
term_labels(fit)
# prefix value of categorical variables with variable name
term_labels(fit, prefix = "varname")
# prefix value of categorical variables with value label
term_labels(fit, prefix = "label")
# get label of dv
response_labels(fit)
}
Repair value labels
Description
Duplicated value labels in variables may cause troubles when
saving labelled data, or computing cross tabs (cf.
sjmisc::flat_table()
or sjPlot::plot_xtab()
).
tidy_labels()
repairs duplicated value labels by suffixing
them with the associated value.
Usage
tidy_labels(x, ..., sep = "_", remove = FALSE)
Arguments
x |
A vector or data frame. |
... |
Optional, unquoted names of variables that should be selected for
further processing. Required, if |
sep |
String that will be used to separate the suffixed value from the old label when creating the new value label. |
remove |
Logical, if |
Value
x
, with "repaired" (unique) value labels for each variable.
Examples
if (require("sjmisc")) {
set.seed(123)
x <- set_labels(
sample(1:5, size = 20, replace = TRUE),
labels = c("low" = 1, ".." = 2, ".." = 3, ".." = 4, "high" = 5)
)
frq(x)
z <- tidy_labels(x)
frq(z)
z <- tidy_labels(x, sep = ".")
frq(z)
z <- tidy_labels(x, remove = TRUE)
frq(z)
}
Convert labelled vectors into normal classes
Description
This function converts labelled
class vectors
into a generic data format, which means that simply all labelled
class attributes will be removed, so all vectors / variables will most
likely become atomic
.
Usage
unlabel(x, verbose = FALSE)
Arguments
x |
A data frame, which contains |
verbose |
Logical, if |
Value
A data frame or single vector (depending on x
) with common object classes.
Note
This function is currently only used to avoid possible compatibility issues
with labelled
class vectors. Some known issues with
labelled
class vectors have already been fixed, so
it might be that this function will become redundant in the future.
Write data to other statistical software packages
Description
These functions write the content of a data frame to an SPSS, SAS or Stata-file.
Usage
write_spss(x, path, drop.na = FALSE, compress = FALSE)
write_stata(x, path, drop.na = FALSE, version = 14)
write_sas(x, path, drop.na = FALSE)
Arguments
x |
A data frame that should be saved as file. |
path |
File path of the output file. |
drop.na |
Logical, if |
compress |
Logical, if |
version |
File version to use. Supports versions 8-14. |
Convert tagged NA values into regular NA
Description
Replaces all tagged_na()
values with
regular NA
.
Usage
zap_na_tags(x, ...)
Arguments
x |
A |
... |
Optional, unquoted names of variables that should be selected for
further processing. Required, if |
Value
x
, where all tagged_na
values are converted to NA
.
Examples
if (require("haven")) {
x <- labelled(
c(1:3, tagged_na("a", "c", "z"), 4:1),
c("Agreement" = 1, "Disagreement" = 4, "First" = tagged_na("c"),
"Refused" = tagged_na("a"), "Not home" = tagged_na("z"))
)
# get current NA values
x
get_na(x)
zap_na_tags(x)
get_na(zap_na_tags(x))
# also works with non-labelled vector that have tagged NA values
x <- c(1:5, tagged_na("a"), tagged_na("z"), NA)
haven::print_tagged_na(x)
haven::print_tagged_na(zap_na_tags(x))
}