Type: | Package |
Title: | Supplementary Item Response Theory Models |
Version: | 4.1-15 |
Date: | 2024-02-06 00:05:40 |
Author: | Alexander Robitzsch [aut,cre] (<https://orcid.org/0000-0002-8226-3132>) |
Maintainer: | Alexander Robitzsch <robitzsch@ipn.uni-kiel.de> |
Description: | Supplementary functions for item response models aiming to complement existing R packages. The functionality includes among others multidimensional compensatory and noncompensatory IRT models (Reckase, 2009, <doi:10.1007/978-0-387-89976-3>), MCMC for hierarchical IRT models and testlet models (Fox, 2010, <doi:10.1007/978-1-4419-0742-4>), NOHARM (McDonald, 1982, <doi:10.1177/014662168200600402>), Rasch copula model (Braeken, 2011, <doi:10.1007/s11336-010-9190-4>; Schroeders, Robitzsch & Schipolowski, 2014, <doi:10.1111/jedm.12054>), faceted and hierarchical rater models (DeCarlo, Kim & Johnson, 2011, <doi:10.1111/j.1745-3984.2011.00143.x>), ordinal IRT model (ISOP; Scheiblechner, 1995, <doi:10.1007/BF02301417>), DETECT statistic (Stout, Habing, Douglas & Kim, 1996, <doi:10.1177/014662169602000403>), local structural equation modeling (LSEM; Hildebrandt, Luedtke, Robitzsch, Sommer & Wilhelm, 2016, <doi:10.1080/00273171.2016.1142856>). |
Depends: | R (≥ 3.5) |
Imports: | CDM, graphics, methods, parallel, pbapply, Rcpp, stats, TAM, utils |
Suggests: | coda, igraph, lavaan, MASS, Matrix, miceadds, minqa, mirt, mvtnorm, nloptr, optimx, pbivnorm, pbv, psych, sfsmisc, sm, survey |
LinkingTo: | pbv, Rcpp, RcppArmadillo |
URL: | https://github.com/alexanderrobitzsch/sirt, https://sites.google.com/view/alexander-robitzsch/software |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
NeedsCompilation: | yes |
Packaged: | 2024-02-05 23:13:53 UTC; sunpn563 |
Repository: | CRAN |
Date/Publication: | 2024-02-06 00:10:03 UTC |
Supplementary Item Response Theory Models
Description
Supplementary functions for item response models aiming to complement existing R packages. The functionality includes among others multidimensional compensatory and noncompensatory IRT models (Reckase, 2009, <doi:10.1007/978-0-387-89976-3>), MCMC for hierarchical IRT models and testlet models (Fox, 2010, <doi:10.1007/978-1-4419-0742-4>), NOHARM (McDonald, 1982, <doi:10.1177/014662168200600402>), Rasch copula model (Braeken, 2011, <doi:10.1007/s11336-010-9190-4>; Schroeders, Robitzsch & Schipolowski, 2014, <doi:10.1111/jedm.12054>), faceted and hierarchical rater models (DeCarlo, Kim & Johnson, 2011, <doi:10.1111/j.1745-3984.2011.00143.x>), ordinal IRT model (ISOP; Scheiblechner, 1995, <doi:10.1007/BF02301417>), DETECT statistic (Stout, Habing, Douglas & Kim, 1996, <doi:10.1177/014662169602000403>), local structural equation modeling (LSEM; Hildebrandt, Luedtke, Robitzsch, Sommer & Wilhelm, 2016, <doi:10.1080/00273171.2016.1142856>).
Details
The sirt package enables the estimation of the following models:

- Multidimensional marginal maximum likelihood estimation (MML) of generalized logistic Rasch type models using the generalized logistic link function (Stukel, 1988) can be conducted with rasch.mml2 and the argument itemtype="raschtype"; a minimal call is sketched after this list. This model also allows the estimation of the 4PL item response model (Loken & Rulison, 2010). Multiple group estimation, latent regression models and plausible value imputation are supported. In addition, pseudo-likelihood estimation for fractional item response data can be conducted.
- Multidimensional noncompensatory, compensatory and partially compensatory item response models for dichotomous item responses (Reckase, 2009) can be estimated with the smirt function and the options irtmodel="noncomp", irtmodel="comp" and irtmodel="partcomp".
- The unidimensional quotient model (Ramsay, 1989) can be estimated using rasch.mml2 with itemtype="ramsay.qm".
- Unidimensional nonparametric item response models can be estimated employing MML estimation (Rossi, Wang & Ramsay, 2002) by making use of rasch.mml2 with itemtype="npirt". Kernel smoothing for item response function estimation (Ramsay, 1991) is implemented in np.dich.
- The multidimensional IRT copula model (Braeken, 2011) can be applied for handling local dependencies; see rasch.copula3.
- Unidimensional joint maximum likelihood estimation (JML) of the Rasch model is possible with the rasch.jml function. Bias correction methods for item parameters are included in rasch.jml.jackknife1 and rasch.jml.biascorr.
- The multidimensional latent class Rasch and 2PL model (Bartolucci, 2007), which employs a discrete trait distribution, can be estimated with rasch.mirtlc.
- The unidimensional 2PL rater facets model (Linacre, 1994) can be estimated with rm.facets. A hierarchical rater model based on signal detection theory (DeCarlo, Kim & Johnson, 2011) can be fitted with rm.sdt. A simple latent class model for two exchangeable raters is implemented in lc.2raters. See Robitzsch and Steinfeld (2018) for more details.
- The discrete grade of membership model (Erosheva, Fienberg & Joutard, 2007) and the Rasch grade of membership model can be estimated by gom.em.
- Some hierarchical IRT models and random item models for dichotomous and normally distributed data (van den Noortgate, de Boeck & Meulders, 2003; Fox & Verhagen, 2010) can be estimated with mcmc.2pno.ml.
- Unidimensional pairwise conditional likelihood estimation (PCML; Zwinderman, 1995) is implemented in rasch.pairwise or rasch.pairwise.itemcluster.
- Unidimensional pairwise marginal likelihood estimation (PMML; Renard, Molenberghs & Geys, 2004) can be conducted using rasch.pml3. In this function, local dependence can be handled by imposing a residual error structure or by omitting item pairs within a dependent item cluster from the estimation. The function rasch.evm.pcm estimates the multiple group partial credit model based on the pairwise eigenvector approach, which avoids iterative estimation.
- Some item response models in sirt can be estimated via Markov Chain Monte Carlo (MCMC) methods. In mcmc.2pno the two-parameter normal ogive model can be estimated. A hierarchical version of this model (Janssen, Tuerlinckx, Meulders & de Boeck, 2000) is implemented in mcmc.2pnoh. The 3PNO testlet model (Wainer, Bradlow & Wang, 2007; Glas, 2012) can be estimated with mcmc.3pno.testlet. Some hierarchical IRT models and random item models (van den Noortgate, de Boeck & Meulders, 2003) can be estimated with mcmc.2pno.ml.
- For dichotomous response data, the free NOHARM software (McDonald, 1982, 1997) estimates the multidimensional compensatory 3PL model, and the function R2noharm runs NOHARM from within R. Note that NOHARM must first be downloaded from http://noharm.niagararesearch.ca/nh4cldl.html. A pure R implementation of the NOHARM model with some extensions can be found in noharm.sirt.
- The measurement-theoretically founded nonparametric item response models of Scheiblechner (1995, 1999) – the ISOP and the ADISOP model – can be estimated with isop.dich or isop.poly. Item scoring within this theory can be conducted with isop.scoring.
- The functional unidimensional item response model (Ip et al., 2013) can be estimated with f1d.irt.
- The Rasch model can be estimated by variational approximation (Rijmen & Vomlel, 2008) using rasch.va.
- The unidimensional probabilistic Guttman model (Proctor, 1970) can be specified with prob.guttman.
- A jackknife method for the estimation of standard errors of the weighted likelihood trait estimate (Warm, 1989) is available in wle.rasch.jackknife.
- Model-based reliability for dichotomous data can be calculated by the method of Green and Yang (2009) with greenyang.reliability and by the marginal true score method of Dimitrov (2003) using the function marginal.truescore.reliability.
- Essential unidimensionality can be assessed by the DETECT index (Stout, Habing, Douglas & Kim, 1996); see the function conf.detect.
- Item parameters from several studies can be linked using the Haberman method (Haberman, 2009) in linking.haberman. See also equating.rasch and linking.robust. The alignment procedure (Asparouhov & Muthen, 2014) invariance.alignment was originally proposed for confirmatory factor analysis and aims at obtaining approximate invariance.
- Some person fit statistics in the Rasch model (Meijer & Sijtsma, 2001) are included in personfit.stat.
- An alternative to the linear logistic test model (LLTM), the so-called least squares distance model for cognitive diagnosis (LSDM; Dimitrov, 2007), can be estimated with the function lsdm.
- Local structural equation models (LSEM) can be estimated with the lsem.estimate function (Hildebrandt et al., 2016).
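As a quick orientation to the functions listed above, here is a minimal sketch using the data.read dataset shipped with sirt; the choice of the Rasch-type model and of WLE ability estimation is purely illustrative.

library(sirt)
data(data.read)                     # 12 dichotomous reading items
# Rasch-type model estimated by marginal maximum likelihood (MML)
mod <- sirt::rasch.mml2(data.read)
summary(mod)
# weighted likelihood estimates (WLE) of ability from the fitted model
wle <- sirt::wle.rasch(data.read, b=mod$item$b)$theta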
Author(s)
Alexander Robitzsch [aut,cre] (<https://orcid.org/0000-0002-8226-3132>)
Maintainer: Alexander Robitzsch <robitzsch@ipn.uni-kiel.de>
References
Asparouhov, T., & Muthen, B. (2014). Multiple-group factor analysis alignment. Structural Equation Modeling, 21(4), 1-14. doi:10.1080/10705511.2014.919210
Bartolucci, F. (2007). A class of multidimensional IRT models for testing unidimensionality and clustering items. Psychometrika, 72, 141-157.
Braeken, J. (2011). A boundary mixture approach to violations of conditional independence. Psychometrika, 76(1), 57-76. doi:10.1007/s11336-010-9190-4
DeCarlo, L. T., Kim, Y., & Johnson, M. S. (2011). A hierarchical rater model for constructed responses, with a signal detection rater model. Journal of Educational Measurement, 48(3), 333-356. doi:10.1111/j.1745-3984.2011.00143.x
Dimitrov, D. (2003). Marginal true-score measures and reliability for binary items as a function of their IRT parameters. Applied Psychological Measurement, 27, 440-458.
Dimitrov, D. M. (2007). Least squares distance method of cognitive validation and analysis for binary items using their item response theory parameters. Applied Psychological Measurement, 31, 367-387.
Erosheva, E. A., Fienberg, S. E., & Joutard, C. (2007). Describing disability through individual-level mixture models for multivariate binary data. Annals of Applied Statistics, 1, 502-537.
Fox, J.-P. (2010). Bayesian item response modeling. New York: Springer. doi:10.1007/978-1-4419-0742-4
Fox, J.-P., & Verhagen, A.-J. (2010). Random item effects modeling for cross-national survey data. In E. Davidov, P. Schmidt, & J. Billiet (Eds.), Cross-cultural Analysis: Methods and Applications (pp. 467-488), London: Routledge Academic.
Fraser, C., & McDonald, R. P. (1988). NOHARM: Least squares item factor analysis. Multivariate Behavioral Research, 23, 267-269.
Glas, C. A. W. (2012). Estimating and testing the extended testlet model. LSAC Research Report Series, RR 12-03.
Green, S.B., & Yang, Y. (2009). Reliability of summed item scores using structural equation modeling: An alternative to coefficient alpha. Psychometrika, 74, 155-167.
Haberman, S. J. (2009). Linking parameter estimates derived from an item response model through separate calibrations. ETS Research Report ETS RR-09-40. Princeton, ETS. doi:10.1002/j.2333-8504.2009.tb02197.x
Hildebrandt, A., Luedtke, O., Robitzsch, A., Sommer, C., & Wilhelm, O. (2016). Exploring factor model parameters across continuous variables with local structural equation models. Multivariate Behavioral Research, 51(2-3), 257-278. doi:10.1080/00273171.2016.1142856
Ip, E. H., Molenberghs, G., Chen, S. H., Goegebeur, Y., & De Boeck, P. (2013). Functionally unidimensional item response models for multivariate binary data. Multivariate Behavioral Research, 48, 534-562.
Janssen, R., Tuerlinckx, F., Meulders, M., & de Boeck, P. (2000). A hierarchical IRT model for criterion-referenced measurement. Journal of Educational and Behavioral Statistics, 25, 285-306.
Jeon, M., & Rijmen, F. (2016). A modular approach for item response theory modeling with the R package flirt. Behavior Research Methods, 48(2), 742-755. doi:10.3758/s13428-015-0606-z
Linacre, J. M. (1994). Many-Facet Rasch Measurement. Chicago: MESA Press.
Loken, E. & Rulison, K. L. (2010). Estimation of a four-parameter item response theory model. British Journal of Mathematical and Statistical Psychology, 63, 509-525.
McDonald, R. P. (1982). Linear versus nonlinear models in item response theory. Applied Psychological Measurement, 6(4), 379-396. doi:10.1177/014662168200600402
McDonald, R. P. (1997). Normal-ogive multidimensional model. In W. van der Linden & R. K. Hambleton (1997): Handbook of modern item response theory (pp. 257-269). New York: Springer. doi:10.1007/978-1-4757-2691-6_15
Meijer, R. R., & Sijtsma, K. (2001). Methodology review: Evaluating person fit. Applied Psychological Measurement, 25, 107-135.
Proctor, C. H. (1970). A probabilistic formulation and statistical analysis for Guttman scaling. Psychometrika, 35, 73-78.
Ramsay, J. O. (1989). A comparison of three simple test theory models. Psychometrika, 54, 487-499.
Ramsay, J. O. (1991). Kernel smoothing approaches to nonparametric item characteristic curve estimation. Psychometrika, 56, 611-630.
Reckase, M. (2009). Multidimensional item response theory. New York: Springer. doi:10.1007/978-0-387-89976-3
Renard, D., Molenberghs, G., & Geys, H. (2004). A pairwise likelihood approach to estimation in multilevel probit models. Computational Statistics & Data Analysis, 44, 649-667.
Rijmen, F., & Vomlel, J. (2008). Assessing the performance of variational methods for mixed logistic regression models. Journal of Statistical Computation and Simulation, 78, 765-779.
Robitzsch, A., & Steinfeld, J. (2018). Item response models for human ratings: Overview, estimation methods, and implementation in R. Psychological Test and Assessment Modeling, 60(1), 101-139.
Rossi, N., Wang, X. & Ramsay, J. O. (2002). Nonparametric item response function estimates with the EM algorithm. Journal of Educational and Behavioral Statistics, 27, 291-317.
Rusch, T., Mair, P., & Hatzinger, R. (2013). Psychometrics with R: A Review of CRAN Packages for Item Response Theory. http://epub.wu.ac.at/4010/1/resrepIRThandbook.pdf
Scheiblechner, H. (1995). Isotonic ordinal probabilistic models (ISOP). Psychometrika, 60(2), 281-304. doi:10.1007/BF02301417
Scheiblechner, H. (1999). Additive conjoint isotonic probabilistic models (ADISOP). Psychometrika, 64, 295-316.
Schroeders, U., Robitzsch, A., & Schipolowski, S. (2014). A comparison of different psychometric approaches to modeling testlet structures: An example with C-tests. Journal of Educational Measurement, 51(4), 400-418. doi:10.1111/jedm.12054
Stout, W., Habing, B., Douglas, J., & Kim, H. R. (1996). Conditional covariance-based nonparametric multidimensionality assessment. Applied Psychological Measurement, 20(4), 331-354. doi:10.1177/014662169602000403
Stukel, T. A. (1988). Generalized logistic models. Journal of the American Statistical Association, 83(402), 426-431. doi:10.1080/01621459.1988.10478613
Uenlue, A., & Yanagida, T. (2011). R you ready for R?: The CRAN psychometrics task view. British Journal of Mathematical and Statistical Psychology, 64(1), 182-186. doi:10.1348/000711010X519320
van den Noortgate, W., De Boeck, P., & Meulders, M. (2003). Cross-classification multilevel logistic models in psychometrics. Journal of Educational and Behavioral Statistics, 28, 369-386.
Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54, 427-450.
Wainer, H., Bradlow, E. T., & Wang, X. (2007). Testlet response theory and its applications. Cambridge: Cambridge University Press.
Zwinderman, A. H. (1995). Pairwise parameter estimation in Rasch models. Applied Psychological Measurement, 19, 369-375.
See Also
For estimating multidimensional models for polytomous item responses see the mirt, flirt (Jeon & Rijmen, 2016) and TAM packages.
For conditional maximum likelihood estimation see the eRm package.
For pairwise likelihood estimation methods (also known as composite likelihood methods) see pln or lavaan.
The estimation of cognitive diagnostic models is possible using the CDM package.
For the multidimensional latent class IRT model see the MultiLCIRT package, which also allows the estimation of IRT models with polytomous item responses.
Latent class analysis can be carried out with covLCA, poLCA, BayesLCA, randomLCA or lcmm packages.
Markov Chain Monte Carlo estimation for item response models can also be found in the MCMCpack package (see the MCMCirt functions therein).
See Rusch, Mair and Hatzinger (2013) and Uenlue and Yanagida (2011) for reviews of psychometrics packages in R.
Examples
##
## |-----------------------------------------------------------------|
## | sirt 0.40-4 (2013-11-26) |
## | Supplementary Item Response Theory |
## | Maintainer: Alexander Robitzsch <a.robitzsch at bifie.at > |
## | https://sites.google.com/site/alexanderrobitzsch/software |
## |-----------------------------------------------------------------|
##
## (ASCII art logo displayed at package startup)
Automatic Method of Finding Keys in a Dataset with Raw Item Responses
Description
This function calculates scoring keys for a dataset with raw item responses. It starts by setting the most frequent category of each item to 1. Then, in each iteration, keys are changed such that item discrimination is maximized.
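As a conceptual illustration of one search step, a hedged sketch (not sirt's internal code) that picks, for a single item, the category whose 0/1 scoring maximizes the item-total correlation:

# pick the category of one raw item whose scoring as 1 maximizes the
# correlation with a total score (conceptual sketch only)
best_key <- function(item, total){
    cats <- sort(unique(item))
    disc <- sapply(cats, function(cc) stats::cor(1*(item==cc), total,
                    use="complete.obs"))
    cats[which.max(disc)]
}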
Usage
automatic.recode(data, exclude=NULL, pstart.min=0.6, allocate=200,
maxiter=20, progress=TRUE)
Arguments
data |
Dataset with raw item responses |
exclude |
Vector with categories to be excluded for searching the key |
pstart.min |
Minimum probability for an initial solution of keys. |
allocate |
Maximum number of categories per item. This argument is used in
the function |
maxiter |
Maximum number of iterations |
progress |
A logical which indicates if iteration progress should be displayed |
Value
A list with following entries
item.stat |
Data frame with item name, p value, item discrimination and the calculated key |
data.scored |
Scored data frame using calculated keys
in |
categ.stats |
Data frame with statistics for all categories of all items |
Examples
## Not run:
#############################################################################
# EXAMPLE 1: data.raw1
#############################################################################
data(data.raw1)
# recode data.raw1, excluding categories 8 and 9 (missing codes), and
# start by initially setting all categories with probability larger than .50 to 1
res1 <- sirt::automatic.recode( data.raw1, exclude=c(8,9), pstart.min=.50 )
# inspect calculated keys
res1$item.stat
#############################################################################
# EXAMPLE 2: data.timssAusTwn from TAM package
#############################################################################
miceadds::library_install("TAM")
data(data.timssAusTwn,package="TAM")
raw.resp <- data.timssAusTwn[,1:11]
res2 <- sirt::automatic.recode( data=raw.resp )
## End(Not run)
Functions for the Beta Item Response Model
Description
Functions for simulating and estimating the Beta item response model (Noel & Dauvier, 2007). brm.sim can be used for simulating the model, and brm.irf computes the item response function. The Beta item response model is estimated as a discrete version to enable estimation in standard IRT software like the mirt or TAM packages.
Usage
# simulating the beta item response model
brm.sim(theta, delta, tau, K=NULL)
# computing the item response function of the beta item response model
brm.irf( Theta, delta, tau, ncat, thdim=1, eps=1E-10 )
Arguments
theta |
Ability vector of |
delta |
Vector of item difficulty parameters |
tau |
Vector of item dispersion parameters |
K |
Number of discretized categories. The default is |
Theta |
Matrix of the ability vector |
ncat |
Number of categories |
thdim |
Theta dimension in the matrix |
eps |
Nuisance parameter that stabilizes probabilities. |
Details
The discrete version of the Beta item response model is defined as follows. Assume that for item i there are K categories taking values k=0,1,\dots,K-1. Each value k is associated with a corresponding transformed value in [0,1], namely

q(k)=1/(2K), \; 1/(2K)+1/K, \; \ldots, \; 1-1/(2K)

The item response model is defined as

P( X_{pi}=x_{pi} | \theta_p ) \propto q( x_{pi} )^{ m_{pi}-1 } [ 1-q( x_{pi} ) ]^{ n_{pi}-1 }

This density is a discrete version of a Beta distribution with shape parameters m_{pi} and n_{pi}. These parameters are defined as

m_{pi}=\exp[ ( \theta_p - \delta_i + \tau_i ) / 2 ] \quad \text{and} \quad n_{pi}=\exp[ ( -\theta_p + \delta_i + \tau_i ) / 2 ]

The item response function can also be formulated as

\log P( X_{pi}=x_{pi} | \theta_p ) \propto ( m_{pi} - 1 ) \log [ q( x_{pi} ) ] + ( n_{pi} - 1 ) \log [ 1 - q( x_{pi} ) ]

The item parameters can be reparameterized as a_i=\exp[ ( -\delta_i + \tau_i ) / 2 ] and b_i=\exp[ ( \delta_i + \tau_i ) / 2 ]. Then, the original item parameters can be retrieved by \tau_i=\log( a_i b_i ) and \delta_i=\log( b_i / a_i ).

Using \gamma_p=\exp( \theta_p / 2 ), we obtain

\log P( X_{pi}=x_{pi} | \theta_p ) \propto a_i \gamma_p \log [ q( x_{pi} ) ] + ( b_i / \gamma_p ) \log [ 1 - q( x_{pi} ) ] - [ \log q( x_{pi} ) + \log [ 1 - q( x_{pi} ) ] ]

This formulation enables the specification of the Beta item response model as a structured latent class model (see TAM::tam.mml.3pl; Example 1).
See Smithson and Verkuilen (2006) for motivations for not treating continuous indicators as normally distributed variables.
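To make the definitions above concrete, a small sketch that evaluates the discrete Beta probabilities by hand and compares them with brm.irf; the parameter values are chosen for illustration only.

theta <- 0.5; delta <- -0.2; tau <- 1.1; K <- 5   # illustrative values
q <- ( 2*(0:(K-1)) + 1 ) / ( 2*K )      # transformed category values q(k)
m <- exp( ( theta - delta + tau ) / 2 )   # shape parameter m_{pi}
n <- exp( ( - theta + delta + tau ) / 2 ) # shape parameter n_{pi}
probs <- q^(m-1) * (1-q)^(n-1)
probs / sum(probs)                       # normalized discrete Beta density
# should agree (up to the eps stabilization) with the implemented IRF
sirt::brm.irf( Theta=matrix(theta), delta=delta, tau=tau, ncat=K )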
Value
A simulated dataset of item responses if brm.sim
is applied.
A matrix of item response probabilities if brm.irf
is applied.
References
Gruen, B., Kosmidis, I., & Zeileis, A. (2012). Extended Beta regression in R: Shaken, stirred, mixed, and partitioned. Journal of Statistical Software, 48(11), 1-25. doi:10.18637/jss.v048.i11
Noel, Y., & Dauvier, B. (2007). A beta item response model for continuous bounded responses. Applied Psychological Measurement, 31(1), 47-73. doi:10.1177/0146621605287691
Smithson, M., & Verkuilen, J. (2006). A better lemon squeezer? Maximum-likelihood regression with beta-distributed dependent variables. Psychological Methods, 11(1), 54-71. doi:10.1037/1082-989X.11.1.54
See Also
See also the betareg package for fitting Beta regression models in R (Gruen, Kosmidis & Zeileis, 2012).
Examples
#############################################################################
# EXAMPLE 1: Simulated data beta response model
#############################################################################
#*** (1) Simulation of the beta response model
# Table 3 (p. 65) of Noel and Dauvier (2007)
delta <- c( -.942, -.649, -.603, -.398, -.379, .523, .649, .781, .907 )
tau <- c( .382, .166, 1.799, .615, 2.092, 1.988, 1.899, 1.439, 1.057 )
K <- 5 # number of categories for discretization
N <- 500 # number of persons
I <- length(delta) # number of items
set.seed(865)
theta <- stats::rnorm( N )
dat <- sirt::brm.sim( theta=theta, delta=delta, tau=tau, K=K)
psych::describe(dat)
#*** (2) some preliminaries for estimation of the model in mirt
#*** define a mirt function
library(mirt)
Theta <- matrix( seq( -4, 4, len=21), ncol=1 )
# compute item response function
ii <- 1 # item ii=1
b1 <- sirt::brm.irf( Theta=Theta, delta=delta[ii], tau=tau[ii], ncat=K )
# plot item response functions
graphics::matplot( Theta[,1], b1, type="l" )
#*** defining the beta item response function for estimation in mirt
par <- c( 0, 1, 1)
names(par) <- c( "delta", "tau","thdim")
est <- c( TRUE, TRUE, FALSE )
names(est) <- names(par)
brm.icc <- function( par, Theta, ncat ){
delta <- par[1]
tau <- par[2]
thdim <- par[3]
probs <- sirt::brm.irf( Theta=Theta, delta=delta, tau=tau, ncat=ncat,
thdim=thdim)
return(probs)
}
name <- "brm"
# create item response function
brm.itemfct <- mirt::createItem(name, par=par, est=est, P=brm.icc)
#*** define model in mirt
mirtmodel <- mirt::mirt.model("
F1=1-9
" )
itemtype <- rep("brm", I )
customItems <- list("brm"=brm.itemfct)
# define parameters to be estimated
mod1.pars <- mirt::mirt(dat, mirtmodel, itemtype=itemtype,
customItems=customItems, pars="values")
## Not run:
#*** (3) estimate beta item response model in mirt
mod1 <- mirt::mirt(dat,mirtmodel, itemtype=itemtype, customItems=customItems,
pars=mod1.pars, verbose=TRUE )
# model summaries
print(mod1)
summary(mod1)
coef(mod1)
# estimated coefficients and comparison with simulated data
cbind( sirt::mirt.wrapper.coef( mod1 )$coef, delta, tau )
mirt.wrapper.itemplot(mod1,ask=TRUE)
#---------------------------
# estimate beta item response model in TAM
library(TAM)
# define the skill space: standard normal distribution
TP <- 21 # number of theta points
theta.k <- diag(TP)
theta.vec <- seq( -6,6, len=TP)
d1 <- stats::dnorm(theta.vec)
d1 <- d1 / sum(d1)
delta.designmatrix <- matrix( log(d1), ncol=1 )
delta.fixed <- cbind( 1, 1, 1 )
# define design matrix E
E <- array(0, dim=c(I,K,TP,2*I + 1) )
dimnames(E)[[1]] <- items <- colnames(dat)
dimnames(E)[[4]] <- c( paste0( rep( items, each=2 ),
rep( c("_a","_b" ), I) ), "one" )
for (ii in 1:I){
for (kk in 1:K){
for (tt in 1:TP){
qk <- (2*(kk-1)+1)/(2*K)
gammap <- exp( theta.vec[tt] / 2 )
E[ii, kk, tt, 2*(ii-1) + 1 ] <- gammap * log( qk )
E[ii, kk, tt, 2*(ii-1) + 2 ] <- 1 / gammap * log( 1 - qk )
E[ii, kk, tt, 2*I+1 ] <- - log(qk) - log( 1 - qk )
}
}
}
gammaslope.fixed <- cbind( 2*I+1, 1 )
gammaslope <- exp( rep(0,2*I+1) )
# estimate model in TAM
mod2 <- TAM::tam.mml.3pl(resp=dat, E=E,control=list(maxiter=100),
skillspace="discrete", delta.designmatrix=delta.designmatrix,
delta.fixed=delta.fixed, theta.k=theta.k, gammaslope=gammaslope,
gammaslope.fixed=gammaslope.fixed, notA=TRUE )
summary(mod2)
# extract original tau and delta parameters
m1 <- matrix( mod2$gammaslope[1:(2*I) ], ncol=2, byrow=TRUE )
m1 <- as.data.frame(m1)
colnames(m1) <- c("a","b")
m1$delta.TAM <- log( m1$b / m1$a)
m1$tau.TAM <- log( m1$a * m1$b )
# compare estimated parameter
m2 <- cbind( sirt::mirt.wrapper.coef( mod1 )$coef, delta, tau )[,-1]
colnames(m2) <- c( "delta.mirt", "tau.mirt", "thdim","delta.true","tau.true" )
m2 <- cbind(m1,m2)
round( m2, 3 )
## End(Not run)
Extended Bradley-Terry Model
Description
The function btm estimates an extended Bradley-Terry model (Hunter, 2004; see Details). Parameter estimation uses a bias-corrected joint maximum likelihood estimation method based on \varepsilon-adjustment (see Bertoli-Barsotti, Lando & Punzo, 2014). See Details for the algorithm.
The function btm_sim simulates data from the extended Bradley-Terry model.
Usage
btm(data, judge=NULL, ignore.ties=FALSE, fix.eta=NULL, fix.delta=NULL, fix.theta=NULL,
maxiter=100, conv=1e-04, eps=0.3, wgt.ties=.5)
## S3 method for class 'btm'
summary(object, file=NULL, digits=4,...)
## S3 method for class 'btm'
predict(object, data=NULL, ...)
btm_sim(theta, eta=0, delta=-99, repeated=FALSE)
Arguments
data |
Data frame with three columns. The first two columns contain labels of the units in the pair comparison. The third column contains the result of the comparison. "1" means that the first unit wins, "0" means that the second unit wins, and "0.5" means a draw (a tie). |
judge |
Optional vector of judge identifiers (if multiple judges are available) |
ignore.ties |
Logical indicating whether ties should be ignored. |
fix.eta |
Numeric value for a fixed |
fix.delta |
Numeric value for a fixed |
fix.theta |
A vector with entries for fixed theta values. |
maxiter |
Maximum number of iterations |
conv |
Convergence criterion |
eps |
The |
wgt.ties |
Weighting parameter for ties, see formula in Details. The default is .5 |
object |
Object of class |
file |
Optional file name for sinking the summary into |
digits |
Number of digits after decimal to print |
... |
Further arguments to be passed. |
theta |
Vector of abilities |
eta |
Value of |
delta |
Value of |
repeated |
Logical indicating whether repeated ratings of dyads (for home advantage effect) should be simulated |
Details
The extended Bradley-Terry model for the comparison of individuals i and j is defined as

P( X_{ij}=1 ) \propto \exp( \eta + \theta_i )
P( X_{ij}=0 ) \propto \exp( \theta_j )
P( X_{ij}=0.5 ) \propto \exp( \delta + w_T ( \eta + \theta_i + \theta_j ) )

The parameters \theta_i denote the abilities, \delta governs the tendency toward ties, and \eta is the home-advantage effect. The weighting parameter w_T governs the importance of ties and can be chosen with the argument wgt.ties.

Joint maximum likelihood (JML) estimation is applied for the simultaneous estimation of \eta, \delta and all \theta_i parameters. For the Rasch model, it was shown that JML can result in biased parameter estimates. The \varepsilon-adjustment approach has been proposed to reduce this bias (Bertoli-Barsotti, Lando & Punzo, 2014). This estimation approach is adapted to the Bradley-Terry model in the btm function. To this end, the likelihood function is modified for the purpose of bias reduction. It can easily be shown that sufficient statistics exist for \eta, \delta and all \theta_i parameters. In the \varepsilon-adjustment approach, the sufficient statistic for the \theta_i parameter is modified. In JML estimation of the Bradley-Terry model, S_i=\sum_{j \ne i} ( x_{ij} + x_{ji} ) is a sufficient statistic for \theta_i. Let M_i be the maximum score of person i, that is, the number of x_{ij} terms appearing in S_i. In the \varepsilon-adjustment approach, the sufficient statistic S_i is modified to

S_{i, \varepsilon}=\varepsilon + \frac{M_i - 2 \varepsilon}{M_i} S_i

and S_{i, \varepsilon} instead of S_i is used in JML estimation. Hence, the original scores S_i are linearly transformed for all persons i.
Value
List with following entries
pars |
Parameter summary for |
effects |
Parameter estimates for |
summary.effects |
Summary of |
mle.rel |
MLE reliability, also known as separation reliability |
sepG |
Separation index |
probs |
Estimated probabilities |
data |
Used dataset with integer identifiers |
fit_judges |
Fit statistics (outfit and infit) for judges if |
residuals |
Unstandardized and standardized residuals for each observation |
References
Bertoli-Barsotti, L., Lando, T., & Punzo, A. (2014). Estimating a Rasch Model via fuzzy empirical probability functions. In D. Vicari, A. Okada, G. Ragozini & C. Weihs (Eds.). Analysis and Modeling of Complex Data in Behavioral and Social Sciences. Springer. doi:10.1007/978-3-319-06692-9_4
Hunter, D. R. (2004). MM algorithms for generalized Bradley-Terry models. Annals of Statistics, 32, 384-406. doi:10.1214/aos/1079120141
See Also
See also the R packages BradleyTerry2, psychotools, psychomix and prefmod.
Examples
#############################################################################
# EXAMPLE 1: Bradley-Terry model | data.pw01
#############################################################################
data(data.pw01)
dat <- data.pw01
dat <- dat[, c("home_team", "away_team", "result") ]
# recode results according to needed input
dat$result[ dat$result==0 ] <- 1/2 # code for ties
dat$result[ dat$result==2 ] <- 0 # code for victory of away team
#********************
# Model 1: Estimation with ties and home advantage
mod1 <- sirt::btm( dat)
summary(mod1)
## Not run:
#*** Model 2: Estimation with ties, no epsilon adjustment
mod2 <- sirt::btm( dat, eps=0)
summary(mod2)
#*** Model 3: Estimation with ties, no epsilon adjustment, weight for ties of .333 which
# corresponds to the rule of 3 points for a victory and 1 point of a draw in football
mod3 <- sirt::btm( dat, eps=0, wgt.ties=1/3)
summary(mod3)
#*** Model 4: Some fixed abilities
fix.theta <- c("Anhalt Dessau"=-1 )
mod4 <- sirt::btm( dat, eps=0, fix.theta=fix.theta)
summary(mod4)
#*** Model 5: Ignoring ties, no home advantage effect
mod5 <- sirt::btm( dat, ignore.ties=TRUE, fix.eta=0)
summary(mod5)
#*** Model 6: Ignoring ties, no home advantage effect (JML approach and eps=0)
mod6 <- sirt::btm( dat, ignore.ties=TRUE, fix.eta=0, eps=0)
summary(mod6)
#############################################################################
# EXAMPLE 2: Venice chess data
#############################################################################
# See http://www.rasch.org/rmt/rmt113o.htm
# Linacre, J. M. (1997). Paired Comparisons with Standard Rasch Software.
# Rasch Measurement Transactions, 11:3, 584-585.
# dataset with chess games -> "D" denotes a draw (tie)
chessdata <- scan( what="character")
1D.0..1...1....1.....1......D.......D........1.........1.......... Browne
0.1.D..0...1....1.....1......D.......1........D.........1......... Mariotti
.D0..0..1...D....D.....1......1.......1........1.........D........ Tatai
...1D1...D...D....1.....D......D.......D........1.........0....... Hort
......010D....D....D.....1......D.......1........1.........D...... Kavalek
..........00DDD.....D.....D......D.......1........D.........1..... Damjanovic
...............00D0DD......D......1.......1........1.........0.... Gligoric
.....................000D0DD.......D.......1........D.........1... Radulov
............................DD0DDD0D........0........0.........1.. Bobotsov
....................................D00D00001.........1.........1. Cosulich
.............................................0D000D0D10..........1 Westerinen
.......................................................00D1D010000 Zichichi
L <- length(chessdata) / 2
games <- matrix( chessdata, nrow=L, ncol=2, byrow=TRUE )
G <- nchar(games[1,1])
# create matrix with results
results <- matrix( NA, nrow=G, ncol=3 )
for (gg in 1:G){
games.gg <- substring( games[,1], gg, gg )
ind.gg <- which( games.gg !="." )
results[gg, 1:2 ] <- games[ ind.gg, 2]
results[gg, 3 ] <- games.gg[ ind.gg[1] ]
}
results <- as.data.frame(results)
results[,3] <- paste(results[,3] )
results[ results[,3]=="D", 3] <- 1/2
results[,3] <- as.numeric( results[,3] )
# fit model ignoring draws
mod1 <- sirt::btm( results, ignore.ties=TRUE, fix.eta=0, eps=0 )
summary(mod1)
# fit model with draws
mod2 <- sirt::btm( results, fix.eta=0, eps=0 )
summary(mod2)
#############################################################################
# EXAMPLE 3: Simulated data from the Bradley-Terry model
#############################################################################
set.seed(9098)
N <- 22
theta <- seq(2,-2, len=N)
#** simulate and estimate data without repeated dyads
dat1 <- sirt::btm_sim(theta=theta)
mod1 <- sirt::btm( dat1, ignore.ties=TRUE, fix.delta=-99, fix.eta=0)
summary(mod1)
#*** simulate data with home advantage effect and ties
dat2 <- sirt::btm_sim(theta=theta, eta=.8, delta=-.6, repeated=TRUE)
mod2 <- sirt::btm(dat2)
summary(mod2)
#############################################################################
# EXAMPLE 4: Estimating the Bradley-Terry model with multiple judges
#############################################################################
#*** simulating data with multiple judges
set.seed(987)
N <- 26 # number of objects to be rated
theta <- seq(2,-2, len=N)
s1 <- stats::sd(theta)
dat <- NULL
# judge discriminations which define tendency to provide reliable ratings
discrim <- c( rep(.9,10), rep(.5,2), rep(0,2) )
#=> last four raters provide less reliable ratings
RR <- length(discrim)
for (rr in 1:RR){
theta1 <- discrim[rr]*theta + stats::rnorm(N, mean=0, sd=s1*sqrt(1-discrim[rr]))
dat1 <- sirt::btm_sim(theta1)
dat1$judge <- rr
dat <- rbind(dat, dat1)
}
#** estimate the Bradley-Terry model and compute judge-specific fit statistics
mod <- sirt::btm( dat[,1:3], judge=paste0("J",100+dat[,4]), fix.eta=0, ignore.ties=TRUE)
summary(mod)
## End(Not run)
Categorize and Decategorize Variables in a Data Frame
Description
The function categorize defines categories for variables in a data frame, starting with a user-defined index (e.g., 0 or 1). Continuous variables can be categorized by discretizing them into different quantile groups.
The function decategorize does the reverse operation.
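The quantile-group discretization used for continuous variables can be pictured with base R; a minimal sketch that mirrors the idea of the quant argument (not sirt's internal code):

set.seed(1)
x <- stats::rnorm(100)
# discretize x into 5 quantile groups, lowest category coded 0
breaks <- stats::quantile( x, probs=seq(0, 1, length.out=6) )
xcat <- cut( x, breaks=breaks, include.lowest=TRUE, labels=FALSE ) - 1
table(xcat)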
Usage
categorize(dat, categorical=NULL, quant=NULL, lowest=0)
decategorize(dat, categ_design=NULL)
Arguments
dat |
Data frame |
categorical |
Vector with variable names which should be converted into categories,
beginning with integer |
quant |
Vector with number of classes for each variables. Variables are categorized among quantiles. The vector must have names containing variable names. |
lowest |
Lowest category index. Default is 0. |
categ_design |
Data frame containing information about the categorization, which is the output of |
Value
For categorize
, it is a list with entries
data |
Converted data frame |
categ_design |
Data frame containing some information about the categorization |
For decategorize
it is a data frame.
Examples
## Not run:
library(mice)
library(miceadds)
#############################################################################
# EXAMPLE 1: Categorize questionnaire data
#############################################################################
data(data.smallscale, package="miceadds")
dat <- data.smallscale
# (0) select dataset
dat <- dat[, 9:20 ]
summary(dat)
categorical <- colnames(dat)[2:6]
# (1) categorize data
res <- sirt::categorize( dat, categorical=categorical )
# (2) multiple imputation using the mice package
dat2 <- res$data
VV <- ncol(dat2)
impMethod <- rep( "sample", VV ) # define random sampling imputation method
names(impMethod) <- colnames(dat2)
imp <- mice::mice( as.matrix(dat2), method=impMethod, maxit=1, m=1 )
dat3 <- mice::complete(imp,action=1)
# (3) decategorize dataset
dat3a <- sirt::decategorize( dat3, categ_design=res$categ_design )
#############################################################################
# EXAMPLE 2: Categorize ordinal and continuous data
#############################################################################
data(data.ma01,package="miceadds")
dat <- data.ma01
summary(dat[,-c(1:2)] )
# define variables to be categorized
categorical <- c("books", "paredu" )
# define quantiles
quant <- c(6,5,11)
names(quant) <- c("math", "read", "hisei")
# categorize data
res <- sirt::categorize( dat, categorical=categorical, quant=quant)
str(res)
## End(Not run)
Nonparametric Estimation of Conditional Covariances of Item Pairs
Description
This function estimates conditional covariances of item pairs (Stout, Habing, Douglas & Kim, 1996; Zhang & Stout, 1999). The function is used for the estimation of the DETECT index. The ccov.np function has the (default) option to smooth item response functions (argument smooth) in the computation of conditional covariances (Douglas, Kim, Habing, & Gao, 1998).
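The idea of a conditional covariance can be sketched without smoothing: stratify persons by their score and average the within-stratum covariances of an item pair. This is only a conceptual illustration, not the kernel-smoothed estimator implemented in ccov.np.

data(data.read, package="sirt")
dat <- data.read
score <- rowMeans(dat)
# five score strata of equal width
grp <- cut( score, breaks=5 )
# covariance of items 1 and 2 within each stratum
ccov <- tapply( seq_len(nrow(dat)), grp,
                function(idx) stats::cov( dat[idx,1], dat[idx,2] ) )
# conditional covariance: weighted average over strata
stats::weighted.mean( ccov, w=as.numeric(table(grp)), na.rm=TRUE )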
Usage
ccov.np(data, score, bwscale=1.1, thetagrid=seq(-3, 3, len=200),
progress=TRUE, scale_score=TRUE, adjust_thetagrid=TRUE, smooth=TRUE,
use_sum_score=FALSE, bias_corr=TRUE)
Arguments
data |
An |
score |
An ability estimate, e.g. the WLE |
bwscale |
Bandwidth factor for calculation of conditional covariance. The bandwidth
used in the estimation is |
thetagrid |
A vector which contains theta values where conditional covariances are evaluated. |
progress |
Display progress? |
scale_score |
Logical indicating whether |
adjust_thetagrid |
Logical indicating whether |
smooth |
Logical indicating whether smoothing should be applied for conditional covariance estimation |
use_sum_score |
Logical indicating whether sum score should be used. With this option, the bias corrected conditional covariance of Zhang and Stout (1999) is used. |
bias_corr |
Logical indicating whether bias correction (Zhang & Stout, 1999)
should be utilized if |
Note
This function is used in conf.detect
and expl.detect
.
References
Douglas, J., Kim, H. R., Habing, B., & Gao, F. (1998). Investigating local dependence with conditional covariance functions. Journal of Educational and Behavioral Statistics, 23(2), 129-151. doi:10.3102/10769986023002129
Stout, W., Habing, B., Douglas, J., & Kim, H. R. (1996). Conditional covariance-based nonparametric multidimensionality assessment. Applied Psychological Measurement, 20(4), 331-354. doi:10.1177/014662169602000403
Zhang, J., & Stout, W. (1999). Conditional covariance structure of generalized compensatory multidimensional items. Psychometrika, 64(2), 129-152. doi:10.1007/BF02294532
Examples
## Not run:
#############################################################################
# EXAMPLE 1: data.read | different settings for computing conditional covariance
#############################################################################
data(data.read, package="sirt")
dat <- data.read
#* fit Rasch model
mod <- sirt::rasch.mml2(dat)
score <- sirt::wle.rasch(dat=dat, b=mod$item$b)$theta
#* ccov with smoothing
cmod1 <- sirt::ccov.np(data=dat, score=score, bwscale=1.1)
#* ccov without smoothing
cmod2 <- sirt::ccov.np(data=dat, score=score, smooth=FALSE)
#- compare results
100*cbind( cmod1$ccov.table[1:6, "ccov"], cmod2$ccov.table[1:6, "ccov"])
## End(Not run)
Estimation of a Unidimensional Factor Model under Full and Partial Measurement Invariance
Description
Estimates a unidimensional factor model based on the normal distribution fitting
function under full and partial measurement invariance.
Item loadings and item intercepts are successively freed based on the largest
modification index and a chosen significance level alpha
.
Usage
cfa_meas_inv(dat, group, weights=NULL, alpha=0.01, verbose=FALSE, op=c("~1","=~"))
Arguments
dat |
Data frame containing items |
group |
Vector of group identifiers |
weights |
Optional vector of sampling weights |
alpha |
Significance level |
verbose |
Logical indicating whether progress should be shown |
op |
Operators (intercepts or loadings) for which estimation should be freed |
Value
List with several entries
pars_mi |
Model parameters under full invariance |
pars_pi |
Model parameters under partial invariance |
mod_mi |
Fitted model under full invariance |
mod_pi |
Fitted model under partial invariance |
... |
More output |
See Also
See also sirt::invariance.alignment
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Factor model under full and partial invariance
#############################################################################
#--- data simulation
set.seed(65)
G <- 3 # number of groups
I <- 5 # number of items
# define lambda and nu parameters
lambda <- matrix(1, nrow=G, ncol=I)
nu <- matrix(0, nrow=G, ncol=I)
err_var <- matrix(1, nrow=G, ncol=I)
# define size of noninvariance
dif <- 1
#- 1st group: N(0,1)
lambda[1,3] <- 1+dif*.4; nu[1,5] <- dif*.5
#- 2nd group: N(0.3,1.5)
gg <- 2 ;
lambda[gg,5] <- 1-.5*dif; nu[gg,1] <- -.5*dif
#- 3nd group: N(.8,1.2)
gg <- 3
lambda[gg,4] <- 1-.7*dif; nu[gg,2] <- -.5*dif
#- define distributions of groups
mu <- c(0,.3,.8)
sigma <- sqrt(c(1,1.5,1.2))
N <- rep(1000,3) # sample sizes per group
#* use simulation function
dat <- sirt::invariance_alignment_simulate(nu, lambda, err_var, mu, sigma, N,
exact=TRUE)
#--- estimate CFA
mod <- sirt::cfa_meas_inv(dat=dat[,-1], group=dat$group, verbose=TRUE, alpha=0.05)
mod$pars_mi
mod$pars_pi
## End(Not run)
Classification Accuracy in the Rasch Model
Description
This function computes the classification accuracy in the Rasch model for the maximum likelihood (person parameter) estimate according to the method of Rudner (2001).
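Rudner's method treats the person parameter estimate as normally distributed around the true theta, with standard deviation equal to its standard error. A conceptual sketch of the resulting classification probabilities for a single examinee (parameter values illustrative):

# probability of being classified into each level, given true theta and
# the standard error of the ability estimate (normal approximation)
class_probs <- function(theta, se, cutscores){
    cuts <- c(-Inf, cutscores, Inf)
    stats::pnorm( cuts[-1], mean=theta, sd=se ) -
        stats::pnorm( cuts[-length(cuts)], mean=theta, sd=se )
}
class_probs( theta=0, se=0.4, cutscores=c(-1, .3) )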
Usage
class.accuracy.rasch(cutscores, b, meantheta, sdtheta, theta.l, n.sims=0)
Arguments
cutscores |
Vector of cut scores |
b |
Vector of item difficulties |
meantheta |
Mean of the trait distribution |
sdtheta |
Standard deviation of the trait distribution |
theta.l |
Discretized theta distribution |
n.sims |
Number of simulated persons in a data set. The default is 0 which means that no simulation is performed. |
Value
A list with following entries:
class.stats |
Data frame containing classification accuracy statistics. The
column |
class.prob |
Probability table of classification |
References
Rudner, L.M. (2001). Computing the expected proportions of misclassified examinees. Practical Assessment, Research & Evaluation, 7(14).
See Also
Classification accuracy of other IRT models can be obtained with the R package cacIRT.
Examples
#############################################################################
# EXAMPLE 1: Reading dataset
#############################################################################
data( data.read, package="sirt")
dat <- data.read
# estimate the Rasch model
mod <- sirt::rasch.mml2( dat )
# estimate classification accuracy (3 levels)
cutscores <- c( -1, .3 ) # cut scores at theta=-1 and theta=.3
sirt::class.accuracy.rasch( cutscores=cutscores, b=mod$item$b,
meantheta=0, sdtheta=mod$sd.trait,
theta.l=seq(-4,4,len=200), n.sims=3000)
## Cut Scores
## [1] -1.0 0.3
##
## WLE reliability (by simulation)=0.671
## WLE consistency (correlation between two parallel forms)=0.649
##
## Classification accuracy and consistency
## agree0 agree1 kappa consistency
## analytical 0.68 0.990 0.492 NA
## simulated 0.70 0.997 0.489 0.599
##
## Probability classification table
## Est_Class1 Est_Class2 Est_Class3
## True_Class1 0.136 0.041 0.001
## True_Class2 0.081 0.249 0.093
## True_Class3 0.009 0.095 0.294
Confirmatory DETECT and polyDETECT Analysis
Description
This function computes the DETECT statistics for dichotomous item responses and the polyDETECT statistic for polytomous item responses under a confirmatory specification of item clusters (Stout, Habing, Douglas & Kim, 1996; Zhang & Stout, 1999a, 1999b; Zhang, 2007; Bonifay, Reise, Scheines, & Meijer, 2015).
Item responses in a multi-matrix design are allowed (Zhang, 2013).
An exploratory DETECT analysis can be conducted using the
expl.detect
function.
Usage
conf.detect(data, score, itemcluster, bwscale=1.1, progress=TRUE,
thetagrid=seq(-3, 3, len=200), smooth=TRUE, use_sum_score=FALSE, bias_corr=TRUE)
## S3 method for class 'conf.detect'
summary(object, digits=3, file=NULL, ...)
Arguments
data |
An |
score |
An ability estimate, e.g. the WLE, sum score or mean score |
itemcluster |
Item cluster for each item. The order of entries must correspond
to the columns in |
bwscale |
Bandwidth factor for calculation of conditional covariance
(see |
progress |
Display progress? |
smooth |
Logical indicating whether smoothing should be applied for conditional covariance estimation |
thetagrid |
A vector which contains theta values where conditional covariances are evaluated. |
use_sum_score |
Logical indicating whether sum score should be used. With this option, the bias corrected conditional covariance of Zhang and Stout (1999) is used. |
bias_corr |
Logical indicating whether bias correction (Zhang & Stout, 1999)
should be utilized if |
object |
Object of class |
digits |
Number of digits for rounding in |
file |
Optional file name to be sunk for |
... |
Further arguments to be passed |
Details
The results of DETECT are the indices DETECT, ASSI and RATIO (see Zhang, 2007, for details), calculated for the options unweighted and weighted. The option unweighted means that all conditional covariances of item pairs are equally weighted; weighted means that these covariances are weighted by the sample size of the item pairs. In the case of multi-matrix item designs, both types of indices can differ.
The classification scheme for these indices is as follows (Jang & Roussos, 2007; Zhang, 2007):
Strong multidimensionality | DETECT > 1.00 |
Moderate multidimensionality | .40 < DETECT < 1.00 |
Weak multidimensionality | .20 < DETECT < .40 |
Essential unidimensionality | DETECT < .20 |
Maximum value under simple structure | ASSI=1 | RATIO=1 |
Essential deviation from unidimensionality | ASSI > .25 | RATIO > .36 |
Essential unidimensionality | ASSI < .25 | RATIO < .36 |
Note that the expected value of a conditional covariance for an item pair is negative when a unidimensional model holds. In consequence, the DETECT index can become negative for unidimensional data (see Example 3). This can also be seen in the statistic MCOV100 in the value detect.
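Given a matrix of conditional covariances and a cluster assignment, the unweighted DETECT index is a signed average: item pairs within the same cluster enter with +1, pairs from different clusters with -1, scaled by 100. A minimal sketch under these assumptions (the function name and inputs are illustrative):

detect_index <- function(ccov, cluster){
    # ccov: symmetric matrix of conditional covariances of item pairs
    # cluster: vector with the cluster membership of each item
    pairs <- t( utils::combn( length(cluster), 2 ) )
    sgn <- ifelse( cluster[ pairs[,1] ]==cluster[ pairs[,2] ], 1, -1 )
    100 * mean( sgn * ccov[ pairs ] )
}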
Value
A list with following entries:
detect |
Data frame with statistics DETECT, ASSI, RATIO, MADCOV100 and MCOV100 |
ccovtable |
Individual contributions to conditional covariance |
ccov.matrix |
Evaluated conditional covariance |
References
Bonifay, W. E., Reise, S. P., Scheines, R., & Meijer, R. R. (2015). When are multidimensional data unidimensional enough for structural equation modeling? An evaluation of the DETECT multidimensionality index. Structural Equation Modeling, 22(4), 504-516. doi:10.1080/10705511.2014.938596
Jang, E. E., & Roussos, L. (2007). An investigation into the dimensionality of TOEFL using conditional covariance-based nonparametric approach. Journal of Educational Measurement, 44(1), 1-21. doi:10.1111/j.1745-3984.2007.00024.x
Stout, W., Habing, B., Douglas, J., & Kim, H. R. (1996). Conditional covariance-based nonparametric multidimensionality assessment. Applied Psychological Measurement, 20(4), 331-354. doi:10.1177/014662169602000403
Zhang, J. (2007). Conditional covariance theory and DETECT for polytomous items. Psychometrika, 72(1), 69-91. doi:10.1007/s11336-004-1257-7
Zhang, J. (2013). A procedure for dimensionality analyses of response data from various test designs. Psychometrika, 78(1), 37-58. doi:10.1007/s11336-012-9287-z
Zhang, J., & Stout, W. (1999a). Conditional covariance structure of generalized compensatory multidimensional items. Psychometrika, 64(2), 129-152. doi:10.1007/BF02294532
Zhang, J., & Stout, W. (1999b). The theoretical DETECT index of dimensionality and its application to approximate simple structure. Psychometrika, 64(2), 213-249. doi:10.1007/BF02294536
See Also
For a download of the free DIM-Pack software (DIMTEST, DETECT) see https://psychometrics.onlinehelp.measuredprogress.org/tools/dim/.
See expl.detect
for exploratory DETECT analysis.
Examples
#############################################################################
# EXAMPLE 1: TIMSS mathematics data set (dichotomous data)
#############################################################################
data(data.timss)
# extract data
dat <- data.timss$data
dat <- dat[, substring( colnames(dat),1,1)=="M" ]
# extract item informations
iteminfo <- data.timss$item
# estimate Rasch model
mod1 <- sirt::rasch.mml2( dat )
# estimate WLEs
wle1 <- sirt::wle.rasch( dat, b=mod1$item$b )$theta
# DETECT for content domains
detect1 <- sirt::conf.detect( data=dat, score=wle1,
itemcluster=iteminfo$Content.Domain )
## unweighted weighted
## DETECT 0.316 0.316
## ASSI 0.273 0.273
## RATIO 0.355 0.355
## Not run:
# DETECT cognitive domains
detect2 <- sirt::conf.detect( data=dat, score=wle1,
itemcluster=iteminfo$Cognitive.Domain )
## unweighted weighted
## DETECT 0.251 0.251
## ASSI 0.227 0.227
## RATIO 0.282 0.282
# DETECT for item format
detect3 <- sirt::conf.detect( data=dat, score=wle1,
itemcluster=iteminfo$Format )
## unweighted weighted
## DETECT 0.056 0.056
## ASSI 0.060 0.060
## RATIO 0.062 0.062
# DETECT for item blocks
detect4 <- sirt::conf.detect( data=dat, score=wle1,
itemcluster=iteminfo$Block )
## unweighted weighted
## DETECT 0.301 0.301
## ASSI 0.193 0.193
## RATIO 0.339 0.339
## End(Not run)
# Exploratory DETECT: Application of a cluster analysis employing the Ward method
detect5 <- sirt::expl.detect( data=dat, score=wle1,
nclusters=10, N.est=nrow(dat) )
# Plot cluster solution
pl <- graphics::plot( detect5$clusterfit, main="Cluster solution" )
stats::rect.hclust(detect5$clusterfit, k=4, border="red")
## Not run:
#############################################################################
# EXAMPLE 2: Big 5 data set (polytomous data)
#############################################################################
# attach Big5 Dataset
data(data.big5)
# select 6 items of each dimension
dat <- data.big5
dat <- dat[, 1:30]
# estimate person score by simply using a transformed sum score
score <- stats::qnorm( ( rowMeans( dat )+.5 ) / ( 30 + 1 ) )
# extract item cluster (Big 5 dimensions)
itemcluster <- substring( colnames(dat), 1, 1 )
# DETECT Item cluster
detect1 <- sirt::conf.detect( data=dat, score=score, itemcluster=itemcluster )
## unweighted weighted
## DETECT 1.256 1.256
## ASSI 0.384 0.384
## RATIO 0.597 0.597
# Exploratory DETECT
detect5 <- sirt::expl.detect( data=dat, score=score,
nclusters=9, N.est=nrow(dat) )
## DETECT (unweighted)
## Optimal Cluster Size is 6 (Maximum of DETECT Index)
## N.Cluster N.items N.est N.val size.cluster DETECT.est ASSI.est RATIO.est
## 1 2 30 500 0 6-24 1.073 0.246 0.510
## 2 3 30 500 0 6-10-14 1.578 0.457 0.750
## 3 4 30 500 0 6-10-11-3 1.532 0.444 0.729
## 4 5 30 500 0 6-8-11-2-3 1.591 0.462 0.757
## 5 6 30 500 0 6-8-6-2-5-3 1.610 0.499 0.766
## 6 7 30 500 0 6-3-6-2-5-5-3 1.557 0.476 0.740
## 7 8 30 500 0 6-3-3-2-3-5-5-3 1.540 0.462 0.732
## 8 9 30 500 0 6-3-3-2-3-5-3-3-2 1.522 0.444 0.724
# Plot Cluster solution
pl <- graphics::plot( detect5$clusterfit, main="Cluster solution" )
stats::rect.hclust(detect5$clusterfit, k=6, border="red")
#############################################################################
# EXAMPLE 3: DETECT index for unidimensional data
#############################################################################
set.seed(976)
N <- 1000
I <- 20
b <- sample( seq( -2, 2, len=I) )
dat <- sirt::sim.raschtype( stats::rnorm(N), b=b )
# estimate Rasch model and corresponding WLEs
mod1 <- TAM::tam.mml( dat )
wmod1 <- TAM::tam.wle(mod1)$theta
# define item cluster
itemcluster <- c( rep(1,5), rep(2,I-5) )
# compute DETECT statistic
detect1 <- sirt::conf.detect( data=dat, score=wmod1, itemcluster=itemcluster)
## unweighted weighted
## DETECT -0.184 -0.184
## ASSI -0.147 -0.147
## RATIO -0.226 -0.226
## MADCOV100 0.816 0.816
## MCOV100 -0.786 -0.786
## End(Not run)
Item Parameters Cultural Activities
Description
List with item parameters for cultural activities of Austrian students in the 9 Austrian federal states.
Usage
data(data.activity.itempars)
Format
The format is a list with number of students per group
(N
), item loadings (lambda
) and
item intercepts (nu
):
List of 3
$ N : 'table' int [1:9(1d)] 2580 5279 15131 14692 5525 11005 7080 ...
..- attr(*, "dimnames")=List of 1
.. ..$ : chr [1:9] "1" "2" "3" "4" ...
$ lambda: num [1:9, 1:5] 0.423 0.485 0.455 0.437 0.502 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:9] "country1" "country2" "country3" "country4" ...
.. ..$ : chr [1:5] "act1" "act2" "act3" "act4" ...
$ nu : num [1:9, 1:5] 1.65 1.53 1.7 1.59 1.7 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:9] "country1" "country2" "country3" "country4" ...
.. ..$ : chr [1:5] "act1" "act2" "act3" "act4" ...
BEFKI Dataset (Schroeders, Schipolowski, & Wilhelm, 2015)
Description
The synthetic dataset is based on the standardization sample of the Berlin Test of Fluid and Crystallized Intelligence (BEFKI, Wilhelm, Schroeders, & Schipolowski, 2014). The underlying sample consists of N=11,756 students from all German federal states (except for the smallest one) and all school types of the general educational system attending Grades 5 to 12. A detailed description of the study, the sample, and the measure is given in Schroeders, Schipolowski, and Wilhelm (2015).
Usage
data(data.befki)
data(data.befki_resp)
Format
The dataset
data.befki
contains 11756 students, nested within 581 classes.'data.frame': 11756 obs. of 12 variables:
$ idclass: int 1276 1276 1276 1276 1276 1276 1276 1276 1276 1276 ...
$ idstud : int 127601 127602 127603 127604 127605 127606 127607 127608 127609 127610 ...
$ grade : int 5 5 5 5 5 5 5 5 5 5 ...
$ gym : int 0 0 0 0 0 0 0 0 0 0 ...
$ female : int 0 1 0 0 0 0 1 0 0 0 ...
$ age : num 12.2 11.8 11.5 10.8 10.9 ...
$ sci : num -3.14 -3.44 -2.62 -2.16 -1.01 -1.91 -1.01 -4.13 -2.16 -3.44 ...
$ hum : num -1.71 -1.29 -2.29 -2.48 -0.65 -0.92 -1.71 -2.31 -1.99 -2.48 ...
$ soc : num -2.87 -3.35 -3.81 -2.35 -1.32 -1.11 -1.68 -2.96 -2.69 -3.35 ...
$ gfv : num -2.25 -2.19 -2.25 -1.17 -2.19 -3.05 -1.7 -2.19 -3.05 -1.7 ...
$ gfn : num -2.2 -1.85 -1.85 -1.85 -1.85 -0.27 -1.37 -2.58 -1.85 -3.13 ...
$ gff : num -0.91 -0.43 -1.17 -1.45 -0.61 -1.78 -1.17 -1.78 -1.78 -3.87 ...
The dataset
data.befki_resp
contains response indicators for observed data points in the datasetdata.befki
.num [1:11756, 1:12] 1 1 1 1 1 1 1 1 1 1 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:12] "idclass" "idstud" "grade" "gym" ...
Details
The procedure for generating this dataset is based on a factorization of the joint distribution. All variables are simulated from unidimensional conditional parametric regression models including several interaction and quadratic terms. The multilevel structure is approximated by including cluster means as predictors in the regression models.
Source
Synthetic dataset
References
Schroeders, U., Schipolowski, S., & Wilhelm, O. (2015). Age-related changes in the mean and covariance structure of fluid and crystallized intelligence in childhood and adolescence. Intelligence, 48, 15-29. doi:10.1016/j.intell.2014.10.006
Wilhelm, O., Schroeders, U., & Schipolowski, S. (2014). Berliner Test zur Erfassung fluider und kristalliner Intelligenz fuer die 8. bis 10. Jahrgangsstufe [Berlin test of fluid and crystallized intelligence for grades 8-10]. Goettingen: Hogrefe.
Dataset Big 5 from qgraph Package
Description
This is a Big 5 dataset from the qgraph package (Dolan, Oort, Stoel, & Wicherts, 2009). It contains 500 subjects on 240 items.
Usage
data(data.big5)
data(data.big5.qgraph)
Format
The format of data.big5 is:
num [1:500, 1:240] 1 0 0 0 0 1 1 2 0 1 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:240] "N1" "E2" "O3" "A4" ...
The format of data.big5.qgraph is:
num [1:500, 1:240] 2 3 4 4 5 2 2 1 4 2 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:240] "N1" "E2" "O3" "A4" ...
Details
In these datasets, there are 48 items for each dimension. The Big 5 dimensions are Neuroticism (N), Extraversion (E), Openness (O), Agreeableness (A) and Conscientiousness (C). Note that data.big5 differs from data.big5.qgraph in that the original items were recoded into three categories 0, 1 and 2.
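The recoding can be inspected directly; the following snippet (added for illustration, not part of the original manual) cross-tabulates both codings for the first item:
data(data.big5)
data(data.big5.qgraph)
table( data.big5[,"N1"], data.big5.qgraph[,"N1"] )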
Source
See big5 in the qgraph package.
References
Dolan, C. V., Oort, F. J., Stoel, R. D., & Wicherts, J. M. (2009). Testing measurement invariance in the target rotated multigroup exploratory factor model. Structural Equation Modeling, 16, 295-314.
Examples
## Not run:
# list of needed packages for the following examples
packages <- scan(what="character")
sirt TAM eRm CDM mirt ltm mokken psychotools psychomix
psych
# load packages. make an installation if necessary
miceadds::library_install(packages)
#############################################################################
# EXAMPLE 1: Unidimensional models openness scale
#############################################################################
data(data.big5)
# extract first 10 openness items
items <- which( substring( colnames(data.big5), 1, 1 )=="O" )[1:10]
dat <- data.big5[, items ]
I <- ncol(dat)
summary(dat)
## > colnames(dat)
## [1] "O3" "O8" "O13" "O18" "O23" "O28" "O33" "O38" "O43" "O48"
# descriptive statistics
psych::describe(dat)
#****************
# Model 1: Partial credit model
#****************
#-- M1a: rm.facets (in sirt)
m1a <- sirt::rm.facets( dat )
summary(m1a)
#-- M1b: tam.mml (in TAM)
m1b <- TAM::tam.mml( resp=dat )
summary(m1b)
#-- M1c: gdm (in CDM)
theta.k <- seq(-6,6,len=21)
m1c <- CDM::gdm( dat, irtmodel="1PL",theta.k=theta.k, skillspace="normal")
summary(m1c)
# compare results with loglinear skillspace
m1c2 <- CDM::gdm( dat, irtmodel="1PL",theta.k=theta.k, skillspace="loglinear")
summary(m1c2)
#-- M1d: PCM (in eRm)
m1d <- eRm::PCM( dat )
summary(m1d)
#-- M1e: gpcm (in ltm)
m1e <- ltm::gpcm( dat, constraint="1PL", control=list(verbose=TRUE))
summary(m1e)
#-- M1f: mirt (in mirt)
m1f <- mirt::mirt( dat, model=1, itemtype="1PL", verbose=TRUE)
summary(m1f)
coef(m1f)
#-- M1g: PCModel.fit (in psychotools)
mod1g <- psychotools::PCModel.fit(dat)
summary(mod1g)
plot(mod1g)
#****************
# Model 2: Generalized partial credit model
#****************
#-- M2a: rm.facets (in sirt)
m2a <- sirt::rm.facets( dat, est.a.item=TRUE)
summary(m2a)
# Note that in rm.facets the mean of item discriminations is fixed to 1
#-- M2b: tam.mml.2pl (in TAM)
m2b <- TAM::tam.mml.2pl( resp=dat, irtmodel="GPCM")
summary(m2b)
#-- M2c: gdm (in CDM)
m2c <- CDM::gdm( dat, irtmodel="2PL",theta.k=seq(-6,6,len=21),
skillspace="normal", standardized.latent=TRUE)
summary(m2c)
#-- M2d: gpcm (in ltm)
m2d <- ltm::gpcm( dat, control=list(verbose=TRUE))
summary(m2d)
#-- M2e: mirt (in mirt)
m2e <- mirt::mirt( dat, model=1, itemtype="GPCM", verbose=TRUE)
summary(m2e)
coef(m2e)
#****************
# Model 3: Nonparametric item response model
#****************
#-- M3a: ISOP and ADISOP model - isop.poly (in sirt)
m3a <- sirt::isop.poly( dat )
summary(m3a)
plot(m3a)
#-- M3b: Mokken scale analysis (in mokken)
# Scalability coefficients
mokken::coefH(dat)
# Assumption of monotonicity
monotonicity.list <- mokken::check.monotonicity(dat)
summary(monotonicity.list)
plot(monotonicity.list)
# Assumption of non-intersecting ISRFs using method restscore
restscore.list <- mokken::check.restscore(dat)
summary(restscore.list)
plot(restscore.list)
#****************
# Model 4: Graded response model
#****************
#-- M4a: mirt (in mirt)
m4a <- mirt::mirt( dat, model=1, itemtype="graded", verbose=TRUE)
print(m4a)
sirt::mirt.wrapper.coef(m4a)
#---- M4b: WLSMV estimation with cfa (in lavaan)
lavmodel <- "F=~ O3__O48
F ~~ 1*F
"
# transform lavaan syntax with lavaanify.IRT
lavmodel <- TAM::lavaanify.IRT( lavmodel, items=colnames(dat) )$lavaan.syntax
mod4b <- lavaan::cfa( data=as.data.frame(dat), model=lavmodel, std.lv=TRUE,
ordered=colnames(dat), parameterization="theta")
summary(mod4b, standardized=TRUE, fit.measures=TRUE, rsquare=TRUE)
coef(mod4b)
#****************
# Model 5: Normally distributed residuals
#****************
#---- M5a: cfa (in lavaan)
lavmodel <- "F=~ O3__O48
F ~~ 1*F
F ~ 0*1
O3__O48 ~ 1
"
lavmodel <- TAM::lavaanify.IRT( lavmodel, items=colnames(dat) )$lavaan.syntax
mod5a <- lavaan::cfa( data=as.data.frame(dat), model=lavmodel, std.lv=TRUE,
estimator="MLR" )
summary(mod5a, standardized=TRUE, fit.measures=TRUE, rsquare=TRUE)
#---- M5b: mirt (in mirt)
# create user defined function
name <- 'normal'
par <- c("d"=1, "a1"=0.8, "vy"=1)
est <- c(TRUE, TRUE,FALSE)
P.normal <- function(par,Theta,ncat){
d <- par[1]
a1 <- par[2]
vy <- par[3]
psi <- vy - a1^2
# expected values given Theta
mui <- a1*Theta[,1] + d
TP <- nrow(Theta)
probs <- matrix( NA, nrow=TP, ncol=ncat )
eps <- .01
for (cc in 1:ncat){
probs[,cc] <- stats::dnorm( cc, mean=mui, sd=sqrt( abs( psi + eps) ) )
}
psum <- matrix( rep(rowSums( probs ),each=ncat), nrow=TP, ncol=ncat, byrow=TRUE)
probs <- probs / psum
return(probs)
}
# create item response function
normal <- mirt::createItem(name, par=par, est=est, P=P.normal)
customItems <- list("normal"=normal)
itemtype <- rep( "normal",I)
# define parameters to be estimated
mod5b.pars <- mirt::mirt(dat, 1, itemtype=itemtype,
customItems=customItems, pars="values")
ind <- which( mod5b.pars$name=="vy")
vy <- apply( dat, 2, var, na.rm=TRUE )
mod5b.pars[ ind, "value" ] <- vy
ind <- which( mod5b.pars$name=="a1")
mod5b.pars[ ind, "value" ] <- .5* sqrt(vy)
ind <- which( mod5b.pars$name=="d")
mod5b.pars[ ind, "value" ] <- colMeans( dat, na.rm=TRUE )
# estimate model
mod5b <- mirt::mirt(dat, 1, itemtype=itemtype, customItems=customItems,
pars=mod5b.pars, verbose=TRUE )
sirt::mirt.wrapper.coef(mod5b)$coef
# some item plots
par(ask=TRUE)
plot(mod5b, type='trace', layout=c(1,1))
par(ask=FALSE)
# Alternatively:
sirt::mirt.wrapper.itemplot(mod5b)
## End(Not run)
Datasets from Borg and Staufenbiel (2007)
Description
Datasets from the book Lehrbuch Theorie und Methoden der Skalierung (Textbook of theory and methods of scaling) by Borg and Staufenbiel (2007).
Usage
data(data.bs07a)
Format
The dataset data.bs07a contains the data Gefechtsangst (combat anxiety; p. 130) with 8 of the original 9 items. The items are symptoms of anxiety in combat:
GF1: starkes Herzklopfen (strong heart pounding), GF2: flaues Gefuehl in der Magengegend (queasy feeling in the stomach), GF3: Schwaechegefuehl (feeling of weakness), GF4: Uebelkeitsgefuehl (nausea), GF5: Erbrechen (vomiting), GF6: Schuettelfrost (chills), GF7: in die Hose urinieren/einkoten (loss of bladder/bowel control), GF9: Gefuehl der Gelaehmtheit (feeling of paralysis).
The format is
'data.frame': 100 obs. of 9 variables:
$ idpatt: int 44 29 1 3 28 50 50 36 37 25 ...
$ GF1 : int 1 1 1 1 1 0 0 1 1 1 ...
$ GF2 : int 0 1 1 1 1 0 0 1 1 1 ...
$ GF3 : int 0 0 1 1 0 0 0 0 0 1 ...
$ GF4 : int 0 0 1 1 0 0 0 1 0 1 ...
$ GF5 : int 0 0 1 1 0 0 0 0 0 0 ...
$ GF6 : int 1 1 1 1 1 0 0 0 0 0 ...
$ GF7 : num 0 0 1 1 0 0 0 0 0 0 ...
$ GF9 : int 0 0 1 1 1 0 0 0 0 0 ...
[... further datasets from Borg and Staufenbiel (2007) omitted ...]
References
Borg, I., & Staufenbiel, T. (2007). Lehrbuch Theorie und Methoden der Skalierung. Bern: Hogrefe.
Examples
## Not run:
#############################################################################
# EXAMPLE 07a: Dataset Gefechtsangst
#############################################################################
data(data.bs07a)
dat <- data.bs07a
items <- grep( "GF", colnames(dat), value=TRUE )
#************************
# Model 1: Rasch model
mod1 <- TAM::tam.mml(dat[,items] )
summary(mod1)
TAM::IRT.WrightMap(mod1)
#************************
# Model 2: 2PL model
mod2 <- TAM::tam.mml.2pl(dat[,items] )
summary(mod2)
#************************
# Model 3: Latent class analysis (LCA) with two classes
tammodel <- "
ANALYSIS:
TYPE=LCA;
NCLASSES(2)
NSTARTS(5,10)
LAVAAN MODEL:
F=~ GF1__GF9
"
mod3 <- TAM::tamaan( tammodel, dat )
summary(mod3)
#************************
# Model 4: LCA with three classes
tammodel <- "
ANALYSIS:
TYPE=LCA;
NCLASSES(3)
NSTARTS(5,10)
LAVAAN MODEL:
F=~ GF1__GF9
"
mod4 <- TAM::tamaan( tammodel, dat )
summary(mod4)
#************************
# Model 5: Located latent class model (LOCLCA) with two classes
tammodel <- "
ANALYSIS:
TYPE=LOCLCA;
NCLASSES(2)
NSTARTS(5,10)
LAVAAN MODEL:
F=~ GF1__GF9
"
mod5 <- TAM::tamaan( tammodel, dat )
summary(mod5)
#************************
# Model 6: Located latent class model with three classes
tammodel <- "
ANALYSIS:
TYPE=LOCLCA;
NCLASSES(3)
NSTARTS(5,10)
LAVAAN MODEL:
F=~ GF1__GF9
"
mod6 <- TAM::tamaan( tammodel, dat )
summary(mod6)
#************************
# Model 7: Probabilistic Guttman model
mod7 <- sirt::prob.guttman( dat[,items] )
summary(mod7)
#-- model comparison
CDM::IRT.compareModels( mod1, mod2, mod3, mod4, mod5, mod6, mod7 )
## End(Not run)
Examples with Datasets from Eid and Schmidt (2014)
Description
Examples with datasets from Eid and Schmidt (2014), with illustrations using several R packages. The examples closely follow the online material of Hosoya (2014). All datasets are synthetic; they were resimulated from the originally available data.
Usage
data(data.eid.kap4)
data(data.eid.kap5)
data(data.eid.kap6)
data(data.eid.kap7)
Format
- data.eid.kap4 is the dataset from Chapter 4.
'data.frame': 193 obs. of 11 variables:
$ sex : int 0 0 0 0 0 0 1 0 0 1 ...
$ Freude_1: int 1 1 1 0 1 1 1 1 1 1 ...
$ Wut_1 : int 1 1 1 0 1 1 1 1 1 1 ...
$ Angst_1 : int 1 0 0 0 1 1 1 0 1 0 ...
$ Trauer_1: int 1 1 1 0 1 1 1 1 1 1 ...
$ Ueber_1 : int 1 1 1 0 1 1 0 1 1 1 ...
$ Trauer_2: int 0 1 1 1 1 1 1 1 1 0 ...
$ Angst_2 : int 0 0 1 0 0 1 0 0 0 0 ...
$ Wut_2 : int 1 1 1 1 1 1 1 1 1 1 ...
$ Ueber_2 : int 1 0 1 0 1 1 1 0 1 1 ...
$ Freude_2: int 1 1 1 0 1 1 1 1 1 1 ...
- data.eid.kap5 is the dataset from Chapter 5.
'data.frame': 499 obs. of 7 variables:
$ sex : int 0 0 0 0 1 1 1 0 0 0 ...
$ item_1: int 2 3 3 2 4 1 0 0 0 2 ...
$ item_2: int 1 1 4 1 3 3 2 1 2 3 ...
$ item_3: int 1 3 3 2 3 3 0 0 0 1 ...
$ item_4: int 2 4 3 4 3 3 3 2 0 2 ...
$ item_5: int 1 3 2 2 0 0 0 0 1 2 ...
$ item_6: int 4 3 4 3 4 3 2 1 1 3 ...
- data.eid.kap6 is the dataset from Chapter 6.
'data.frame': 238 obs. of 7 variables:
$ geschl: int 1 1 0 0 0 1 0 1 1 0 ...
$ item_1: int 3 3 3 3 2 0 1 4 3 3 ...
$ item_2: int 2 2 2 2 2 0 2 3 1 3 ...
$ item_3: int 2 2 1 3 2 0 0 3 1 3 ...
$ item_4: int 2 3 3 3 3 0 2 4 3 4 ...
$ item_5: int 1 2 1 2 2 0 1 2 2 2 ...
$ item_6: int 2 2 2 2 2 0 1 2 1 2 ...
- data.eid.kap7 is the dataset Emotionale Klarheit (emotional clarity) from Chapter 7.
'data.frame': 238 obs. of 9 variables:
$ geschl : int 1 0 1 1 0 1 0 1 0 1 ...
$ reakt_1: num 2.13 1.78 1.28 1.82 1.9 1.63 1.73 1.49 1.43 1.27 ...
$ reakt_2: num 1.2 1.73 0.95 1.5 1.99 1.75 1.58 1.71 1.41 0.96 ...
$ reakt_3: num 1.77 1.42 0.76 1.54 2.36 1.84 2.06 1.21 1.75 0.92 ...
$ reakt_4: num 2.18 1.28 1.39 1.82 2.09 2.15 2.1 1.13 1.71 0.78 ...
$ reakt_5: num 1.47 1.7 1.08 1.77 1.49 1.73 1.96 1.76 1.88 1.1 ...
$ reakt_6: num 1.63 0.9 0.82 1.63 1.79 1.37 1.79 1.11 1.27 1.06 ...
$ kla_th1: int 8 11 11 8 10 11 12 5 6 12 ...
$ kla_th2: int 7 11 12 8 10 11 12 5 8 11 ...
Source
The material and original datasets can be downloaded from http://www.hogrefe.de/buecher/lehrbuecher/psychlehrbuchplus/lehrbuecher/testtheorie-und-testkonstruktion/zusatzmaterial/.
References
Eid, M., & Schmidt, K. (2014). Testtheorie und Testkonstruktion. Goettingen: Hogrefe.
Hosoya, G. (2014). Einfuehrung in die Analyse testtheoretischer Modelle mit R. Available at http://www.hogrefe.de/buecher/lehrbuecher/psychlehrbuchplus/lehrbuecher/testtheorie-und-testkonstruktion/zusatzmaterial/.
Examples
## Not run:
miceadds::library_install("foreign")
#---- load some IRT packages in R
miceadds::library_install("TAM") # package (a)
miceadds::library_install("mirt") # package (b)
miceadds::library_install("sirt") # package (c)
miceadds::library_install("eRm") # package (d)
miceadds::library_install("ltm") # package (e)
miceadds::library_install("psychomix") # package (f)
#############################################################################
# EXAMPLES Ch. 4: Unidimensional IRT models | dichotomous data
#############################################################################
data(data.eid.kap4)
data0 <- data.eid.kap4
# alternatively, read the original data from an SPSS file; 'linkname'
# would contain the path to the file (see the Source section)
# data0 <- foreign::read.spss( linkname, to.data.frame=TRUE, use.value.labels=FALSE)
# extract items
dat <- data0[,2:11]
#*********************************************************
# Model 1: Rasch model
#*********************************************************
#-----------
#-- 1a: estimation with TAM package
# estimation with tam.mml
mod1a <- TAM::tam.mml(dat)
summary(mod1a)
# person parameters in TAM
pp1a <- TAM::tam.wle(mod1a)
# plot item response functions
plot(mod1a,export=FALSE,ask=TRUE)
# Infit and outfit in TAM
itemf1a <- TAM::tam.fit(mod1a)
itemf1a
# model fit
modf1a <- TAM::tam.modelfit(mod1a)
summary(modf1a)
#-----------
#-- 1b: estimation with mirt package
# estimation with mirt
mod1b <- mirt::mirt( dat, 1, itemtype="Rasch")
summary(mod1b)
print(mod1b)
# person parameters
pp1b <- mirt::fscores(mod1b, method="WLE")
# extract coefficients
sirt::mirt.wrapper.coef(mod1b)
# plot item response functions
plot(mod1b, type="trace" )
par(mfrow=c(1,1))
# item fit
itemf1b <- mirt::itemfit(mod1b)
itemf1b
# model fit
modf1b <- mirt::M2(mod1b)
modf1b
#-----------
#-- 1c: estimation with sirt package
# estimation with rasch.mml2
mod1c <- sirt::rasch.mml2(dat)
summary(mod1c)
# person parameters (EAP)
pp1c <- mod1c$person
# plot item response functions
plot(mod1c, ask=TRUE )
# model fit
modf1c <- sirt::modelfit.sirt(mod1c)
summary(modf1c)
#-----------
#-- 1d: estimation with eRm package
# estimation with RM
mod1d <- eRm::RM(dat)
summary(mod1d)
# estimation person parameters
pp1d <- eRm::person.parameter(mod1d)
summary(pp1d)
# plot item response functions
eRm::plotICC(mod1d)
# person-item map
eRm::plotPImap(mod1d)
# item fit
itemf1d <- eRm::itemfit(pp1d)
# person fit
persf1d <- eRm::personfit(pp1d)
#-----------
#-- 1e: estimation with ltm package
# estimation with rasch
mod1e <- ltm::rasch(dat)
summary(mod1e)
# estimation person parameters
pp1e <- ltm::factor.scores(mod1e)
# plot item response functions
plot(mod1e)
# item fit
itemf1e <- ltm::item.fit(mod1e)
# person fit
persf1e <- ltm::person.fit(mod1e)
# goodness of fit with Bootstrap
modf1e <- ltm::GoF.rasch(mod1e,B=20) # use more bootstrap samples
modf1e
#*********************************************************
# Model 2: 2PL model
#*********************************************************
#-----------
#-- 2a: estimation with TAM package
# estimation
mod2a <- TAM::tam.mml.2pl(dat)
summary(mod2a)
# model fit
modf2a <- TAM::tam.modelfit(mod2a)
summary(modf2a)
# item response functions
plot(mod2a, export=FALSE, ask=TRUE)
# model comparison
anova(mod1a,mod2a)
#-----------
#-- 2b: estimation with mirt package
# estimation
mod2b <- mirt::mirt(dat,1,itemtype="2PL")
summary(mod2b)
print(mod2b)
sirt::mirt.wrapper.coef(mod2b)
# model fit
modf2b <- mirt::M2(mod2b)
modf2b
#-----------
#-- 2c: estimation with sirt package
I <- ncol(dat)
# estimation
mod2c <- sirt::rasch.mml2(dat,est.a=1:I)
summary(mod2c)
# model fit
modf2c <- sirt::modelfit.sirt(mod2c)
summary(modf2c)
#-----------
#-- 2e: estimation with ltm package
# estimation
mod2e <- ltm::ltm(dat ~ z1 )
summary(mod2e)
# item response functions
plot(mod2e)
#*********************************************************
# Model 3: Mixture Rasch model
#*********************************************************
#-----------
#-- 3a: estimation with TAM package
# avoid "_" in column names if the "__" operator is used in
# the tamaan syntax
dat1 <- dat
colnames(dat1) <- gsub("_", "", colnames(dat1) )
# define tamaan model
tammodel <- "
ANALYSIS:
TYPE=MIXTURE ;
NCLASSES(2);
NSTARTS(20,25); # 20 random starts with 25 initial iterations each
LAVAAN MODEL:
F=~ Freude1__Freude2
F ~~ F
ITEM TYPE:
ALL(Rasch);
"
mod3a <- TAM::tamaan( tammodel, resp=dat1 )
summary(mod3a)
# extract item parameters
ipars <- mod3a$itempartable_MIXTURE[ 1:10, ]
plot( 1:10, ipars[,3], type="o", ylim=range( ipars[,3:4] ), pch=16,
xlab="Item", ylab="Item difficulty")
lines( 1:10, ipars[,4], type="l", col=2, lty=2)
points( 1:10, ipars[,4], col=2, pch=2)
#-----------
#-- 3f: estimation with psychomix package
# estimation
mod3f <- psychomix::raschmix( as.matrix(dat), k=2, scores="meanvar")
summary(mod3f)
# plot class-specific item difficulties
plot(mod3f)
#############################################################################
# EXAMPLES Ch. 5: Unidimensional IRT models | polytomous data
#############################################################################
data(data.eid.kap5)
data0 <- data.eid.kap5
# extract items
dat <- data0[,2:7]
#*********************************************************
# Model 1: Partial credit model
#*********************************************************
#-----------
#-- 1a: estimation with TAM package
# estimation with tam.mml
mod1a <- TAM::tam.mml(dat)
summary(mod1a)
# person parameters in TAM
pp1a <- TAM::tam.wle(mod1a)
# plot item response functions
plot(mod1a,export=FALSE,ask=TRUE)
# Infit and outfit in TAM
itemf1a <- TAM::tam.fit(mod1a)
itemf1a
# model fit
modf1a <- TAM::tam.modelfit(mod1a)
summary(modf1a)
#-----------
#-- 1b: estimation with mirt package
# estimation with mirt
mod1b <- mirt::mirt( dat, 1, itemtype="Rasch")
summary(mod1b)
print(mod1b)
sirt::mirt.wrapper.coef(mod1b)
# plot item response functions
plot(mod1b, type="trace" )
par(mfrow=c(1,1))
# item fit
itemf1b <- mirt::itemfit(mod1b)
itemf1b
#-----------
#-- 1c: estimation with sirt package
# estimation with rm.facets
mod1c <- sirt::rm.facets(dat)
summary(mod1c)
# compare with the partial credit model results from TAM
summary(mod1a)
#-----------
#-- 1d: estimation with eRm package
# estimation
mod1d <- eRm::PCM(dat)
summary(mod1d)
# plot item response functions
eRm::plotICC(mod1d)
# person-item map
eRm::plotPImap(mod1d)
# item fit
pp1d <- eRm::person.parameter(mod1d)
itemf1d <- eRm::itemfit(pp1d)
#-----------
#-- 1e: estimation with ltm package
# estimation
mod1e <- ltm::gpcm(dat, constraint="1PL")
summary(mod1e)
# plot item response functions
plot(mod1e)
#*********************************************************
# Model 2: Generalized partial credit model
#*********************************************************
#-----------
#-- 2a: estimation with TAM package
# estimation with tam.mml.2pl
mod2a <- TAM::tam.mml.2pl(dat, irtmodel="GPCM")
summary(mod2a)
# model fit
modf2a <- TAM::tam.modelfit(mod2a)
summary(modf2a)
#-----------
#-- 2b: estimation with mirt package
# estimation
mod2b <- mirt::mirt( dat, 1, itemtype="gpcm")
summary(mod2b)
print(mod2b)
sirt::mirt.wrapper.coef(mod2b)
#-----------
#-- 2c: estimation with sirt package
# estimation with rm.facets
mod2c <- sirt::rm.facets(dat, est.a.item=TRUE)
summary(mod2c)
#-----------
#-- 2e: estimation with ltm package
# estimation
mod2e <- ltm::gpcm(dat)
summary(mod2e)
plot(mod2e)
## End(Not run)
Dataset European Social Survey 2005
Description
This dataset contains item loadings (lambda) and item intercepts (nu) for 26 countries from the European Social Survey (ESS 2005; see Asparouhov & Muthen, 2014).
Usage
data(data.ess2005)
Format
The format of the dataset is:
List of 2
$ lambda: num [1:26, 1:4] 0.688 0.721 0.72 0.687 0.625 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:4] "ipfrule" "ipmodst" "ipbhprp" "imptrad"
$ nu : num [1:26, 1:4] 3.26 2.52 3.41 2.84 2.79 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:4] "ipfrule" "ipmodst" "ipbhprp" "imptrad"
References
Asparouhov, T., & Muthen, B. (2014). Multiple-group factor analysis alignment. Structural Equation Modeling, 21(4), 1-14. doi:10.1080/10705511.2014.919210
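Examples
A minimal sketch (added for illustration; not part of the original manual) that feeds these parameters into invariance alignment:
## Not run:
data(data.ess2005)
lambda <- data.ess2005$lambda
nu <- data.ess2005$nu
# alignment of loadings and intercepts across the 26 countries
mod <- sirt::invariance.alignment( lambda=lambda, nu=nu )
summary(mod)
## End(Not run)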
C-Test Datasets
Description
Some datasets of C-tests are provided. The dataset data.g308
was used in Schroeders, Robitzsch and Schipolowski (2014).
Usage
data(data.g308)
Format
The dataset data.g308 is a C-test containing 20 items; it was used in Schroeders, Robitzsch and Schipolowski (2014) and has the following format:
'data.frame': 747 obs. of 21 variables:
$ id : int 1 2 3 4 5 6 7 8 9 10 ...
$ G30801: int 1 1 1 1 1 0 0 1 1 1 ...
$ G30802: int 1 1 1 1 1 1 1 1 1 1 ...
$ G30803: int 1 1 1 1 1 1 1 1 1 1 ...
$ G30804: int 1 1 1 1 1 0 1 1 1 1 ...
[...]
$ G30817: int 0 0 0 0 1 0 1 0 1 0 ...
$ G30818: int 0 0 1 0 0 0 0 1 1 0 ...
$ G30819: int 1 1 1 1 0 0 1 1 1 0 ...
$ G30820: int 1 1 1 1 0 0 0 1 1 0 ...
References
Schroeders, U., Robitzsch, A., & Schipolowski, S. (2014). A comparison of different psychometric approaches to modeling testlet structures: An example with C-tests. Journal of Educational Measurement, 51(4), 400-418.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Dataset G308 from Schroeders et al. (2014)
#############################################################################
data(data.g308)
dat <- data.g308
library(TAM)
library(sirt)
# define testlets
testlet <- c(1, 1, 2, 2, 2, 2, 2, 2, 3, 3, 4, 4, 4, 4, 4, 5, 5, 6, 6, 6)
#****************************************
#*** Model 1: Rasch model
mod1 <- TAM::tam.mml(resp=dat, control=list(maxiter=300, snodes=1500))
summary(mod1)
#****************************************
#*** Model 2: Rasch testlet model
# testlets are dimensions, assign items to Q-matrix
TT <- length(unique(testlet))
Q <- matrix(0, nrow=ncol(dat), ncol=TT + 1)
Q[,1] <- 1 # First dimension constitutes g-factor
for (tt in 1:TT){Q[testlet==tt, tt+1] <- 1}
# In a testlet model, all dimensions are mutually uncorrelated, that is,
# all pairwise correlations are set to 0, which can be accomplished
# with the 'variance.fixed' argument
variance.fixed <- cbind(t( utils::combn(TT+1,2)), 0)
mod2 <- TAM::tam.mml(resp=dat, Q=Q, variance.fixed=variance.fixed,
control=list(snodes=1500, maxiter=300))
summary(mod2)
#****************************************
#*** Model 3: Partial credit model
scores <- list()
testlet.names <- NULL
dat.pcm <- NULL
for (tt in 1:max(testlet) ){
scores[[tt]] <- rowSums (dat[, testlet==tt, drop=FALSE])
dat.pcm <- c(dat.pcm, list(c(scores[[tt]])))
testlet.names <- append(testlet.names, paste0("testlet",tt) )
}
dat.pcm <- as.data.frame(dat.pcm)
colnames(dat.pcm) <- testlet.names
mod3 <- TAM::tam.mml(resp=dat.pcm, control=list(snodes=1500, maxiter=300) )
summary(mod3)
#****************************************
#*** Model 4: Copula model
mod4 <- sirt::rasch.copula2 (dat=dat, itemcluster=testlet)
summary(mod4)
## End(Not run)
Dataset for Invariance Testing with 4 Groups
Description
Dataset for invariance testing with 4 groups.
Usage
data(data.inv4gr)
Format
A data frame with 4000 observations on the following 12 variables. The first variable is a group identifier, the other variables are items.
group: A group identifier
I01: a numeric vector
I02: a numeric vector
I03: a numeric vector
I04: a numeric vector
I05: a numeric vector
I06: a numeric vector
I07: a numeric vector
I08: a numeric vector
I09: a numeric vector
I10: a numeric vector
I11: a numeric vector
Source
Simulated dataset
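Examples
A minimal sketch (not part of the original manual) of a multiple-group Rasch model as a starting point for invariance testing; it assumes that the group argument of TAM::tam.mml is used for the grouping variable.
## Not run:
data(data.inv4gr)
dat <- data.inv4gr
# multiple-group 1PL model; the group identifier is in the first column
mod <- TAM::tam.mml( resp=dat[,-1], group=dat$group )
summary(mod)
## End(Not run)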
Dataset 'Liking For Science'
Description
Dataset 'Liking for science' published by Wright and Masters (1982).
Usage
data(data.liking.science)
Format
The format is:
num [1:75, 1:24] 1 2 2 1 1 1 2 2 0 2 ...
- attr(*, "dimnames")=List of 2
..$ : NULL
..$ : chr [1:24] "LS01" "LS02" "LS03" "LS04" ...
References
Wright, B. D., & Masters, G. N. (1982). Rating scale analysis. Chicago: MESA Press.
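Examples
A minimal sketch (not part of the original manual) fitting a partial credit model to the polytomous responses:
## Not run:
data(data.liking.science)
dat <- as.data.frame(data.liking.science)
# partial credit model (the TAM default for polytomous data)
mod <- TAM::tam.mml(dat)
summary(mod)
## End(Not run)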
Longitudinal Dataset
Description
This dataset contains 200 observations on 12 items. Six items (I1T1, ..., I6T1) were administered at measurement occasion T1 and six items at T2 (I3T2, ..., I8T2). There were 4 anchor items (I3, I4, I5 and I6) which were presented at both time points. The first column in the dataset contains the student identifier.
Usage
data(data.long)
Format
The format of the dataset is
'data.frame': 200 obs. of 13 variables:
$ idstud: int 1001 1002 1003 1004 1005 1006 1007 1008 1009 1010 ...
$ I1T1 : int 1 1 1 1 1 1 1 0 1 1 ...
$ I2T1 : int 0 0 1 1 1 1 0 1 1 1 ...
$ I3T1 : int 1 0 1 1 0 1 0 0 0 0 ...
$ I4T1 : int 1 0 0 1 0 0 0 0 1 1 ...
$ I5T1 : int 1 0 0 1 0 0 0 0 1 0 ...
$ I6T1 : int 1 0 0 0 0 0 0 0 0 0 ...
$ I3T2 : int 1 1 0 0 1 1 1 1 0 1 ...
$ I4T2 : int 1 1 0 0 1 1 0 0 0 1 ...
$ I5T2 : int 1 0 1 1 1 1 1 0 1 1 ...
$ I6T2 : int 1 1 0 0 0 0 0 0 0 1 ...
$ I7T2 : int 1 0 0 0 0 0 0 0 0 1 ...
$ I8T2 : int 0 0 0 0 1 0 0 0 0 0 ...
Examples
## Not run:
data(data.long)
dat <- data.long
dat <- dat[,-1]
I <- ncol(dat)
#*************************************************
# Model 1: 2-dimensional Rasch model
#*************************************************
# define Q-matrix
Q <- matrix(0,I,2)
Q[1:6,1] <- 1
Q[7:12,2] <- 1
rownames(Q) <- colnames(dat)
colnames(Q) <- c("T1","T2")
# vector with same items
itemnr <- as.numeric( substring( colnames(dat),2,2) )
# fix mean at T2 to zero
mu.fixed <- cbind( 2,0 )
#--- M1a: rasch.mml2 (in sirt)
mod1a <- sirt::rasch.mml2(dat, Q=Q, est.b=itemnr, mu.fixed=mu.fixed)
summary(mod1a)
#--- M1b: smirt (in sirt)
mod1b <- sirt::smirt(dat, Qmatrix=Q, irtmodel="comp", est.b=itemnr,
mu.fixed=mu.fixed )
#--- M1c: tam.mml (in TAM)
# assume equal item difficulty of I3T1 and I3T2, I4T1 and I4T2, ...
# create draft design matrix and modify it
A <- TAM::designMatrices(resp=dat)$A
dimnames(A)[[1]] <- colnames(dat)
## > str(A)
## num [1:12, 1:2, 1:12] 0 0 0 0 0 0 0 0 0 0 ...
## - attr(*, "dimnames")=List of 3
## ..$ : chr [1:12] "Item01" "Item02" "Item03" "Item04" ...
## ..$ : chr [1:2] "Category0" "Category1"
## ..$ : chr [1:12] "I1T1" "I2T1" "I3T1" "I4T1" ...
A1 <- A[,, c(1:6, 11:12 ) ]
A1[7,2,3] <- -1 # difficulty(I3T1)=difficulty(I3T2)
A1[8,2,4] <- -1 # I4T1=I4T2
A1[9,2,5] <- A1[10,2,6] <- -1
dimnames(A1)[[3]] <- substring( dimnames(A1)[[3]],1,2)
## > A1[,2,]
## I1 I2 I3 I4 I5 I6 I7 I8
## I1T1 -1 0 0 0 0 0 0 0
## I2T1 0 -1 0 0 0 0 0 0
## I3T1 0 0 -1 0 0 0 0 0
## I4T1 0 0 0 -1 0 0 0 0
## I5T1 0 0 0 0 -1 0 0 0
## I6T1 0 0 0 0 0 -1 0 0
## I3T2 0 0 -1 0 0 0 0 0
## I4T2 0 0 0 -1 0 0 0 0
## I5T2 0 0 0 0 -1 0 0 0
## I6T2 0 0 0 0 0 -1 0 0
## I7T2 0 0 0 0 0 0 -1 0
## I8T2 0 0 0 0 0 0 0 -1
# estimate model
# set intercept of second dimension (T2) to zero
beta.fixed <- cbind( 1, 2, 0 )
mod1c <- TAM::tam.mml( resp=dat, Q=Q, A=A1, beta.fixed=beta.fixed)
summary(mod1c)
#*************************************************
# Model 2: 2-dimensional 2PL model
#*************************************************
# set variance at T2 to 1
variance.fixed <- cbind(2,2,1)
# M2a: rasch.mml2 (in sirt)
mod2a <- sirt::rasch.mml2(dat, Q=Q, est.b=itemnr, est.a=itemnr, mu.fixed=mu.fixed,
variance.fixed=variance.fixed, mmliter=100)
summary(mod2a)
#*************************************************
# Model 3: Concurrent calibration by assuming invariant item parameters
#*************************************************
library(mirt) # use mirt for concurrent calibration
data(data.long)
dat <- data.long[,-1]
I <- ncol(dat)
# create user defined function for between item dimensionality 4PL model
name <- "4PLbw"
par <- c("low"=0,"upp"=1,"a"=1,"d"=0,"dimItem"=1)
est <- c(TRUE, TRUE,TRUE,TRUE,FALSE)
# item response function
irf <- function(par,Theta,ncat){
low <- par[1]
upp <- par[2]
a <- par[3]
d <- par[4]
dimItem <- par[5]
P1 <- low + ( upp - low ) * plogis( a*Theta[,dimItem] + d )
cbind(1-P1, P1)
}
# create item response function
fourPLbetw <- mirt::createItem(name, par=par, est=est, P=irf)
head(dat)
# create mirt model (use variable names in mirt.model)
mirtsyn <- "
T1=I1T1,I2T1,I3T1,I4T1,I5T1,I6T1
T2=I3T2,I4T2,I5T2,I6T2,I7T2,I8T2
COV=T1*T2, T2*T2
MEAN=T1
CONSTRAIN=(I3T1,I3T2,d),(I4T1,I4T2,d),(I5T1,I5T2,d),(I6T1,I6T2,d),
(I3T1,I3T2,a),(I4T1,I4T2,a),(I5T1,I5T2,a),(I6T1,I6T2,a)
"
# create mirt model
mirtmodel <- mirt::mirt.model( mirtsyn, itemnames=colnames(dat) )
# define parameters to be estimated
mod3.pars <- mirt::mirt(dat, mirtmodel$model, rep( "4PLbw",I),
customItems=list("4PLbw"=fourPLbetw), pars="values")
# select dimensions
ind <- intersect( grep("T2",mod3.pars$item), which( mod3.pars$name=="dimItem" ) )
mod3.pars[ind,"value"] <- 2
# set item parameters low and upp to non-estimated
ind <- which( mod3.pars$name %in% c("low","upp") )
mod3.pars[ind,"est"] <- FALSE
# estimate 2PL model
mod3 <- mirt::mirt(dat, mirtmodel$model, itemtype=rep( "4PLbw",I),
customItems=list("4PLbw"=fourPLbetw), pars=mod3.pars, verbose=TRUE,
technical=list(NCYCLES=50) )
sirt::mirt.wrapper.coef(mod3)
#****** estimate model in lavaan
library(lavaan)
# specify syntax
lavmodel <- "
#**** T1
F1=~ a1*I1T1+a2*I2T1+a3*I3T1+a4*I4T1+a5*I5T1+a6*I6T1
I1T1 | b1*t1 ; I2T1 | b2*t1 ; I3T1 | b3*t1 ; I4T1 | b4*t1
I5T1 | b5*t1 ; I6T1 | b6*t1
F1 ~~ 1*F1
#**** T2
F2=~ a3*I3T2+a4*I4T2+a5*I5T2+a6*I6T2+a7*I7T2+a8*I8T2
I3T2 | b3*t1 ; I4T2 | b4*t1 ; I5T2 | b5*t1 ; I6T2 | b6*t1
I7T2 | b7*t1 ; I8T2 | b8*t1
F2 ~~ NA*F2
F2 ~ 1
#*** covariance
F1 ~~ F2
"
# estimate model using theta parameterization
mod3lav <- lavaan::cfa( data=dat, model=lavmodel,
std.lv=TRUE, ordered=colnames(dat), parameterization="theta")
summary(mod3lav, standardized=TRUE, fit.measures=TRUE, rsquare=TRUE)
#*************************************************
# Model 4: Linking with items of different item slope groups
#*************************************************
data(data.long)
dat <- data.long
# dataset for T1
dat1 <- dat[, grep( "T1", colnames(dat) ) ]
colnames(dat1) <- gsub("T1","", colnames(dat1) )
# dataset for T2
dat2 <- dat[, grep( "T2", colnames(dat) ) ]
colnames(dat2) <- gsub("T2","", colnames(dat2) )
# 2PL model with slope groups T1
mod1 <- sirt::rasch.mml2( dat1, est.a=c( rep(1,2), rep(2,4) ) )
summary(mod1)
# 2PL model with slope groups T2
mod2 <- sirt::rasch.mml2( dat2, est.a=c( rep(1,4), rep(2,2) ) )
summary(mod2)
#------- Link 1: Haberman Linking
# collect item parameters
dfr1 <- data.frame( "study1", mod1$item$item, mod1$item$a, mod1$item$b )
dfr2 <- data.frame( "study2", mod2$item$item, mod2$item$a, mod2$item$b )
colnames(dfr2) <- colnames(dfr1) <- c("study", "item", "a", "b" )
itempars <- rbind( dfr1, dfr2 )
# Linking
link1 <- sirt::linking.haberman(itempars=itempars)
#------- Link 2: Invariance alignment method
# create objects for invariance.alignment
nu <- rbind( c(mod1$item$thresh,NA,NA), c(NA,NA,mod2$item$thresh) )
lambda <- rbind( c(mod1$item$a,NA,NA), c(NA,NA,mod2$item$a ) )
colnames(lambda) <- colnames(nu) <- paste0("I",1:8)
rownames(lambda) <- rownames(nu) <- c("T1", "T2")
# Linking
link2a <- sirt::invariance.alignment( lambda, nu )
summary(link2a)
## End(Not run)
Datasets for Local Structural Equation Models / Moderated Factor Analysis
Description
Datasets for local structural equation models or moderated factor analysis.
Usage
data(data.lsem01)
data(data.lsem02)
data(data.lsem03)
Format
The dataset data.lsem01 has the following structure:
'data.frame': 989 obs. of 6 variables:
$ age: num 4 4 4 4 4 4 4 4 4 4 ...
$ v1 : num 1.83 2.38 1.85 4.53 -0.04 4.35 2.38 1.83 4.81 2.82 ...
$ v2 : num 6.06 9.08 7.41 8.24 6.18 7.4 6.54 4.28 6.43 7.6 ...
$ v3 : num 1.42 3.05 6.42 -1.05 -1.79 4.06 -0.17 -2.64 0.84 6.42 ...
$ v4 : num 3.84 4.24 3.24 3.36 2.31 6.07 4 5.93 4.4 3.49 ...
$ v5 : num 7.84 7.51 6.62 8.02 7.12 7.99 7.25 7.62 7.66 7.03 ...
The dataset data.lsem02 is a slightly perturbed dataset of the Woodcock-Johnson III (WJ-III) Tests of Cognitive Abilities used in Hildebrandt et al. (2016) and has the following structure:
'data.frame': 1129 obs. of 8 variables:
$ age : int 4 4 4 4 4 4 4 4 4 4 ...
$ gcw : num -3.53 -3.73 -3.77 -3.84 -4.26 -4.6 -3.66 -4.31 -4.46 -3.64 ...
$ gvw : num -1.98 -1.35 -1.66 -3.24 -1.17 -2.78 -2.97 -3.88 -3.22 -0.68 ...
$ gfw : num -2.49 -2.41 -4.48 -4.17 -4.43 -5.06 -3.94 -3.66 -3.7 -2.74 ...
$ gsw : num -4.85 -5.05 -5.66 -4.3 -5.23 -5.63 -4.91 -5.75 -6.29 -5.47 ...
$ gsmw: num -2.99 -1.13 -4.21 -3.59 -3.79 -4.77 -2.98 -4.48 -2.99 -3.83 ...
$ glrw: num -2.49 -2.91 -3.45 -2.91 -3.31 -3.78 -3.5 -3.96 -2.97 -3.14 ...
$ gaw : num -3.22 -3.77 -3.54 -3.6 -3.22 -3.5 -1.27 -2.08 -2.23 -3.25 ...
The dataset data.lsem03 is a synthetic dataset of the SON-R application used in Hueluer et al. (2011) and has the following structure:
'data.frame': 1027 obs. of 10 variables:
$ id : num 10001 10002 10003 10004 10005 ...
$ female : int 0 0 0 0 0 0 0 0 0 0 ...
$ age : num 2.62 2.65 2.66 2.67 2.68 2.68 2.68 2.69 2.71 2.71 ...
$ age_group: int 1 1 1 1 1 1 1 1 1 1 ...
$ p1 : num -1.98 -1.98 -1.67 -2.29 -1.67 -1.98 -2.29 -1.98 -2.6 -1.67 ...
$ p2 : num -1.51 -1.51 -0.55 -1.84 -1.51 -1.84 -2.16 -1.84 -2.48 -1.84 ...
$ p3 : num -1.4 -2.31 -1.1 -2 -1.4 -1.7 -2.31 -1.4 -2.31 -0.79 ...
$ r1 : num -1.46 -1.14 -0.49 -2.11 -1.46 -1.46 -2.11 -1.46 -2.75 -1.78 ...
$ r2 : num -2.67 -1.74 0.74 -1.74 -0.81 -1.43 -2.05 -1.43 -1.74 -1.12 ...
$ r3 : num -1.64 -1.64 -1.64 -0.9 -1.27 -3.11 -2.74 -1.64 -2.37 -1.27 ...
The subtests Mosaics (p1), Puzzles (p2), and Patterns (p3) constitute the performance subscale; the subtests Categories (r1), Analogies (r2), and Situations (r3) constitute the reasoning subscale.
References
Hildebrandt, A., Luedtke, O., Robitzsch, A., Sommer, C., & Wilhelm, O. (2016). Exploring factor model parameters across continuous variables with local structural equation models. Multivariate Behavioral Research, 51(2-3), 257-278. doi:10.1080/00273171.2016.1142856
Hueluer, G., Wilhelm, O., & Robitzsch, A. (2011). Intelligence differentiation in early childhood. Journal of Individual Differences, 32(3), 170-179. doi:10.1027/1614-0001/a000049
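Examples
A minimal sketch (not part of the original manual) of a local structural equation model with age as moderator; the moderator grid and bandwidth are illustrative assumptions.
## Not run:
data(data.lsem01)
dat <- data.lsem01
# unidimensional factor model for v1, ..., v5
lavmodel <- "F=~ v1+v2+v3+v4+v5
     F ~~ 1*F"
# focal points for the moderator age (assumed grid)
moderator.grid <- seq( 4, 23, 1 )
mod <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
          lavmodel=lavmodel, h=2 )
summary(mod)
## End(Not run)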
Dataset Mathematics
Description
This is an example dataset involving Mathematics items for German fourth graders. Items are classified into several domains and subdomains (see Section Format). The dataset contains 664 students on 30 items.
Usage
data(data.math)
Format
The dataset is a list. The list element data contains the dataset with the demographic variables student ID (idstud) and a dummy variable for female students (female). The remaining variables (starting with M in the name) are the mathematics items.
The item metadata are included in the list element item, which contains the item name (item) and the testlet label (testlet). An item not included in a testlet is indicated by NA. Each item is allocated to one and only one competence domain (domain).
The format is:
List of 2
$ data:'data.frame':
..$ idstud: int [1:664] 1001 1002 1003 ...
..$ female: int [1:664] 1 1 0 0 1 1 1 0 0 1 ...
..$ MA1 : int [1:664] 1 1 1 0 0 1 1 1 1 1 ...
..$ MA2 : int [1:664] 1 1 1 1 1 0 0 0 0 1 ...
..$ MA3 : int [1:664] 1 1 0 0 0 0 0 1 0 0 ...
..$ MA4 : int [1:664] 0 1 1 1 0 0 1 0 0 0 ...
..$ MB1 : int [1:664] 0 1 0 1 0 0 0 0 0 1 ...
..$ MB2 : int [1:664] 1 1 1 1 0 1 0 1 0 0 ...
..$ MB3 : int [1:664] 1 1 1 1 0 0 0 1 0 1 ...
[...]
..$ MH3 : int [1:664] 1 1 0 1 0 0 1 0 1 0 ...
..$ MH4 : int [1:664] 0 1 1 1 0 0 0 0 1 0 ...
..$ MI1 : int [1:664] 1 1 0 1 0 1 0 0 1 0 ...
..$ MI2 : int [1:664] 1 1 0 0 0 1 1 0 1 1 ...
..$ MI3 : int [1:664] 0 1 0 1 0 0 0 0 0 0 ...
$ item:'data.frame':
..$ item : Factor w/ 30 levels "MA1","MA2","MA3",..: 1 2 3 4 5 ...
..$ testlet : Factor w/ 9 levels "","MA","MB","MC",..: 2 2 2 2 3 3 ...
..$ domain : Factor w/ 3 levels "arithmetic","geometry",..: 1 1 1 ...
..$ subdomain: Factor w/ 9 levels "","addition",..: 2 2 2 2 7 7 ...
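Examples
A minimal sketch (not part of the original manual) fitting a Rasch model to the mathematics items:
## Not run:
data(data.math)
dat <- data.math$data
items <- grep( "^M", colnames(dat), value=TRUE )
mod <- sirt::rasch.mml2( dat[,items] )
summary(mod)
## End(Not run)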
Some Datasets from McDonald's Test Theory Book
Description
Some datasets from McDonald (1999), especially related to using NOHARM for item response modeling. See Examples below.
Usage
data(data.mcdonald.act15)
data(data.mcdonald.LSAT6)
data(data.mcdonald.rape)
Format
The format of the ACT15 data data.mcdonald.act15 is:
num [1:15, 1:15] 0.49 0.44 0.38 0.3 0.29 0.13 0.23 0.16 0.16 0.23 ...
- attr(*, "dimnames")=List of 2
..$ : chr [1:15] "A01" "A02" "A03" "A04" ...
..$ : chr [1:15] "A01" "A02" "A03" "A04" ...
The dataset (which is the product-moment covariance matrix) is obtained from Ch. 12 in McDonald (1999).
The format of the LSAT6 data data.mcdonald.LSAT6 is:
'data.frame': 1004 obs. of 5 variables:
$ L1: int 0 0 0 0 0 0 0 0 0 0 ...
$ L2: int 0 0 0 0 0 0 0 0 0 0 ...
$ L3: int 0 0 0 0 0 0 0 0 0 0 ...
$ L4: int 0 0 0 0 0 0 0 0 0 1 ...
$ L5: int 0 0 0 1 1 1 1 1 1 0 ...
The dataset is obtained from Ch. 6 in McDonald (1999).
The format of the rape myth scale data data.mcdonald.rape is:
List of 2
$ lambda: num [1:2, 1:19] 1.13 0.88 0.85 0.77 0.79 0.55 1.12 1.01 0.99 0.79 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:2] "male" "female"
.. ..$ : chr [1:19] "I1" "I2" "I3" "I4" ...
$ nu : num [1:2, 1:19] 2.88 1.87 3.12 2.32 2.13 1.43 3.79 2.6 3.01 2.11 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : chr [1:2] "male" "female"
.. ..$ : chr [1:19] "I1" "I2" "I3" "I4" ...
The dataset is obtained from Ch. 15 in McDonald (1999).
Source
Tables in McDonald (1999)
References
McDonald, R. P. (1999). Test theory: A unified treatment. Psychology Press.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: LSAT6 data | Chapter 6 McDonald (1999)
#############################################################################
data(data.mcdonald.LSAT6)
dat <- data.mcdonald.LSAT6
# define the path where the NOHARM executable is located
noharm.path <- "c:/NOHARM"
#************
# Model 1: 2-parameter normal ogive model
#++ NOHARM estimation
I <- ncol(dat)
# covariance structure
P.pattern <- matrix( 0, ncol=1, nrow=1 )
P.init <- 1+0*P.pattern
# loading matrix: all loadings are estimated
F.pattern <- matrix( 1, nrow=I, ncol=1 )
F.init <- F.pattern
# estimate model
mod1a <- sirt::R2noharm( dat=dat, model.type="CFA", F.pattern=F.pattern,
F.init=F.init, P.pattern=P.pattern, P.init=P.init,
writename="LSAT6__1dim_2pno", noharm.path=noharm.path, dec="," )
summary(mod1a, logfile="LSAT6__1dim_2pno__SUMMARY")
#++ pairwise marginal maximum likelihood estimation using the probit link
mod1b <- sirt::rasch.pml3( dat, est.a=1:I, est.sigma=FALSE)
#************
# Model 2: 1-parameter normal ogive model
#++ NOHARM estimation
# covariance structure
P.pattern <- matrix( 0, ncol=1, nrow=1 )
P.init <- 1+0*P.pattern
# constrain all loadings to be equal
F.pattern <- matrix( 2, nrow=I, ncol=1 )
F.init <- 1+0*F.pattern
# estimate model
mod2a <- sirt::R2noharm( dat=dat, model.type="CFA", F.pattern=F.pattern,
F.init=F.init, P.pattern=P.pattern, P.init=P.init,
writename="LSAT6__1dim_1pno", noharm.path=noharm.path, dec="," )
summary(mod2a, logfile="LSAT6__1dim_1pno__SUMMARY")
# PML estimation (pairwise marginal maximum likelihood)
mod2b <- sirt::rasch.pml3( dat, est.a=rep(1,I), est.sigma=FALSE )
summary(mod2b)
#************
# Model 3: 3-parameter normal ogive model with fixed guessing parameters
#++ NOHARM estimation
# covariance structure
P.pattern <- matrix( 0, ncol=1, nrow=1 )
P.init <- 1+0*P.pattern
# loading matrix: all loadings are estimated
F.pattern <- matrix( 1, nrow=I, ncol=1 )
F.init <- 1+0*F.pattern
# estimate model
mod <- sirt::R2noharm( dat=dat, model.type="CFA", guesses=rep(.2,I),
F.pattern=F.pattern, F.init=F.init, P.pattern=P.pattern,
P.init=P.init, writename="LSAT6__1dim_3pno",
noharm.path=noharm.path, dec="," )
summary(mod, logfile="LSAT6__1dim_3pno__SUMMARY")
#++ logistic link function employed in smirt function
mod1d <- sirt::smirt(dat, Qmatrix=F.pattern, est.a=matrix(1:I,I,1), c.init=rep(.2,I))
summary(mod1d)
#############################################################################
# EXAMPLE 2: ACT15 data | Chapter 12 McDonald (1999)
#############################################################################
data(data.mcdonald.act15)
pm <- data.mcdonald.act15
#************
# Model 1: 2-dimensional exploratory factor analysis
mod1 <- sirt::R2noharm( pm=pm, n=1000, model.type="EFA", dimensions=2,
writename="ACT15__efa_2dim", noharm.path=noharm.path, dec="," )
summary(mod1)
#************
# Model 2: 2-dimensional independent clusters basis solution
P.pattern <- matrix(1,2,2)
diag(P.pattern) <- 0
P.init <- 1+0*P.pattern
F.pattern <- matrix(0,15,2)
F.pattern[ c(1:5,11:15),1] <- 1
F.pattern[ c(6:10,11:15),2] <- 1
F.init <- F.pattern
# estimate model
mod2 <- sirt::R2noharm( pm=pm, n=1000, model.type="CFA", F.pattern=F.pattern,
F.init=F.init, P.pattern=P.pattern,P.init=P.init,
writename="ACT15_indep_clusters", noharm.path=noharm.path, dec="," )
summary(mod2)
#************
# Model 3: Hierarchical model
P.pattern <- matrix(0,3,3)
P.init <- P.pattern
diag(P.init) <- 1
F.pattern <- matrix(0,15,3)
F.pattern[,1] <- 1 # all items load on g factor
F.pattern[ c(1:5,11:15),2] <- 1 # Items 1-5 and 11-15 load on first nested factor
F.pattern[ c(6:10,11:15),3] <- 1 # Items 6-10 and 11-15 load on second nested factor
F.init <- F.pattern
# estimate model
mod3 <- sirt::R2noharm( pm=pm, n=1000, model.type="CFA", F.pattern=F.pattern,
F.init=F.init, P.pattern=P.pattern, P.init=P.init,
writename="ACT15_hierarch_model", noharm.path=noharm.path, dec="," )
summary(mod3)
#############################################################################
# EXAMPLE 3: Rape myth scale | Chapter 15 McDonald (1999)
#############################################################################
data(data.mcdonald.rape)
lambda <- data.mcdonald.rape$lambda
nu <- data.mcdonald.rape$nu
# obtain multiplier for factor loadings (Formula 15.5)
k <- sum( lambda[1,] * lambda[2,] ) / sum( lambda[2,]^2 )
## [1] 1.263243
# additive parameter (Formula 15.7)
c <- sum( lambda[2,]*(nu[1,]-nu[2,]) ) / sum( lambda[2,]^2 )
## [1] 1.247697
# SD in the female group
1/k
## [1] 0.7916132
# M in the female group
- c/k
## [1] -0.9876932
# Burt's coefficient of factorial congruence (Formula 15.10a)
sum( lambda[1,] * lambda[2,] ) / sqrt( sum( lambda[1,]^2 ) * sum( lambda[2,]^2 ) )
## [1] 0.9727831
# congruence for mean parameters
sum( (nu[1,]-nu[2,]) * lambda[2,] ) / sqrt( sum( (nu[1,]-nu[2,])^2 ) * sum( lambda[2,]^2 ) )
## [1] 0.968176
## End(Not run)
Dataset with Mixed Dichotomous and Polytomous Item Responses
Description
Dataset with mixed dichotomous and polytomous item responses.
Usage
data(data.mixed1)
Format
A data frame with 1000 observations on the following 37 variables.
'data.frame': 1000 obs. of 37 variables:
$ I01: num 1 1 1 1 1 1 1 0 1 1 ...
$ I02: num 1 1 1 1 1 1 1 1 0 1 ...
[...]
$ I36: num 1 1 1 1 0 0 0 0 1 1 ...
$ I37: num 0 1 1 1 0 1 0 0 1 1 ...
Examples
data(data.mixed1)
apply( data.mixed1, 2, max )
## I01 I02 I03 I04 I05 I06 I07 I08 I09 I10 I11 I12 I13 I14 I15 I16
## 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
## I17 I18 I19 I20 I21 I22 I23 I24 I25 I26 I27 I28 I29 I30 I31 I32
## 1 1 1 1 4 4 1 1 1 1 1 1 1 1 1 1
## I33 I34 I35 I36 I37
## 1 1 1 1 1
Multilevel Datasets
Description
Datasets for conducting multilevel IRT analyses. These datasets are used in the examples of the function mcmc.2pno.ml.
Usage
data(data.ml1)
data(data.ml2)
Format
- data.ml1: A data frame with 2000 student observations in 100 classes on 17 variables. The first variable group contains the class identifier. The remaining 16 variables are dichotomous test items.
'data.frame': 2000 obs. of 17 variables:
$ group: num 1001 1001 1001 1001 1001 ...
$ X1 : num 1 1 1 1 1 1 1 1 1 1 ...
$ X2 : num 1 1 1 0 1 1 1 1 1 1 ...
$ X3 : num 0 1 1 0 1 0 1 0 1 0 ...
$ X4 : num 1 1 1 0 0 1 1 1 1 1 ...
$ X5 : num 0 0 0 1 1 1 0 0 1 1 ...
[...]
$ X16 : num 0 0 1 0 0 0 1 0 0 0 ...
- data.ml2: A data frame with 2000 student observations in 100 classes on 6 variables. The first variable group contains the class identifier. The remaining 5 variables are polytomous test items.
'data.frame': 2000 obs. of 6 variables:
$ group: num 1 1 1 1 1 1 1 1 1 1 ...
$ X1 : num 2 3 4 3 3 3 1 4 4 3 ...
$ X2 : num 2 2 4 3 3 2 2 3 4 3 ...
$ X3 : num 3 4 5 4 2 3 3 4 4 2 ...
$ X4 : num 2 3 3 2 1 3 1 4 4 3 ...
$ X5 : num 2 3 3 2 3 3 1 3 2 2 ...
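Examples
A minimal sketch (not part of the original manual) of a multilevel two-parameter normal ogive model; the short chain lengths are illustrative assumptions.
## Not run:
data(data.ml1)
dat <- data.ml1
# MCMC estimation; the class identifier is in the first column
mod <- sirt::mcmc.2pno.ml( dat=dat[,-1], group=dat$group, burnin=100, iter=500 )
summary(mod)
## End(Not run)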
Datasets for NOHARM Analysis
Description
Datasets for analyses in NOHARM (see R2noharm).
Usage
data(data.noharmExC)
data(data.noharm18)
Format
- data.noharmExC: The format of this dataset is
'data.frame': 300 obs. of 8 variables:
$ C1: int 1 1 1 1 1 0 1 1 1 1 ...
$ C2: int 1 1 1 1 0 1 1 1 1 1 ...
$ C3: int 1 1 1 1 1 0 0 0 1 1 ...
$ C4: int 0 0 1 1 1 1 1 0 1 0 ...
$ C5: int 1 1 1 1 1 0 0 1 1 0 ...
$ C6: int 1 0 0 0 1 0 1 1 0 1 ...
$ C7: int 1 1 0 0 1 1 0 0 0 1 ...
$ C8: int 1 0 1 0 1 0 1 0 1 1 ...
- data.noharm18: A data frame with 200 observations on the following 18 variables I01, ..., I18. The format is
'data.frame': 200 obs. of 18 variables:
$ I01: int 1 1 1 1 1 0 1 1 0 1 ...
$ I02: int 1 1 0 1 1 0 1 1 1 1 ...
$ I03: int 1 0 0 1 0 0 1 1 0 1 ...
$ I04: int 0 1 0 1 0 0 0 1 1 1 ...
$ I05: int 1 0 0 0 1 0 1 1 0 1 ...
$ I06: int 1 1 0 1 0 0 1 1 0 1 ...
$ I07: int 1 1 1 1 0 1 1 1 1 1 ...
$ I08: int 1 1 1 1 1 1 1 1 0 1 ...
$ I09: int 1 1 1 1 0 0 1 1 0 1 ...
$ I10: int 1 0 0 1 1 0 1 1 0 1 ...
$ I11: int 1 1 1 1 0 0 1 1 0 1 ...
$ I12: int 0 0 0 0 0 1 0 0 0 0 ...
$ I13: int 1 1 1 1 0 1 1 0 1 1 ...
$ I14: int 1 1 1 0 1 0 1 1 0 1 ...
$ I15: int 1 1 1 0 0 1 1 1 0 1 ...
$ I16: int 1 1 0 1 1 0 1 0 1 1 ...
$ I17: int 0 1 0 0 0 0 1 1 0 1 ...
$ I18: int 0 0 0 0 0 0 0 0 1 0 ...
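Examples
A minimal sketch (not part of the original manual) fitting a unidimensional Rasch-type normal ogive model with the R implementation noharm.sirt:
## Not run:
data(data.noharm18)
dat <- data.noharm18
I <- ncol(dat)
# loadings fixed to 1, factor variance estimated
Fpatt <- matrix( 0, I, 1 )
Fval <- 1 + 0*Fpatt
Ppatt <- Pval <- matrix(1,1,1)
mod <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt, Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod)
## End(Not run)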
Item Parameters for Three Studies Obtained by 1PL and 2PL Estimation
Description
The datasets contain item parameters from three studies, prepared for linking with the function linking.haberman.
Usage
data(data.pars1.rasch)
data(data.pars1.2pl)
Format
The format of data.pars1.rasch is:
'data.frame': 22 obs. of 4 variables:
$ study: chr "study1" "study1" "study1" "study1" ...
$ item : Factor w/ 12 levels "M133","M176",..: 1 2 3 4 5 1 6 7 3 8 ...
$ a : num 1 1 1 1 1 1 1 1 1 1 ...
$ b : num -1.5862 0.40762 1.78031 2.00382 0.00862 ...
Item slopes a are fixed to 1 in 1PL estimation. Item difficulties are denoted by b.
The format of data.pars1.2pl is:
'data.frame': 22 obs. of 4 variables:
$ study: chr "study1" "study1" "study1" "study1" ...
$ item : Factor w/ 12 levels "M133","M176",..: 1 2 3 4 5 1 6 7 3 8 ...
$ a : num 1.238 0.957 1.83 1.927 2.298 ...
$ b : num -1.16607 0.35844 1.06571 1.17159 0.00792 ...
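Examples
A minimal sketch (not part of the original manual): the data frames already have the study/item/a/b layout expected by linking.haberman.
## Not run:
data(data.pars1.rasch)
mod <- sirt::linking.haberman( itempars=data.pars1.rasch )
summary(mod)
## End(Not run)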
Dataset from PIRLS Study with Missing Responses
Description
This is a dataset of the PIRLS 2011 study for 4th graders for the reading booklet 13 (the 'PIRLS reader') and 4 countries (Austria, Germany, France, Netherlands). Missing responses (omitted or not reached) are coded as 9.
Usage
data(data.pirlsmissing)
Format
A data frame with 3480 observations on the following 38 variables.
The format is:
'data.frame': 3480 obs. of 38 variables:
$ idstud : int 1000001 1000002 1000003 1000004 1000005 ...
$ country : Factor w/ 4 levels "AUT","DEU","FRA",..: 1 1 1 1 1 1 1 1 1 1 ...
$ studwgt : num 1.06 1.06 1.06 1.06 1.06 ...
$ R31G01M : int 1 1 1 1 1 1 0 1 1 0 ...
$ R31G02C : int 0 9 0 1 0 0 0 0 1 0 ...
$ R31G03M : int 1 1 1 1 0 1 0 0 1 1 ...
[...]
$ R31P15C : int 1 9 0 1 0 0 0 0 1 0 ...
$ R31P16C : int 0 0 0 0 0 0 0 9 0 1 ...
Examples
data(data.pirlsmissing)
# inspect missing rates
round( colMeans( data.pirlsmissing==9 ), 3 )
## idstud country studwgt R31G01M R31G02C R31G03M R31G04C R31G05M
## 0.000 0.000 0.000 0.009 0.076 0.012 0.203 0.018
## R31G06M R31G07M R31G08CZ R31G08CA R31G08CB R31G09M R31G10C R31G11M
## 0.010 0.020 0.189 0.225 0.252 0.019 0.126 0.023
## R31G12C R31G13CZ R31G13CA R31G13CB R31G13CC R31G14M R31P01M R31P02C
## 0.202 0.170 0.198 0.220 0.223 0.074 0.013 0.039
## R31P03C R31P04M R31P05C R31P06C R31P07C R31P08M R31P09C R31P10M
## 0.056 0.012 0.075 0.043 0.074 0.024 0.062 0.025
## R31P11M R31P12M R31P13M R31P14C R31P15C R31P16C
## 0.027 0.030 0.030 0.126 0.130 0.127
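## Not run:
# An additional illustrative snippet (not part of the original manual):
# recode missing responses (9) to NA before scaling
dat <- data.pirlsmissing
items <- grep( "R31", colnames(dat), value=TRUE )
dat1 <- dat[,items]
dat1[ dat1==9 ] <- NA
# Rasch model treating the recoded responses as missing
mod <- sirt::rasch.mml2(dat1)
summary(mod)
## End(Not run)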
Dataset PISA Mathematics
Description
This is an example PISA dataset of mathematics items from the PISA 2009 study of students from Austria. The dataset contains 565 students who worked on the 11 mathematics items from item cluster M3.
Usage
data(data.pisaMath)
Format
The dataset is a list. The list element data contains the dataset with the demographic variables student ID (idstud), school ID (idschool), a dummy variable for female students (female), socioeconomic status (hisei) and migration background (migra). The remaining variables (starting with M in the name) are the mathematics items.
The item metadata are included in the list element item, which contains the item name (item) and the testlet label (testlet). An item not included in a testlet is indicated by NA.
The format is:
List of 2
$ data:'data.frame':
..$ idstud : num [1:565] 9e+10 9e+10 9e+10 9e+10 9e+10 ...
..$ idschool: int [1:565] 900015 900015 900015 900015 ...
..$ female : int [1:565] 0 0 0 0 0 0 0 0 0 0 ...
..$ hisei : num [1:565] -1.16 -1.099 -1.588 -0.365 -1.588 ...
..$ migra : int [1:565] 0 0 0 0 0 0 0 0 0 1 ...
..$ M192Q01 : int [1:565] 1 0 1 1 1 1 1 0 0 0 ...
..$ M406Q01 : int [1:565] 1 1 1 0 1 0 0 0 1 0 ...
..$ M406Q02 : int [1:565] 1 0 0 0 1 0 0 0 1 0 ...
..$ M423Q01 : int [1:565] 0 1 0 1 1 1 1 1 1 0 ...
..$ M496Q01 : int [1:565] 1 0 0 0 0 0 0 0 1 0 ...
..$ M496Q02 : int [1:565] 1 0 0 1 0 1 0 1 1 0 ...
..$ M564Q01 : int [1:565] 1 1 1 1 1 1 0 0 1 0 ...
..$ M564Q02 : int [1:565] 1 0 1 1 1 0 0 0 0 0 ...
..$ M571Q01 : int [1:565] 1 0 0 0 1 0 0 0 0 0 ...
..$ M603Q01 : int [1:565] 1 0 0 0 1 0 0 0 0 0 ...
..$ M603Q02 : int [1:565] 1 0 0 0 1 0 0 0 1 0 ...
$ item:'data.frame':
..$ item : Factor w/ 11 levels "M192Q01","M406Q01",..: 1 2 3 4 ...
..$ testlet: chr [1:11] NA "M406" "M406" NA ...
Item Parameters from Two PISA Studies
Description
This data frame contains item parameters from two PISA studies. Because the Rasch model is used, only item difficulties are considered.
Usage
data(data.pisaPars)
Format
A data frame with 25 observations on the following 4 variables.
item: Item names
testlet: The testlet to which each item belongs
study1: Item difficulties of study 1
study2: Item difficulties of study 2
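Examples
A minimal sketch (not part of the original manual) that stacks the two studies into the long format expected by linking.haberman; slopes are set to 1 because only Rasch difficulties are available.
## Not run:
data(data.pisaPars)
pars <- data.pisaPars
itempars <- data.frame( study=rep( c("study1","study2"), each=nrow(pars) ),
      item=rep( pars$item, 2 ), a=1, b=c( pars$study1, pars$study2 ) )
mod <- sirt::linking.haberman( itempars=itempars )
summary(mod)
## End(Not run)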
Dataset PISA Reading
Description
This is an example PISA dataset of reading items from the PISA 2009 study of students from Austria. The dataset contains 623 students who worked on the 12 reading items from item cluster R7.
Usage
data(data.pisaRead)
Format
The dataset is a list. The list element data contains the dataset with the demographic variables student ID (idstud), school ID (idschool), a dummy variable for female students (female), socioeconomic status (hisei) and migration background (migra). The remaining variables (starting with R in the name) are the reading items.
The item metadata are included in the list element item, which contains the item name (item), testlet label (testlet), item format (ItemFormat), text type (TextType) and text aspect (Aspect).
The format is:
List of 2
$ data:'data.frame':
..$ idstud : num [1:623] 9e+10 9e+10 9e+10 9e+10 9e+10 ...
..$ idschool: int [1:623] 900003 900003 900003 900003 ...
..$ female : int [1:623] 1 0 1 0 0 0 1 0 1 0 ...
..$ hisei : num [1:623] -1.16 -0.671 1.286 0.185 1.225 ...
..$ migra : int [1:623] 0 0 0 0 0 0 0 0 0 0 ...
..$ R432Q01 : int [1:623] 1 1 1 1 1 1 1 1 1 1 ...
..$ R432Q05 : int [1:623] 1 1 1 1 1 0 1 1 1 0 ...
..$ R432Q06 : int [1:623] 0 0 0 0 0 0 0 0 0 0 ...
..$ R456Q01 : int [1:623] 1 1 1 1 1 1 1 1 1 1 ...
..$ R456Q02 : int [1:623] 1 1 1 1 1 1 1 1 1 1 ...
..$ R456Q06 : int [1:623] 1 1 1 1 1 1 0 0 1 1 ...
..$ R460Q01 : int [1:623] 1 1 0 0 0 0 0 1 1 1 ...
..$ R460Q05 : int [1:623] 1 1 1 1 1 1 1 1 1 1 ...
..$ R460Q06 : int [1:623] 0 1 1 1 1 1 0 0 1 1 ...
..$ R466Q02 : int [1:623] 0 1 0 1 1 0 1 0 0 1 ...
..$ R466Q03 : int [1:623] 0 0 0 1 0 0 0 1 0 1 ...
..$ R466Q06 : int [1:623] 0 1 1 1 1 1 0 1 1 1 ...
$ item:'data.frame':
..$ item : Factor w/ 12 levels "R432Q01","R432Q05",..: 1 2 3 4 ...
..$ testlet : Factor w/ 4 levels "R432","R456",..: 1 1 1 2 ...
..$ ItemFormat: Factor w/ 2 levels "CR","MC": 1 1 2 2 1 1 1 2 2 1 ...
..$ TextType : Factor w/ 3 levels "Argumentation",..: 1 1 1 3 ...
..$ Aspect : Factor w/ 3 levels "Access_and_retrieve",..: 2 3 2 1 ...
Datasets for Pairwise Comparisons
Description
Some datasets for pairwise comparisons.
Usage
data(data.pw01)
Format
The dataset data.pw01 contains the results of a German football league from the season 2000/01.
Rating Datasets
Description
Some rating datasets.
Usage
data(data.ratings1)
data(data.ratings2)
data(data.ratings3)
Format
Dataset data.ratings1:
Data frame with 274 observations containing 5 criteria (k1, ..., k5), 135 students and 7 raters.
'data.frame': 274 obs. of 7 variables:
$ idstud: int 100020106 100020106 100070101 100070101 100100109 ...
$ rater : Factor w/ 16 levels "db01","db02",..: 3 15 5 10 2 1 5 4 1 5 ...
$ k1 : int 1 1 0 1 2 0 1 3 0 0 ...
$ k2 : int 1 1 1 1 1 0 0 3 0 0 ...
$ k3 : int 1 1 1 1 2 0 0 3 1 0 ...
$ k4 : int 1 1 1 2 1 0 0 2 0 1 ...
$ k5 : int 2 2 1 2 0 1 0 3 1 0 ...
Data from a 2009 Austrian survey of national educational standards for 8th graders in German language writing. Variables k1 to k5 denote several rating criteria of writing competency.
Dataset data.ratings2:
Data frame with 615 observations containing 5 criteria (k1, ..., k5), 178 students and 16 raters.
'data.frame': 615 obs. of 7 variables:
$ idstud: num 1001 1001 1002 1002 1003 ...
$ rater : chr "R03" "R15" "R05" "R10" ...
$ k1 : int 1 1 0 1 2 0 1 3 3 0 ...
$ k2 : int 1 1 1 1 1 0 0 3 3 0 ...
$ k3 : int 1 1 1 1 2 0 0 3 3 1 ...
$ k4 : int 1 1 1 2 1 0 0 2 2 0 ...
$ k5 : int 2 2 1 2 0 1 0 3 2 1 ...
Dataset data.ratings3:
Data frame with 3169 observations containing 4 criteria (crit2, ..., crit6), 561 students and 52 raters.
'data.frame': 3169 obs. of 6 variables:
$ idstud: num 10001 10001 10002 10002 10003 ...
$ rater : num 840 838 842 808 830 845 813 849 809 802 ...
$ crit2 : int 1 3 3 1 2 2 2 2 3 3 ...
$ crit3 : int 2 2 2 2 2 2 2 2 3 3 ...
$ crit4 : int 1 2 2 2 1 1 1 2 2 2 ...
$ crit6 : num 4 4 4 3 4 4 4 4 4 4 ...
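Examples
A minimal sketch (not part of the original manual) fitting a rater facets model to the first dataset:
## Not run:
data(data.ratings1)
dat <- data.ratings1
# partial credit rater model for criteria k1, ..., k5
mod <- sirt::rm.facets( dat[, paste0("k",1:5) ], rater=dat$rater, pid=dat$idstud )
summary(mod)
## End(Not run)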
Dataset with Raw Item Responses
Description
Dataset with raw item responses.
Usage
data(data.raw1)
Format
A data frame with raw item responses of 1200 persons on the following 77 items:
'data.frame': 1200 obs. of 77 variables:
$ I101: num 0 0 0 2 0 0 0 0 0 0 ...
$ I102: int NA NA 2 1 2 1 3 2 NA NA ...
$ I103: int 1 1 NA NA NA NA NA NA 1 1 ...
...
$ I179: chr "E" "C" "D" "E" ...
Dataset Reading
Description
This dataset contains N=328 students and I=12 items measuring reading competence. All 12 items are arranged into 3 testlets (items with a common text stimulus) labeled A, B and C. The allocation of items to testlets is indicated by the variable names.
Usage
data(data.read)
Format
A data frame with 328 persons on the following 12 variables.
Rows correspond to persons and columns to items. The following items are included in data.read:
Testlet A: A1, A2, A3, A4
Testlet B: B1, B2, B3, B4
Testlet C: C1, C2, C3, C4
Examples
## Not run:
data(data.read)
dat <- data.read
I <- ncol(dat)
# list of needed packages for the following examples
packages <- scan(what="character")
eRm ltm TAM mRm CDM mirt psychotools IsingFit igraph qgraph pcalg
poLCA randomLCA psychomix MplusAutomation lavaan
# load packages. make an installation if necessary
miceadds::library_install(packages)
#*****************************************************
# Model 1: Rasch model
#*****************************************************
#---- M1a: rasch.mml2 (in sirt)
mod1a <- sirt::rasch.mml2(dat)
summary(mod1a)
#---- M1b: smirt (in sirt)
Qmatrix <- matrix(1,nrow=I, ncol=1)
mod1b <- sirt::smirt(dat,Qmatrix=Qmatrix)
summary(mod1b)
#---- M1c: gdm (in CDM)
theta.k <- seq(-6,6,len=21)
mod1c <- CDM::gdm(dat,theta.k=theta.k,irtmodel="1PL", skillspace="normal")
summary(mod1c)
#---- M1d: tam.mml (in TAM)
mod1d <- TAM::tam.mml( resp=dat )
summary(mod1d)
#---- M1e: RM (in eRm)
mod1e <- eRm::RM( dat )
# eRm uses Conditional Maximum Likelihood (CML) as the estimation method.
summary(mod1e)
eRm::plotPImap(mod1e)
#---- M1f: mrm (in mRm)
mod1f <- mRm::mrm( dat, cl=1) # CML estimation
mod1f$beta # item parameters
#---- M1g: mirt (in mirt)
mod1g <- mirt::mirt( dat, model=1, itemtype="Rasch", verbose=TRUE )
print(mod1g)
summary(mod1g)
coef(mod1g)
# arrange coefficients in nicer layout
sirt::mirt.wrapper.coef(mod1g)$coef
#---- M1h: rasch (in ltm)
mod1h <- ltm::rasch( dat, control=list(verbose=TRUE ) )
summary(mod1h)
coef(mod1h)
#---- M1i: RaschModel.fit (in psychotools)
mod1i <- psychotools::RaschModel.fit(dat) # CML estimation
summary(mod1i)
plot(mod1i)
#---- M1j: noharm.sirt (in sirt)
Fpatt <- matrix( 0, I, 1 )
Fval <- 1 + 0*Fpatt
Ppatt <- Pval <- matrix(1,1,1)
mod1j <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt, Fpatt=Fpatt, Fval=Fval, Pval=Pval)
summary(mod1j)
# Normal-ogive model: to compare with the logistic metric, multiply item
# discriminations by the constant D=1.7. The same holds for other examples
# with noharm.sirt and R2noharm.
plot(mod1j)
#---- M1k: rasch.pml3 (in sirt)
mod1k <- sirt::rasch.pml3( dat=dat)
# pairwise marginal maximum likelihood estimation
summary(mod1k)
#---- M1l: running Mplus (using MplusAutomation package)
mplus_path <- "c:/Mplus7/Mplus.exe" # locate Mplus executable
#****************
# specify Mplus object
mplusmod <- MplusAutomation::mplusObject(
TITLE="1PL in Mplus ;",
VARIABLE=paste0( "CATEGORICAL ARE ", paste0(colnames(dat),collapse=" ") ),
MODEL="
! fix all item loadings to 1
F1 BY A1@1 A2@1 A3@1 A4@1 ;
F1 BY B1@1 B2@1 B3@1 B4@1 ;
F1 BY C1@1 C2@1 C3@1 C4@1 ;
! estimate variance
F1 ;
",
ANALYSIS="ESTIMATOR=MLR;",
OUTPUT="stand;",
usevariables=colnames(dat), rdata=dat )
#****************
# write Mplus syntax
filename <- "mod1u" # specify file name
# create Mplus syntaxes
res2 <- MplusAutomation::mplusModeler(object=mplusmod, dataout=paste0(filename,".dat"),
modelout=paste0(filename,".inp"), run=0 )
# run Mplus model
MplusAutomation::runModels( filefilter=paste0(filename,".inp"), Mplus_command=mplus_path)
# alternatively, the system() command can also be used
# get results
mod1l <- MplusAutomation::readModels(target=getwd(), filefilter=filename )
mod1l$summaries # summaries
mod1l$parameters$unstandardized # parameter estimates
#*****************************************************
# Model 2: 2PL model
#*****************************************************
#---- M2a: rasch.mml2 (in sirt)
mod2a <- sirt::rasch.mml2(dat, est.a=1:I)
summary(mod2a)
#---- M2b: smirt (in sirt)
mod2b <- sirt::smirt(dat,Qmatrix=Qmatrix,est.a="2PL")
summary(mod2b)
#---- M2c: gdm (in CDM)
mod2c <- CDM::gdm(dat,theta.k=theta.k,irtmodel="2PL", skillspace="normal")
summary(mod2c)
#---- M2d: tam.mml (in TAM)
mod2d <- TAM::tam.mml.2pl( resp=dat )
summary(mod2d)
#---- M2e: mirt (in mirt)
mod2e <- mirt::mirt( dat, model=1, itemtype="2PL" )
print(mod2e)
summary(mod2e)
sirt::mirt.wrapper.coef(mod2e)$coef
#---- M2f: ltm (in ltm)
mod2f <- ltm::ltm( dat ~ z1, control=list(verbose=TRUE ) )
summary(mod2f)
coef(mod2f)
plot(mod2f)
#---- M2g: R2noharm (in NOHARM, running from within R using sirt package)
# define noharm.path where 'NoharmCL.exe' is located
noharm.path <- "c:/NOHARM"
# covariance matrix
P.pattern <- matrix( 1, ncol=1, nrow=1 )
P.init <- P.pattern
P.init[1,1] <- 1
# loading matrix
F.pattern <- matrix(1,I,1)
F.init <- F.pattern
# estimate model
mod2g <- sirt::R2noharm( dat=dat, model.type="CFA", F.pattern=F.pattern,
F.init=F.init, P.pattern=P.pattern, P.init=P.init,
writename="ex2g", noharm.path=noharm.path, dec="," )
summary(mod2g)
#---- M2h: noharm.sirt (in sirt)
mod2h <- sirt::noharm.sirt( dat=dat, Ppatt=P.pattern,Fpatt=F.pattern,
Fval=F.init, Pval=P.init )
summary(mod2h)
plot(mod2h)
#---- M2i: rasch.pml2 (in sirt)
mod2i <- sirt::rasch.pml2(dat, est.a=1:I)
summary(mod2i)
#---- M2j: WLSMV estimation with cfa (in lavaan)
lavmodel <- "F=~ A1+A2+A3+A4+B1+B2+B3+B4+
C1+C2+C3+C4"
mod2j <- lavaan::cfa( data=dat, model=lavmodel, std.lv=TRUE, ordered=colnames(dat))
summary(mod2j, standardized=TRUE, fit.measures=TRUE, rsquare=TRUE)
#*****************************************************
# Model 3: 3PL model (note that results can be quite unstable!)
#*****************************************************
#---- M3a: rasch.mml2 (in sirt)
mod3a <- sirt::rasch.mml2(dat, est.a=1:I, est.c=1:I)
summary(mod3a)
#---- M3b: smirt (in sirt)
mod3b <- sirt::smirt(dat,Qmatrix=Qmatrix,est.a="2PL", est.c=1:I)
summary(mod3b)
#---- M3c: mirt (in mirt)
mod3c <- mirt::mirt( dat, model=1, itemtype="3PL", verbose=TRUE)
summary(mod3c)
coef(mod3c)
# stabilize parameter estimation by using informative priors for guessing parameters
mirtmodel <- mirt::mirt.model("
F=1-12
PRIOR=(1-12, g, norm, -1.38, 0.25)
")
# a prior N(-1.38,.25) is specified for transformed guessing parameters: qlogis(g)
# simulate values from this prior for illustration
N <- 100000
logit.g <- stats::rnorm(N, mean=-1.38, sd=sqrt(.25) )  # sd chosen to match the N(-1.38, .25) prior above
graphics::plot( stats::density(logit.g) ) # transformed qlogis(g)
graphics::plot( stats::density( stats::plogis(logit.g)) ) # g parameters
# estimate 3PL with priors
mod3c1 <- mirt::mirt(dat, mirtmodel, itemtype="3PL",verbose=TRUE)
coef(mod3c1)
# In addition, set upper bounds for g parameters of .35
mirt.pars <- mirt::mirt( dat, mirtmodel, itemtype="3PL", pars="values")
ind <- which( mirt.pars$name=="g" )
mirt.pars[ ind, "value" ] <- stats::plogis(-1.38)
mirt.pars[ ind, "ubound" ] <- .35
# prior distribution for slopes
ind <- which( mirt.pars$name=="a1" )
mirt.pars[ ind, "prior_1" ] <- 1.3
mirt.pars[ ind, "prior_2" ] <- 2
mod3c2 <- mirt::mirt(dat, mirtmodel, itemtype="3PL",
pars=mirt.pars,verbose=TRUE, technical=list(NCYCLES=100) )
coef(mod3c2)
sirt::mirt.wrapper.coef(mod3c2)
#---- M3d: ltm (in ltm)
mod3d <- ltm::tpm( dat, control=list(verbose=TRUE), max.guessing=.3)
summary(mod3d)
coef(mod3d) #=> numerical instabilities
#*****************************************************
# Model 4: 3-dimensional Rasch model
#*****************************************************
# define Q-matrix
Q <- matrix( 0, nrow=12, ncol=3 )
Q[ cbind(1:12, rep(1:3,each=4) ) ] <- 1
rownames(Q) <- colnames(dat)
colnames(Q) <- c("A","B","C")
# define nodes
theta.k <- seq(-6,6,len=13)
#---- M4a: smirt (in sirt)
mod4a <- sirt::smirt(dat,Qmatrix=Q,irtmodel="comp", theta.k=theta.k, maxiter=30)
summary(mod4a)
#---- M4b: rasch.mml2 (in sirt)
mod4b <- sirt::rasch.mml2(dat,Q=Q,theta.k=theta.k, mmliter=30)
summary(mod4b)
#---- M4c: gdm (in CDM)
mod4c <- CDM::gdm( dat, irtmodel="1PL", theta.k=theta.k, skillspace="normal",
Qmatrix=Q, maxiter=30, centered.latent=TRUE )
summary(mod4c)
#---- M4d: tam.mml (in TAM)
mod4d <- TAM::tam.mml( resp=dat, Q=Q, control=list(nodes=theta.k, maxiter=30) )
summary(mod4d)
#---- M4e: R2noharm (in NOHARM, running from within R using sirt package)
noharm.path <- "c:/NOHARM"
# covariance matrix
P.pattern <- matrix( 1, ncol=3, nrow=3 )
P.init <- 0.8+0*P.pattern
diag(P.init) <- 1
# loading matrix
F.pattern <- 0*Q
F.init <- Q
# estimate model
mod4e <- sirt::R2noharm( dat=dat, model.type="CFA", F.pattern=F.pattern,
F.init=F.init, P.pattern=P.pattern, P.init=P.init,
writename="ex4e", noharm.path=noharm.path, dec="," )
summary(mod4e)
#---- M4f: mirt (in mirt)
cmodel <- mirt::mirt.model("
F1=1-4
F2=5-8
F3=9-12
# equal item slopes correspond to the Rasch model
CONSTRAIN=(1-4, a1), (5-8, a2), (9-12,a3)
COV=F1*F2, F1*F3, F2*F3
" )
mod4f <- mirt::mirt(dat, cmodel, verbose=TRUE)
summary(mod4f)
#*****************************************************
# Model 5: 3-dimensional 2PL model
#*****************************************************
#---- M5a: smirt (in sirt)
mod5a <- sirt::smirt(dat,Qmatrix=Q,irtmodel="comp", est.a="2PL", theta.k=theta.k,
maxiter=30)
summary(mod5a)
#---- M5b: rasch.mml2 (in sirt)
mod5b <- sirt::rasch.mml2(dat,Q=Q,theta.k=theta.k,est.a=1:12, mmliter=30)
summary(mod5b)
#---- M5c: gdm (in CDM)
mod5c <- CDM::gdm( dat, irtmodel="2PL", theta.k=theta.k, skillspace="loglinear",
Qmatrix=Q, maxiter=30, centered.latent=TRUE,
standardized.latent=TRUE)
summary(mod5c)
#---- M5d: tam.mml (in TAM)
mod5d <- TAM::tam.mml.2pl( resp=dat, Q=Q, control=list(nodes=theta.k, maxiter=30) )
summary(mod5d)
#---- M5e: R2noharm (in NOHARM, running from within R using sirt package)
noharm.path <- "c:/NOHARM"
# covariance matrix
P.pattern <- matrix( 1, ncol=3, nrow=3 )
diag(P.pattern) <- 0
P.init <- 0.8+0*P.pattern
diag(P.init) <- 1
# loading matrix
F.pattern <- Q
F.init <- Q
# estimate model
mod5e <- sirt::R2noharm( dat=dat, model.type="CFA", F.pattern=F.pattern,
F.init=F.init, P.pattern=P.pattern, P.init=P.init,
writename="ex5e", noharm.path=noharm.path, dec="," )
summary(mod5e)
#---- M5f: mirt (in mirt)
cmodel <- mirt::mirt.model("
F1=1-4
F2=5-8
F3=9-12
COV=F1*F2, F1*F3, F2*F3
" )
mod5f <- mirt::mirt(dat, cmodel, verbose=TRUE)
summary(mod5f)
#*****************************************************
# Model 6: Network models (Graphical models)
#*****************************************************
#---- M6a: Ising model using the IsingFit package (undirected graph)
# - fit Ising model using the "OR rule" (AND=FALSE)
mod6a <- IsingFit::IsingFit(x=dat, family="binomial", AND=FALSE)
summary(mod6a)
## Network Density: 0.29
## Gamma: 0.25
## Rule used: Or-rule
# plot results
qgraph::qgraph(mod6a$weiadj,fade=FALSE)
#**-- graph estimation using pcalg package
# some packages from Bioconductor must be installed first (if not yet done);
# the former biocLite mechanism is deprecated, use BiocManager instead
if (FALSE){ # set 'if (TRUE)' if packages should be installed
install.packages("BiocManager")
BiocManager::install( c("RBGL", "Rgraphviz") )
}
#---- M6b: graph estimation based on Pearson correlations
V <- colnames(dat)
n <- nrow(dat)
mod6b <- pcalg::pc(suffStat=list(C=stats::cor(dat), n=n ),
indepTest=pcalg::gaussCItest, ## indep.test: partial correlations
alpha=0.05, labels=V, verbose=TRUE)
plot(mod6b)
# plot in qgraph package
qgraph::qgraph(mod6b, label.color=rep( c( "red", "blue","darkgreen" ), each=4 ),
edge.color="black")
summary(mod6b)
#---- M6c: graph estimation based on tetrachoric correlations
mod6c <- pcalg::pc(suffStat=list(C=sirt::tetrachoric2(dat)$rho, n=n ),
indepTest=pcalg::gaussCItest, alpha=0.05, labels=V, verbose=TRUE)
plot(mod6c)
summary(mod6c)
#---- M6d: Statistical implicative analysis (in sirt)
mod6d <- sirt::sia.sirt(dat, significance=.85 )
# plot results with igraph and qgraph package
plot( mod6d$igraph.obj, vertex.shape="rectangle", vertex.size=30 )
qgraph::qgraph( mod6d$adj.matrix )
#*****************************************************
# Model 7: Latent class analysis with 3 classes
#*****************************************************
#---- M7a: randomLCA (in randomLCA)
# - use two trials of starting values
mod7a <- randomLCA::randomLCA(dat,nclass=3, notrials=2, verbose=TRUE)
summary(mod7a)
plot(mod7a,type="l", xlab="Item")
#---- M7b: rasch.mirtlc (in sirt)
mod7b <- sirt::rasch.mirtlc( dat, Nclasses=3,seed=-30, nstarts=2 )
summary(mod7b)
matplot( t(mod7b$pjk), type="l", xlab="Item" )
#---- M7c: poLCA (in poLCA)
# define formula for outcomes
f7c <- paste0( "cbind(", paste0(colnames(dat),collapse=","), ") ~ 1 " )
dat1 <- as.data.frame( dat + 1 ) # poLCA needs integer values starting at 1 (i.e., 1, 2, ...)
mod7c <- poLCA::poLCA( stats::as.formula(f7c),dat1,nclass=3, verbose=TRUE)
plot(mod7c)
#---- M7d: gom.em (in sirt)
# - the latent class model is a special grade of membership model
mod7d <- sirt::gom.em( dat, K=3, problevels=c(0,1), model="GOM" )
summary(mod7d)
#---- M7e: mirt (in mirt)
# define three latent classes
Theta <- diag(3)
# define mirt model
I <- ncol(dat) # I=12
mirtmodel <- mirt::mirt.model("
C1=1-12
C2=1-12
C3=1-12
")
# get initial parameter values
mod.pars <- mirt::mirt(dat, model=mirtmodel, pars="values")
# modify parameters: only slopes refer to item-class probabilities
set.seed(9976)
# set starting values for class specific item probabilities
mod.pars[ mod.pars$name=="d","value" ] <- 0
mod.pars[ mod.pars$name=="d","est" ] <- FALSE
b1 <- stats::qnorm( colMeans( dat ) )
mod.pars[ mod.pars$name=="a1","value" ] <- b1
# random starting values for other classes
mod.pars[ mod.pars$name %in% c("a2","a3"),"value" ] <- b1 + stats::runif(12*2,-1,1)
mod.pars
#** define prior for latent class analysis
lca_prior <- function(Theta,Etable){
# number of latent Theta classes
TP <- nrow(Theta)
# prior in initial iteration
if ( is.null(Etable) ){
prior <- rep( 1/TP, TP )
}
# process Etable (this is correct for datasets without missing data)
if ( ! is.null(Etable) ){
# sum over correct and incorrect expected responses
prior <- ( rowSums(Etable[, seq(1,2*I,2)]) + rowSums(Etable[,seq(2,2*I,2)]) )/I
}
prior <- prior / sum(prior)
return(prior)
}
#** estimate model
mod7e <- mirt::mirt(dat, mirtmodel, pars=mod.pars, verbose=TRUE,
technical=list( customTheta=Theta, customPriorFun=lca_prior) )
# compare estimated results
print(mod7e)
summary(mod7b)
# The number of estimated parameters is incorrect because mirt does not correctly count
# estimated parameters from the user customized prior distribution.
mod7e@nest <- as.integer(sum(mod.pars$est) + 2) # two additional class probabilities
# extract log-likelihood
mod7e@logLik
# compute AIC and BIC
( AIC <- -2*mod7e@logLik+2*mod7e@nest )
( BIC <- -2*mod7e@logLik+log(mod7e@Data$N)*mod7e@nest )
# RMSEA and SRMSR fit statistic
mirt::M2(mod7e) # TLI and CFI do not make sense in this example
#** extract item parameters
coef1 <- sirt::mirt.wrapper.coef(mod7e)$coef
#** extract class-specific item-probabilities
probs <- apply( coef1[, c("a1","a2","a3") ], 2, stats::plogis )
matplot( probs, type="l", xlab="Item", main="mirt::mirt")
#** inspect estimated distribution
mod7e@Theta
mod7e@Prior[[1]]
#*****************************************************
# Model 8: Mixed Rasch model with two classes
#*****************************************************
#---- M8a: raschmix (in psychomix)
mod8a <- psychomix::raschmix(data=as.matrix(dat), k=2, scores="saturated")
summary(mod8a)
#---- M8b: mrm (in mRm)
mod8b <- mRm::mrm(data.matrix=dat, cl=2)
mod8b$conv.to.bound
plot(mod8b)
print(mod8b)
#---- M8c: mirt (in mirt)
#* define theta grid
theta.k <- seq( -5, 5, len=9 )
TP <- length(theta.k)
Theta <- matrix( 0, nrow=2*TP, ncol=4)
Theta[1:TP,1:2] <- cbind(theta.k, 1 )
Theta[1:TP + TP,3:4] <- cbind(theta.k, 1 )
Theta
# define model
I <- ncol(dat) # I=12
mirtmodel <- mirt::mirt.model("
F1a=1-12 # slope Class 1
F1b=1-12 # difficulty Class 1
F2a=1-12 # slope Class 2
F2b=1-12 # difficulty Class 2
CONSTRAIN=(1-12,a1),(1-12,a3)
")
# get initial parameter values
mod.pars <- mirt::mirt(dat, model=mirtmodel, pars="values")
# set starting values for class specific item probabilities
mod.pars[ mod.pars$name=="d","value" ] <- 0
mod.pars[ mod.pars$name=="d","est" ] <- FALSE
mod.pars[ mod.pars$name=="a1","value" ] <- 1
mod.pars[ mod.pars$name=="a3","value" ] <- 1
# initial values difficulties
b1 <- stats::qlogis( colMeans(dat) )
mod.pars[ mod.pars$name=="a2","value" ] <- b1
mod.pars[ mod.pars$name=="a4","value" ] <- b1 + stats::runif(I, -1, 1)
#* define prior for mixed Rasch analysis
mixed_prior <- function(Theta,Etable){
NC <- 2 # number of theta classes
TP <- nrow(Theta) / NC
prior1 <- stats::dnorm( Theta[1:TP,1] )
prior1 <- prior1 / sum(prior1)
if ( is.null(Etable) ){ prior <- c( prior1, prior1 ) }
if ( ! is.null(Etable) ){
prior <- ( rowSums( Etable[, seq(1,2*I,2)] ) +
rowSums( Etable[,seq(2,2*I,2)]) )/I
a1 <- stats::aggregate( prior, list( rep(1:NC, each=TP) ), sum )
a1[,2] <- a1[,2] / sum( a1[,2])
# print some information during estimation
cat( paste0( " Class proportions: ",
paste0( round(a1[,2], 3 ), collapse=" " ) ), "\n")
a1 <- rep( a1[,2], each=TP )
# specify mixture of two normal distributions
prior <- a1*c(prior1,prior1)
}
prior <- prior / sum(prior)
return(prior)
}
#* estimate model
mod8c <- mirt::mirt(dat, mirtmodel, pars=mod.pars, verbose=TRUE,
technical=list( customTheta=Theta, customPriorFun=mixed_prior ) )
# As in Model 7e, the number of estimated parameters must be corrected.
mod8c@nest <- as.integer(sum(mod.pars$est) + 1)
# There are two class proportions; hence, one probability is freely estimated.
#* extract item parameters
sirt::mirt.wrapper.coef(mod8c)
#* estimated distribution
mod8c@Theta
mod8c@Prior
#---- M8d: tamaan (in TAM)
tammodel <- "
ANALYSIS:
TYPE=MIXTURE ;
NCLASSES(2);
NSTARTS(7,20);
LAVAAN MODEL:
F=~ A1__C4
F ~~ F
ITEM TYPE:
ALL(Rasch);
"
mod8d <- TAM::tamaan( tammodel, resp=dat )
summary(mod8d)
# plot item parameters
I <- 12
ipars <- mod8d$itempartable_MIXTURE[ 1:I, ]
plot( 1:I, ipars[,3], type="o", ylim=range( ipars[,3:4] ), pch=16,
xlab="Item", ylab="Item difficulty")
lines( 1:I, ipars[,4], type="l", col=2, lty=2)
points( 1:I, ipars[,4], col=2, pch=2)
#*****************************************************
# Model 9: Mixed 2PL model with two classes
#*****************************************************
#---- M9a: tamaan (in TAM)
tammodel <- "
ANALYSIS:
TYPE=MIXTURE ;
NCLASSES(2);
NSTARTS(10,30);
LAVAAN MODEL:
F=~ A1__C4
F ~~ F
ITEM TYPE:
ALL(2PL);
"
mod9a <- TAM::tamaan( tammodel, resp=dat )
summary(mod9a)
#*****************************************************
# Model 10: Rasch testlet model
#*****************************************************
#---- M10a: tam.fa (in TAM)
dims <- substring( colnames(dat),1,1 ) # define dimensions
mod10a <- TAM::tam.fa( resp=dat, irtmodel="bifactor1", dims=dims,
control=list(maxiter=60) )
summary(mod10a)
#---- M10b: mirt (in mirt)
cmodel <- mirt::mirt.model("
G=1-12
A=1-4
B=5-8
C=9-12
CONSTRAIN=(1-12,a1), (1-4, a2), (5-8, a3), (9-12,a4)
")
mod10b <- mirt::mirt(dat, model=cmodel, verbose=TRUE)
summary(mod10b)
coef(mod10b)
mod10b@logLik # equivalent is slot( mod10b, "logLik")
#alternatively, using a dimensional reduction approach (faster and better accuracy)
cmodel <- mirt::mirt.model("
G=1-12
CONSTRAIN=(1-12,a1), (1-4, a2), (5-8, a3), (9-12,a4)
")
item_bundles <- rep(c(1,2,3), each=4)
mod10b1 <- mirt::bfactor(dat, model=item_bundles, model2=cmodel, verbose=TRUE)
coef(mod10b1)
#---- M10c: smirt (in sirt)
# define Q-matrix
Qmatrix <- matrix(0,12,4)
Qmatrix[,1] <- 1
Qmatrix[ cbind( 1:12, match( dims, unique(dims)) +1 ) ] <- 1
# uncorrelated factors
variance.fixed <- cbind( c(1,1,1,2,2,3), c(2,3,4,3,4,4), 0 )
# estimate model
mod10c <- sirt::smirt( dat, Qmatrix=Qmatrix, irtmodel="comp",
variance.fixed=variance.fixed, qmcnodes=1000, maxiter=60)
summary(mod10c)
#*****************************************************
# Model 11: Bifactor model
#*****************************************************
#---- M11a: tam.fa (in TAM)
dims <- substring( colnames(dat),1,1 ) # define dimensions
mod11a <- TAM::tam.fa( resp=dat, irtmodel="bifactor2", dims=dims,
control=list(maxiter=60) )
summary(mod11a)
#---- M11b: bfactor (in mirt)
dims1 <- match( dims, unique(dims) )
mod11b <- mirt::bfactor(dat, model=dims1, verbose=TRUE)
summary(mod11b)
coef(mod11b)
mod11b@logLik
#---- M11c: smirt (in sirt)
# define Q-matrix
Qmatrix <- matrix(0,12,4)
Qmatrix[,1] <- 1
Qmatrix[ cbind( 1:12, match( dims, unique(dims)) +1 ) ] <- 1
# uncorrelated factors
variance.fixed <- cbind( c(1,1,1,2,2,3), c(2,3,4,3,4,4), 0 )
# estimate model
mod11c <- sirt::smirt( dat, Qmatrix=Qmatrix, irtmodel="comp", est.a="2PL",
variance.fixed=variance.fixed, qmcnodes=1000, maxiter=60)
summary(mod11c)
#*****************************************************
# Model 12: Located latent class model: Rasch model with three theta classes
#*****************************************************
# use 10th item as the reference item
ref.item <- 10
# ability grid
theta.k <- seq(-4,4,len=9)
#---- M12a: rasch.mirtlc (in sirt)
mod12a <- sirt::rasch.mirtlc(dat, Nclasses=3, modeltype="MLC1", ref.item=ref.item)
summary(mod12a)
#---- M12b: gdm (in CDM)
theta.k <- seq(-1, 1, len=3) # initial matrix
b.constraint <- matrix( c(10,1,0), nrow=1,ncol=3)
# estimate model
mod12b <- CDM::gdm( dat, theta.k=theta.k, skillspace="est", irtmodel="1PL",
b.constraint=b.constraint, maxiter=200)
summary(mod12b)
#---- M12c: mirt (in mirt)
items <- colnames(dat)
# define three latent classes
Theta <- diag(3)
# define mirt model
I <- ncol(dat) # I=12
mirtmodel <- mirt::mirt.model("
C1=1-12
C2=1-12
C3=1-12
CONSTRAIN=(1-12,a1),(1-12,a2),(1-12,a3)
")
# get parameters
mod.pars <- mirt::mirt(dat, model=mirtmodel, pars="values")
# set starting values for class specific item probabilities
mod.pars[ mod.pars$name=="d","value" ] <- stats::qlogis( colMeans(dat,na.rm=TRUE) )
# set item difficulty of reference item to zero
ind <- which( ( paste(mod.pars$item)==items[ref.item] ) &
( ( paste(mod.pars$name)=="d" ) ) )
mod.pars[ ind,"value" ] <- 0
mod.pars[ ind,"est" ] <- FALSE
# initial values for a1, a2 and a3
mod.pars[ mod.pars$name %in% c("a1","a2","a3"),"value" ] <- c(-1,0,1)
mod.pars
#* define prior for latent class analysis
lca_prior <- function(Theta,Etable){
# number of latent Theta classes
TP <- nrow(Theta)
# prior in initial iteration
if ( is.null(Etable) ){
prior <- rep( 1/TP, TP )
}
# process Etable (this is correct for datasets without missing data)
if ( ! is.null(Etable) ){
# sum over correct and incorrect expected responses
prior <- ( rowSums( Etable[, seq(1,2*I,2)] ) + rowSums( Etable[, seq(2,2*I,2)] ) )/I
}
prior <- prior / sum(prior)
return(prior)
}
#* estimate model
mod12c <- mirt::mirt(dat, mirtmodel, technical=list(
customTheta=Theta, customPriorFun=lca_prior),
pars=mod.pars, verbose=TRUE )
# The number of estimated parameters is incorrect because mirt does not correctly count
# estimated parameters from the user customized prior distribution.
mod12c@nest <- as.integer(sum(mod.pars$est) + 2)
#* extract item parameters
coef1 <- sirt::mirt.wrapper.coef(mod12c)
#* inspect estimated distribution
mod12c@Theta
coef1$coef[1,c("a1","a2","a3")]
mod12c@Prior[[1]]
#*****************************************************
# Model 13: Multidimensional model with discrete traits
#*****************************************************
# define Q-Matrix
Q <- matrix( 0, nrow=12,ncol=3)
Q[1:4,1] <- 1
Q[5:8,2] <- 1
Q[9:12,3] <- 1
# define discrete theta distribution with 3 dimensions
Theta <- scan(what="character",nlines=1)
000 100 010 001 110 101 011 111
Theta <- as.numeric( unlist( lapply( Theta, strsplit, split="") ) )
Theta <- matrix(Theta, 8, 3, byrow=TRUE )
Theta
#---- Model 13a: din (in CDM)
mod13a <- CDM::din( dat, q.matrix=Q, rule="DINA")
summary(mod13a)
# compare used Theta distributions
cbind( Theta, mod13a$attribute.patt.splitted)
#---- Model 13b: gdm (in CDM)
mod13b <- CDM::gdm( dat, Qmatrix=Q, theta.k=Theta, skillspace="full")
summary(mod13b)
#---- Model 13c: mirt (in mirt)
# define mirt model
I <- ncol(dat) # I=12
mirtmodel <- mirt::mirt.model("
F1=1-4
F2=5-8
F3=9-12
")
# get parameters
mod.pars <- mirt::mirt(dat, model=mirtmodel, pars="values")
# starting values d parameters (transformed guessing parameters)
ind <- which( mod.pars$name=="d" )
mod.pars[ind,"value"] <- stats::qlogis(.2)
# starting values transformed slipping parameters
ind <- which( ( mod.pars$name %in% paste0("a",1:3) ) & ( mod.pars$est ) )
mod.pars[ind,"value"] <- stats::qlogis(.8) - stats::qlogis(.2)
mod.pars
#* define prior for latent class analysis
lca_prior <- function(Theta,Etable){
TP <- nrow(Theta)
if ( is.null(Etable) ){
prior <- rep( 1/TP, TP )
}
if ( ! is.null(Etable) ){
prior <- ( rowSums( Etable[, seq(1,2*I,2)] ) + rowSums( Etable[, seq(2,2*I,2)] ) )/I
}
prior <- prior / sum(prior)
return(prior)
}
#* estimate model
mod13c <- mirt::mirt(dat, mirtmodel, technical=list(
customTheta=Theta, customPriorFun=lca_prior),
pars=mod.pars, verbose=TRUE )
# The number of estimated parameters is incorrect because mirt does not correctly count
# estimated parameters from the user customized prior distribution.
mod13c@nest <- as.integer(sum(mod.pars$est) + 2)
#* extract item parameters
coef13c <- sirt::mirt.wrapper.coef(mod13c)$coef
#* inspect estimated distribution
mod13c@Theta
mod13c@Prior[[1]]
#-* comparisons of estimated parameters
# extract guessing and slipping parameters from din
dfr <- coef(mod13a)[, c("guess","slip") ]
colnames(dfr) <- paste0("din.",c("guess","slip") )
# estimated parameters from gdm
dfr$gdm.guess <- stats::plogis(mod13b$item$b)
dfr$gdm.slip <- 1 - stats::plogis( rowSums(mod13b$item[,c("b.Cat1","a.F1","a.F2","a.F3")] ) )
# estimated parameters from mirt
dfr$mirt.guess <- stats::plogis( coef13c$d )
dfr$mirt.slip <- 1 - stats::plogis( rowSums(coef13c[,c("d","a1","a2","a3")]) )
# comparison
round(dfr[, c(1,3,5,2,4,6)],3)
## din.guess gdm.guess mirt.guess din.slip gdm.slip mirt.slip
## A1 0.691 0.684 0.686 0.000 0.000 0.000
## A2 0.491 0.489 0.489 0.031 0.038 0.036
## A3 0.302 0.300 0.300 0.184 0.193 0.190
## A4 0.244 0.239 0.240 0.337 0.340 0.339
## B1 0.568 0.579 0.577 0.163 0.148 0.151
## B2 0.329 0.344 0.340 0.344 0.326 0.329
## B3 0.817 0.827 0.825 0.014 0.007 0.009
## B4 0.431 0.463 0.456 0.104 0.089 0.092
## C1 0.188 0.191 0.189 0.013 0.013 0.013
## C2 0.050 0.050 0.050 0.239 0.238 0.239
## C3 0.000 0.002 0.001 0.065 0.065 0.065
## C4 0.000 0.004 0.000 0.212 0.212 0.212
# estimated class sizes
dfr <- data.frame( "Theta"=Theta, "din"=mod13a$attribute.patt$class.prob,
"gdm"=mod13b$pi.k, "mirt"=mod13c@Prior[[1]])
# comparison
round(dfr,3)
## Theta.1 Theta.2 Theta.3 din gdm mirt
## 1 0 0 0 0.039 0.041 0.040
## 2 1 0 0 0.008 0.009 0.009
## 3 0 1 0 0.009 0.007 0.008
## 4 0 0 1 0.394 0.417 0.412
## 5 1 1 0 0.011 0.011 0.011
## 6 1 0 1 0.017 0.042 0.037
## 7 0 1 1 0.042 0.008 0.016
## 8 1 1 1 0.480 0.465 0.467
#*****************************************************
# Model 14: DINA model with two skills
#*****************************************************
# define some simple Q-matrix (it does not really make sense in this application)
Q <- matrix( 0, nrow=12,ncol=2)
Q[1:4,1] <- 1
Q[5:8,2] <- 1
Q[9:12,1:2] <- 1
# define discrete theta distribution with 2 dimensions
Theta <- scan(what="character",nlines=1)
00 10 01 11
Theta <- as.numeric( unlist( lapply( Theta, strsplit, split="") ) )
Theta <- matrix(Theta, 4, 2, byrow=TRUE )
Theta
#---- Model 14a: din (in CDM)
mod14a <- CDM::din( dat, q.matrix=Q, rule="DINA")
summary(mod14a)
# compare used Theta distributions
cbind( Theta, mod14a$attribute.patt.splitted)
#---- Model 14b: mirt (in mirt)
# define mirt model
I <- ncol(dat) # I=12
mirtmodel <- mirt::mirt.model("
F1=1-4
F2=5-8
(F1*F2)=9-12
")
#-> constructions like (F1*F2*F3) are also allowed in mirt.model
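# For instance, a hypothetical specification with a three-way interaction
# term (not used in this example) could read:
#   mirt::mirt.model("
#      F1=1-4
#      F2=5-8
#      F3=9-12
#      (F1*F2*F3)=1-12
#   ")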
# get parameters
mod.pars <- mirt::mirt(dat, model=mirtmodel, pars="values")
# starting values d parameters (transformed guessing parameters)
ind <- which( mod.pars$name=="d" )
mod.pars[ind,"value"] <- stats::qlogis(.2)
# starting values transformed slipping parameters
ind <- which( ( mod.pars$name %in% paste0("a",1:3) ) & ( mod.pars$est ) )
mod.pars[ind,"value"] <- stats::qlogis(.8) - stats::qlogis(.2)
mod.pars
#* use above defined prior lca_prior
# lca_prior <- function(prior,Etable) ...
#* estimate model
mod14b <- mirt::mirt(dat, mirtmodel, technical=list(
customTheta=Theta, customPriorFun=lca_prior),
pars=mod.pars, verbose=TRUE )
# The number of estimated parameters is incorrect because mirt does not correctly count
# estimated parameters from the user customized prior distribution.
mod14b@nest <- as.integer(sum(mod.pars$est) + 2)
#* extract item parameters
coef14b <- sirt::mirt.wrapper.coef(mod14b)$coef
#-* comparisons of estimated parameters
# extract guessing and slipping parameters from din
dfr <- coef(mod14a)[, c("guess","slip") ]
colnames(dfr) <- paste0("din.",c("guess","slip") )
# estimated parameters from mirt
dfr$mirt.guess <- stats::plogis( coef14b$d )
dfr$mirt.slip <- 1 - stats::plogis( rowSums(coef14b[,c("d","a1","a2","a3")]) )
# comparison
round(dfr[, c(1,3,2,4)],3)
## din.guess mirt.guess din.slip mirt.slip
## A1 0.674 0.671 0.030 0.030
## A2 0.423 0.420 0.049 0.050
## A3 0.258 0.255 0.224 0.225
## A4 0.245 0.243 0.394 0.395
## B1 0.534 0.543 0.166 0.164
## B2 0.338 0.347 0.382 0.380
## B3 0.796 0.802 0.016 0.015
## B4 0.421 0.436 0.142 0.140
## C1 0.850 0.851 0.000 0.000
## C2 0.480 0.480 0.097 0.097
## C3 0.746 0.746 0.026 0.026
## C4 0.575 0.577 0.136 0.137
# estimated class sizes
dfr <- data.frame( "Theta"=Theta, "din"=mod13a$attribute.patt$class.prob,
"mirt"=mod14b@Prior[[1]])
# comparison
round(dfr,3)
## Theta.1 Theta.2 din mirt
## 1 0 0 0.357 0.369
## 2 1 0 0.044 0.049
## 3 0 1 0.047 0.031
## 4 1 1 0.553 0.551
#*****************************************************
# Model 15: Rasch model with non-normal distribution
#*****************************************************
# A non-normal theta distribution is specified by log-linear smoothing of
# the distribution as described in
# Xu, X., & von Davier, M. (2008). Fitting the structured general diagnostic model
# to NAEP data. ETS Research Report RR-08-27. Princeton, NJ: ETS.
# define theta grid
theta.k <- matrix( seq(-4,4,len=15), ncol=1 )
# define design matrix for smoothing (up to cubic moments)
delta.designmatrix <- cbind( 1, theta.k, theta.k^2, theta.k^3 )
# constrain item difficulty of fifth item (item B1) to zero
b.constraint <- matrix( c(5,1,0), ncol=3 )
#---- Model 15a: gdm (in CDM)
mod15a <- CDM::gdm( dat, irtmodel="1PL", theta.k=theta.k,
b.constraint=b.constraint )
summary(mod15a)
# plot estimated distribution
graphics::barplot( mod15a$pi.k[,1], space=0, names.arg=round(theta.k[,1],2),
main="Estimated Skewed Distribution (gdm function)")
#---- Model 15b: mirt (in mirt)
# define mirt model
mirtmodel <- mirt::mirt.model("
F=1-12
")
# get parameters
mod.pars <- mirt::mirt(dat, model=mirtmodel, pars="values", itemtype="Rasch")
# fix variance (just for correct counting of parameters)
mod.pars[ mod.pars$name=="COV_11", "est"] <- FALSE
# fix item difficulty
ind <- which( ( mod.pars$item=="B1" ) & ( mod.pars$name=="d" ) )
mod.pars[ ind, "value"] <- 0
mod.pars[ ind, "est"] <- FALSE
# define prior
loglinear_prior <- function(Theta,Etable){
TP <- nrow(Theta)
if ( is.null(Etable) ){
prior <- rep( 1/TP, TP )
}
# process Etable (this is correct for datasets without missing data)
if ( ! is.null(Etable) ){
# sum over correct and incorrect expected responses
prior <- ( rowSums( Etable[, seq(1,2*I,2)] ) + rowSums( Etable[, seq(2,2*I,2)] ) )/I
# smooth prior using the above design matrix and a log-linear model
# see Xu & von Davier (2008).
y <- log( prior + 1E-15 )
lm1 <- stats::lm( y ~ 0 + delta.designmatrix, weights=prior )
prior <- exp(fitted(lm1)) # smoothed prior
}
prior <- prior / sum(prior)
return(prior)
}
#* estimate model
mod15b <- mirt::mirt(dat, mirtmodel, technical=list(
customTheta=theta.k, customPriorFun=loglinear_prior ),
pars=mod.pars, verbose=TRUE )
# The number of estimated parameters is incorrect because mirt does not correctly count
# estimated parameters from the user customized prior distribution.
mod15b@nest <- as.integer(sum(mod.pars$est) + 3) # 3 free parameters of the log-linear smoothing
#* extract item parameters
coef1 <- sirt::mirt.wrapper.coef(mod15b)$coef
#** compare estimated item parameters
dfr <- data.frame( "gdm"=mod15a$item$b.Cat1, "mirt"=coef1$d )
rownames(dfr) <- colnames(dat)
round(t(dfr),4)
## A1 A2 A3 A4 B1 B2 B3 B4 C1 C2 C3 C4
## gdm 0.9818 0.1538 -0.7837 -1.3197 0 -1.0902 1.6088 -0.170 1.9778 0.006 1.1859 0.135
## mirt 0.9829 0.1548 -0.7826 -1.3186 0 -1.0892 1.6099 -0.169 1.9790 0.007 1.1870 0.136
# compare estimated theta distribution
dfr <- data.frame( "gdm"=mod15a$pi.k, "mirt"=mod15b@Prior[[1]] )
round(t(dfr),4)
## 1 2 3 4 5 6 7 8 9 10 11 12 13
## gdm 0 0 1e-04 9e-04 0.0056 0.0231 0.0652 0.1299 0.1881 0.2038 0.1702 0.1129 0.0612
## mirt 0 0 1e-04 9e-04 0.0056 0.0232 0.0653 0.1300 0.1881 0.2038 0.1702 0.1128 0.0611
## 14 15
## gdm 0.0279 0.011
## mirt 0.0278 0.011
## End(Not run)
Datasets from Reckase's Book Multidimensional Item Response Theory
Description
Some simulated datasets from Reckase (2009).
Usage
data(data.reck21)
data(data.reck61DAT1)
data(data.reck61DAT2)
data(data.reck73C1a)
data(data.reck73C1b)
data(data.reck75C2)
data(data.reck78ExA)
data(data.reck79ExB)
Format
The format of the dataset data.reck21 (Table 2.1, p. 45) is:
List of 2
$ data: num [1:2500, 1:50] 0 0 0 1 1 0 0 0 1 0 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:50] "I0001" "I0002" "I0003" "I0004" ...
$ pars:'data.frame':
..$ a: num [1:50] 1.83 1.38 1.47 1.53 0.88 0.82 1.02 1.19 1.15 0.18 ...
..$ b: num [1:50] 0.91 0.81 0.06 -0.8 0.24 0.99 1.23 -0.47 2.78 -3.85 ...
..$ c: num [1:50] 0 0 0 0.25 0.21 0.29 0.26 0.19 0 0.21 ...
The format of the datasets data.reck61DAT1 and data.reck61DAT2 (Table 6.1, p. 153) is:
List of 4
$ data : num [1:2500, 1:30] 1 0 0 1 1 0 0 1 1 0 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:30] "A01" "A02" "A03" "A04" ...
$ pars :'data.frame':
..$ a1: num [1:30] 0.747 0.46 0.861 1.014 0.552 ...
..$ a2: num [1:30] 0.025 0.0097 0.0067 0.008 0.0204 0.0064 0.0861 ...
..$ a3: num [1:30] 0.1428 0.0692 0.404 0.047 0.1482 ...
..$ d : num [1:30] 0.183 -0.192 -0.466 -0.434 -0.443 ...
$ mu : num [1:3] -0.4 -0.7 0.1
$ sigma: num [1:3, 1:3] 1.21 0.297 1.232 0.297 0.81 ...
The dataset data.reck61DAT2 has correlated dimensions while data.reck61DAT1 has uncorrelated dimensions.
The datasets data.reck73C1a and data.reck73C1b use item parameters from Table 7.3 (p. 188). The dataset C1a has uncorrelated dimensions, while C1b has perfectly correlated dimensions. The items are sensitive to 3 dimensions. The format of the datasets is:
List of 4
$ data : num [1:2500, 1:30] 1 0 1 1 1 0 1 1 1 1 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:30] "A01" "A02" "A03" "A04" ...
$ pars :'data.frame': 30 obs. of 4 variables:
..$ a1: num [1:30] 0.747 0.46 0.861 1.014 0.552 ...
..$ a2: num [1:30] 0.025 0.0097 0.0067 0.008 0.0204 0.0064 ...
..$ a3: num [1:30] 0.1428 0.0692 0.404 0.047 0.1482 ...
..$ d : num [1:30] 0.183 -0.192 -0.466 -0.434 -0.443 ...
$ mu : num [1:3] 0 0 0
$ sigma: num [1:3, 1:3] 0.167 0.236 0.289 0.236 0.334 ...
The dataset data.reck75C2 is simulated using item parameters from Table 7.5 (p. 191). It contains items which are sensitive to only one dimension but individuals who have abilities in three uncorrelated dimensions. The format is:
List of 4
$ data : num [1:2500, 1:30] 0 0 1 1 1 0 0 1 1 1 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:30] "A01" "A02" "A03" "A04" ...
$ pars :'data.frame': 30 obs. of 4 variables:
..$ a1: num [1:30] 0.56 0.48 0.67 0.57 0.54 0.74 0.7 0.59 0.63 0.64 ...
..$ a2: num [1:30] 0.62 0.53 0.63 0.69 0.58 0.69 0.75 0.63 0.64 0.64 ...
..$ a3: num [1:30] 0.46 0.42 0.43 0.51 0.41 0.48 0.46 0.5 0.51 0.46 ...
..$ d : num [1:30] 0.1 0.06 -0.38 0.46 0.14 0.31 0.06 -1.23 0.47 1.06 ...
$ mu : num [1:3] 0 0 0
$ sigma: num [1:3, 1:3] 1 0 0 0 1 0 0 0 1
The dataset data.reck78ExA contains simulated item responses from Table 7.8 (p. 204 ff.). There are three item clusters and two ability dimensions. The format is:
List of 4
$ data : num [1:2500, 1:50] 0 1 1 0 1 0 0 0 0 0 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:50] "A01" "A02" "A03" "A04" ...
$ pars :'data.frame': 50 obs. of 3 variables:
..$ a1: num [1:50] 0.889 1.057 1.047 1.178 1.029 ...
..$ a2: num [1:50] 0.1399 0.0432 0.016 0.0231 0.2347 ...
..$ d : num [1:50] 0.2724 1.2335 -0.0918 -0.2372 0.8471 ...
$ mu : num [1:2] 0 0
$ sigma: num [1:2, 1:2] 1 0 0 1
The dataset data.reck79ExB contains simulated item responses from Table 7.9 (p. 207 ff.). There are three item clusters and three ability dimensions. The format is:
List of 4
$ data : num [1:2500, 1:50] 1 1 0 1 0 0 0 1 1 0 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:50] "A01" "A02" "A03" "A04" ...
$ pars :'data.frame': 50 obs. of 4 variables:
..$ a1: num [1:50] 0.895 1.032 1.036 1.163 1.022 ...
..$ a2: num [1:50] 0.052 0.132 0.144 0.13 0.165 ...
..$ a3: num [1:50] 0.0722 0.1923 0.0482 0.1321 0.204 ...
..$ d : num [1:50] 0.2724 1.2335 -0.0918 -0.2372 0.8471 ...
$ mu : num [1:3] 0 0 0
$ sigma: num [1:3, 1:3] 1 0 0 0 1 0 0 0 1
Source
Simulated datasets
References
Reckase, M. (2009). Multidimensional item response theory. New York: Springer. doi:10.1007/978-0-387-89976-3
Examples
## Not run:
#############################################################################
# EXAMPLE 1: data.reck21 dataset, Table 2.1, p. 45
#############################################################################
data(data.reck21)
dat <- data.reck21$data       # extract dataset
# items with zero guessing parameters
guess0 <- c( 1, 2, 3, 9,11,27,30,35,45,49,50 )
I <- ncol(dat)
#***
# Model 1: 3PL estimation using rasch.mml2
est.c <- est.a <- 1:I
est.c[ guess0 ] <- 0
mod1 <- sirt::rasch.mml2( dat, est.a=est.a, est.c=est.c, mmliter=300 )
summary(mod1)
#***
# Model 2: 3PL estimation using smirt
Q <- matrix(1,I,1)
mod2 <- sirt::smirt( dat, Qmatrix=Q, est.a="2PL", est.c=est.c, increment.factor=1.01)
summary(mod2)
#***
# Model 3: estimation in mirt package
library(mirt)
itemtype <- rep("3PL", I )
itemtype[ guess0 ] <- "2PL"
mod3 <- mirt::mirt(dat, 1, itemtype=itemtype, verbose=TRUE)
summary(mod3)
c3 <- unlist( coef(mod3) )[ 1:(4*I) ]
c3 <- matrix( c3, I, 4, byrow=TRUE )
# compare estimates of rasch.mml2, smirt, mirt and the true parameters
round( cbind( mod1$item$c, mod2$item$c,c3[,3],data.reck21$pars$c ), 2 )
round( cbind( mod1$item$a, mod2$item$a.Dim1,c3[,1], data.reck21$pars$a ), 2 )
round( cbind( mod1$item$b, mod2$item$b.Dim1 / mod2$item$a.Dim1, - c3[,2] / c3[,1],
data.reck21$pars$b ), 2 )
#############################################################################
# EXAMPLE 2: data.reck61 dataset, Table 6.1, p. 153
#############################################################################
data(data.reck61DAT1)
dat <- data.reck61DAT1$data
#***
# Model 1: Exploratory factor analysis
#-- Model 1a: tam.fa in TAM
library(TAM)
mod1a <- TAM::tam.fa( dat, irtmodel="efa", nfactors=3 )
# varimax rotation
varimax(mod1a$B.stand)
# Model 1b: EFA in NOHARM (Promax rotation)
mod1b <- sirt::R2noharm( dat=dat, model.type="EFA", dimensions=3,
writename="reck61__3dim_efa", noharm.path="c:/NOHARM",dec=",")
summary(mod1b)
# Model 1c: EFA with noharm.sirt
mod1c <- sirt::noharm.sirt( dat=dat, dimensions=3 )
summary(mod1c)
plot(mod1c)
# Model 1d: EFA with 2 dimensions in noharm.sirt
mod1d <- sirt::noharm.sirt( dat=dat, dimensions=2 )
summary(mod1d)
plot(mod1d, efa.load.min=.2) # plot loadings of at least .20
#***
# Model 2: Confirmatory factor analysis
#-- Model 2a: tam.fa in TAM
dims <- c( rep(1,10), rep(3,10), rep(2,10) )
Qmatrix <- matrix( 0, nrow=30, ncol=3 )
Qmatrix[ cbind( 1:30, dims) ] <- 1
mod2a <- TAM::tam.mml.2pl( dat,Q=Qmatrix,
control=list( snodes=1000, QMC=TRUE, maxiter=200) )
summary(mod2a)
#-- Model 2b: smirt in sirt
mod2b <- sirt::smirt( dat,Qmatrix=Qmatrix, est.a="2PL", maxiter=20, qmcnodes=1000 )
summary(mod2b)
#-- Model 2c: rasch.mml2 in sirt
mod2c <- sirt::rasch.mml2( dat,Qmatrix=Qmatrix, est.a=1:30,
mmliter=200, theta.k=seq(-5,5,len=11) )
summary(mod2c)
#-- Model 2d: mirt in mirt
cmodel <- mirt::mirt.model("
F1=1-10
F2=21-30
F3=11-20
COV=F1*F2, F1*F3, F2*F3 " )
mod2d <- mirt::mirt(dat, cmodel, verbose=TRUE)
summary(mod2d)
coef(mod2d)
#-- Model 2e: CFA in NOHARM
# specify covariance pattern
P.pattern <- matrix( 1, ncol=3, nrow=3 )
P.init <- .4*P.pattern
diag(P.pattern) <- 0
diag(P.init) <- 1
# fix all entries in the loading matrix to 1
F.pattern <- matrix( 0, nrow=30, ncol=3 )
F.pattern[1:10,1] <- 1
F.pattern[21:30,2] <- 1
F.pattern[11:20,3] <- 1
F.init <- F.pattern
# estimate model
mod2e <- sirt::R2noharm( dat=dat, model.type="CFA", P.pattern=P.pattern,
P.init=P.init, F.pattern=F.pattern, F.init=F.init,
writename="reck61__3dim_cfa", noharm.path="c:/NOHARM",dec=",")
summary(mod2e)
#-- Model 2f: CFA with noharm.sirt
mod2f <- sirt::noharm.sirt( dat=dat, Fval=F.init, Fpatt=F.pattern,
Pval=P.init, Ppatt=P.pattern )
summary(mod2f)
#############################################################################
# EXAMPLE 3: DETECT analysis data.reck78ExA and data.reck79ExB
#############################################################################
data(data.reck78ExA)
data(data.reck79ExB)
#************************
# Example A
dat <- data.reck78ExA$data
#- estimate person score: normal scores of the (shifted) sum scores
score <- stats::qnorm( ( rowSums( dat )+.5 ) / ( ncol(dat) + 1 ) )
#- extract item cluster
itemcluster <- substring( colnames(dat), 1, 1 )
#- confirmatory DETECT Item cluster
detectA <- sirt::conf.detect( data=dat, score=score, itemcluster=itemcluster )
## unweighted weighted
## DETECT 0.571 0.571
## ASSI 0.523 0.523
## RATIO 0.757 0.757
#- exploratory DETECT analysis
detect_explA <- sirt::expl.detect(data=dat, score, nclusters=10, N.est=nrow(dat)/2 )
## Optimal Cluster Size is 5 (Maximum of DETECT Index)
## N.Cluster N.items N.est N.val size.cluster DETECT.est ASSI.est
## 1 2 50 1250 1250 31-19 0.531 0.404
## 2 3 50 1250 1250 10-19-21 0.554 0.407
## 3 4 50 1250 1250 10-19-14-7 0.630 0.509
## 4 5 50 1250 1250 10-19-3-7-11 0.653 0.546
## 5 6 50 1250 1250 10-12-7-3-7-11 0.593 0.458
## 6 7 50 1250 1250 10-12-7-3-7-9-2 0.604 0.474
## 7 8 50 1250 1250 10-12-7-3-3-9-4-2 0.608 0.481
## 8 9 50 1250 1250 10-12-7-3-3-5-4-2-4 0.617 0.494
## 9 10 50 1250 1250 10-5-7-7-3-3-5-4-2-4 0.592 0.460
# cluster membership
cluster_membership <- detect_explA$itemcluster$cluster3
# Cluster 1:
colnames(dat)[ cluster_membership==1 ]
## [1] "A01" "A02" "A03" "A04" "A05" "A06" "A07" "A08" "A09" "A10"
# Cluster 2:
colnames(dat)[ cluster_membership==2 ]
## [1] "B11" "B12" "B13" "B14" "B15" "B16" "B17" "B18" "B19" "B20" "B21" "B22"
## [13] "B23" "B25" "B26" "B27" "B28" "B29" "B30"
# Cluster 3:
colnames(dat)[ cluster_membership==3 ]
## [1] "B24" "C31" "C32" "C33" "C34" "C35" "C36" "C37" "C38" "C39" "C40" "C41"
## [13] "C42" "C43" "C44" "C45" "C46" "C47" "C48" "C49" "C50"
#************************
# Example B
dat <- data.reck79ExB$data
#- estimate person score: normal scores of the (shifted) sum scores
score <- stats::qnorm( ( rowSums( dat )+.5 ) / ( ncol(dat) + 1 ) )
#- extract item cluster
itemcluster <- substring( colnames(dat), 1, 1 )
#- confirmatory DETECT Item cluster
detectB <- sirt::conf.detect( data=dat, score=score, itemcluster=itemcluster )
## unweighted weighted
## DETECT 0.715 0.715
## ASSI 0.624 0.624
## RATIO 0.855 0.855
#- exploratory DETECT analysis
detect_explB <- sirt::expl.detect(data=dat, score, nclusters=10, N.est=nrow(dat)/2 )
## Optimal Cluster Size is 4 (Maximum of DETECT Index)
##
## N.Cluster N.items N.est N.val size.cluster DETECT.est ASSI.est
## 1 2 50 1250 1250 30-20 0.665 0.546
## 2 3 50 1250 1250 10-20-20 0.686 0.585
## 3 4 50 1250 1250 10-20-8-12 0.728 0.644
## 4 5 50 1250 1250 10-6-14-8-12 0.654 0.553
## 5 6 50 1250 1250 10-6-14-3-12-5 0.659 0.561
## 6 7 50 1250 1250 10-6-14-3-7-5-5 0.664 0.576
## 7 8 50 1250 1250 10-6-7-7-3-7-5-5 0.616 0.518
## 8 9 50 1250 1250 10-6-7-7-3-5-5-5-2 0.612 0.512
## 9 10 50 1250 1250 10-6-7-7-3-5-3-5-2-2 0.613 0.512
## End(Not run)
Some Example Datasets for the sirt Package
Description
Some example datasets for the sirt package.
Usage
data(data.si01)
data(data.si02)
data(data.si03)
data(data.si04)
data(data.si05)
data(data.si06)
data(data.si07)
data(data.si08)
data(data.si09)
data(data.si10)
Format
The format of the dataset data.si01 is:
'data.frame': 1857 obs. of 3 variables:
$ idgroup: int 1 1 1 1 1 1 1 1 1 1 ...
$ item1 : int NA NA NA NA NA NA NA NA NA NA ...
$ item2 : int 4 4 4 4 4 4 4 2 4 4 ...
The dataset data.si02 is the Stouffer-Toby dataset published in Lindsay, Clogg and Grego (1991; Table 1, p. 97, Cross-classification A):
List of 2
$ data : num [1:16, 1:4] 1 0 1 0 1 0 1 0 1 0 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:4] "I1" "I2" "I3" "I4"
$ weights: num [1:16] 42 1 6 2 6 1 7 2 23 4 ...
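Because the response patterns are stored with frequency weights, they can be expanded to person-level data when needed (a minimal sketch):
data(data.si02, package="sirt")
dat <- data.si02$data[ rep( 1:nrow(data.si02$data), data.si02$weights ), ]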
The format of the dataset data.si03 (containing item parameters of two studies) is:
'data.frame': 27 obs. of 3 variables:
$ item : Factor w/ 27 levels "M1","M10","M11",..: 1 12 21 22 ...
$ b_study1: num 0.297 1.163 0.151 -0.855 -1.653 ...
$ b_study2: num 0.72 1.118 0.351 -0.861 -1.593 ...
The dataset data.si04 is adapted from Bartolucci, Montanari and Pandolfi (2012; Table 4, Table 7). The data contain 4999 persons and 79 items on 5 dimensions. See rasch.mirtlc for using the data in an analysis.
List of 3
$ data : num [1:4999, 1:79] 0 1 1 0 1 1 0 0 1 1 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:79] "A01" "A02" "A03" "A04" ...
$ itempars :'data.frame': 79 obs. of 4 variables:
..$ item : Factor w/ 79 levels "A01","A02","A03",..: 1 2 3 4 5 6 7 8 9 10 ...
..$ dim : num [1:79] 1 1 1 1 1 1 1 1 1 1 ...
..$ gamma : num [1:79] 1 1 1 1 1 1 1 1 1 1 ...
..$ gamma.beta: num [1:79] -0.189 0.25 0.758 1.695 1.022 ...
$ distribution: num [1:9, 1:7] 1 2 3 4 5 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:7] "class" "A" "B" "C" ...
The dataset data.si05 contains double ratings of two exchangeable raters for three items, which are contained in Ex1, Ex2 and Ex3, respectively.
List of 3
$ Ex1:'data.frame': 199 obs. of 2 variables:
..$ C7040: num [1:199] NA 1 0 1 1 0 0 0 1 0 ...
..$ C7041: num [1:199] 1 1 0 0 0 0 0 0 1 0 ...
$ Ex2:'data.frame': 2000 obs. of 2 variables:
..$ rater1: num [1:2000] 2 0 3 1 2 2 0 0 0 0 ...
..$ rater2: num [1:2000] 4 1 3 2 1 0 0 0 0 2 ...
$ Ex3:'data.frame': 2000 obs. of 2 variables:
..$ rater1: num [1:2000] 5 1 6 2 3 3 0 0 0 0 ...
..$ rater2: num [1:2000] 7 2 6 3 2 1 0 1 0 3 ...
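For example, the raw agreement of the two exchangeable raters in Ex1 can be computed as follows (a minimal sketch using the column names shown above):
data(data.si05, package="sirt")
ex1 <- stats::na.omit( data.si05$Ex1 )
mean( ex1$C7040==ex1$C7041 )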
The dataset data.si06 contains multiple choice item responses. The correct alternative is denoted as 0; distractors are indicated by the codes 1, 2 or 3.
'data.frame': 4441 obs. of 14 variables:
$ WV01: num 0 0 0 0 0 0 0 0 0 3 ...
$ WV02: num 0 0 0 3 0 0 0 0 0 1 ...
$ WV03: num 0 1 0 0 0 0 0 0 0 0 ...
$ WV04: num 0 0 0 0 0 0 0 0 0 1 ...
$ WV05: num 3 1 1 1 0 0 1 1 0 2 ...
$ WV06: num 0 1 3 0 0 0 2 0 0 1 ...
$ WV07: num 0 0 0 0 0 0 0 0 0 0 ...
$ WV08: num 0 1 1 0 0 0 0 0 0 0 ...
$ WV09: num 0 0 0 0 0 0 0 0 0 2 ...
$ WV10: num 1 1 3 0 0 2 0 0 0 0 ...
$ WV11: num 0 0 0 0 0 0 0 0 0 0 ...
$ WV12: num 0 0 0 2 0 0 2 0 0 0 ...
$ WV13: num 3 1 1 3 0 0 3 0 0 0 ...
$ WV14: num 3 1 2 3 0 3 1 3 3 0 ...
The dataset data.si07 contains parameters of the empirical illustration of DeCarlo (2020). The simulation function sim_fun can be used for simulating data from the IRSDT model (see DeCarlo, 2020).
List of 3
$ pars :'data.frame': 16 obs. of 3 variables:
..$ item: Factor w/ 16 levels "I01","I02","I03",..: 1 2 3 4 5 6 7 8 9 10 ...
..$ b : num [1:16] -1.1 -0.18 1.44 1.78 -1.19 0.45 -1.12 0.33 0.82 -0.43 ...
..$ d : num [1:16] 2.69 4.6 6.1 3.11 3.2 ...
$ trait :'data.frame': 20 obs. of 2 variables:
..$ x : num [1:20] 0.025 0.075 0.125 0.175 0.225 0.275 0.325 0.375 0.425 0.475 ...
..$ prob: num [1:20] 0.0238 0.1267 0.105 0.0594 0.0548 ...
$ sim_fun:function (lambda, b, d, items)
The dataset data.si08 contains 5 items with respect to knowledge about lung cancer and the kind of information acquisition (Goodman, 1970; see also Rasch, Kubinger & Yanagida, 2011). L1: reading newspapers, L2: listening to the radio, L3: reading books and magazines, L4: attending talks, L5: knowledge about lung cancer.
'data.frame': 32 obs. of 6 variables:
$ L1 : num 1 1 1 1 1 1 1 1 1 1 ...
$ L2 : num 1 1 1 1 1 1 1 1 0 0 ...
$ L3 : num 1 1 1 1 0 0 0 0 1 1 ...
$ L4 : num 1 1 0 0 1 1 0 0 1 1 ...
$ L5 : num 1 0 1 0 1 0 1 0 1 0 ...
$ wgt: num 23 8 102 67 8 4 35 59 27 18 ...
The dataset data.si09 was used in Fischer and Karl (2019), who asked employees in eight countries to report whether they typically help other employees (helping behavior, seven items, help) and whether they make suggestions to improve work conditions and products (voice behavior, five items, voice). Individuals responded to these items on a 1-7 Likert-type scale. The dataset was downloaded from https://osf.io/wkx8c/.
'data.frame': 5201 obs. of 13 variables:
$ country: Factor w/ 8 levels "BRA","CAN","KEN",..: 5 5 5 5 5 5 5 5 5 5 ...
$ help1 : int 6 6 5 5 5 6 6 6 4 6 ...
$ help2 : int 3 6 5 6 6 6 6 6 6 7 ...
$ help3 : int 5 6 6 7 7 6 5 6 6 7 ...
$ help4 : int 7 6 5 6 6 7 7 6 6 7 ...
$ help5 : int 5 5 5 6 6 6 6 6 6 7 ...
$ help6 : int 3 4 5 6 6 7 7 6 6 5 ...
$ help7 : int 5 4 4 5 5 7 7 6 6 6 ...
$ voice1 : int 3 6 5 6 4 7 6 6 5 7 ...
$ voice2 : int 3 6 4 7 6 5 6 6 4 7 ...
$ voice3 : int 6 6 5 7 6 5 6 6 6 5 ...
$ voice4 : int 6 6 6 5 5 7 5 6 6 6 ...
$ voice5 : int 6 7 4 7 6 6 6 6 5 7 ...
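Scale scores for the two behavior domains can be formed directly from the variable names (a minimal sketch):
data(data.si09, package="sirt")
help_score <- rowMeans( data.si09[, paste0("help",1:7) ], na.rm=TRUE )
voice_score <- rowMeans( data.si09[, paste0("voice",1:5) ], na.rm=TRUE )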
The dataset data.si10 contains votes of 435 members of the U.S. House of Representatives, 267 Democrats and 168 Republicans. The dataset was used by Fop and Murphy (2018).
'data.frame': 435 obs. of 17 variables:
$ party : Factor w/ 2 levels "democrat","republican": 2 2 1 1 1 1 1 2 2 1 ...
$ vote01: num 0 0 NA 0 1 0 0 0 0 1 ...
$ vote02: num 1 1 1 1 1 1 1 1 1 1 ...
$ vote03: num 0 0 1 1 1 1 0 0 0 1 ...
$ vote04: num 1 1 NA 0 0 0 1 1 1 0 ...
$ vote05: num 1 1 1 NA 1 1 1 1 1 0 ...
$ vote06: num 1 1 1 1 1 1 1 1 1 0 ...
$ vote07: num 0 0 0 0 0 0 0 0 0 1 ...
$ vote08: num 0 0 0 0 0 0 0 0 0 1 ...
$ vote09: num 0 0 0 0 0 0 0 0 0 1 ...
$ vote10: num 1 0 0 0 0 0 0 0 0 0 ...
$ vote11: num NA 0 1 1 1 0 0 0 0 0 ...
$ vote12: num 1 1 0 0 NA 0 0 0 1 0 ...
$ vote13: num 1 1 1 1 1 1 NA 1 1 0 ...
$ vote14: num 1 1 1 0 1 1 1 1 1 0 ...
$ vote15: num 0 0 0 0 1 1 1 NA 0 NA ...
$ vote16: num 1 NA 0 1 1 1 1 1 1 NA ...
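A simple cross-tabulation shows how strongly a single vote separates the two parties (a minimal sketch):
data(data.si10, package="sirt")
table( data.si10$party, data.si10$vote01, useNA="ifany" )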
References
Bartolucci, F., Montanari, G. E., & Pandolfi, S. (2012). Dimensionality of the latent structure and item selection via latent class multidimensional IRT models. Psychometrika, 77(4), 782-802. doi:10.1007/s11336-012-9278-0
DeCarlo, L. T. (2020). An item response model for true-false exams based on signal detection theory. Applied Psychological Measurement, 44(3), 234-248. doi:10.1177/0146621619843823
Fischer, R., & Karl, J. A. (2019). A primer to (cross-cultural) multi-group invariance testing possibilities in R. Frontiers in Psychology | Cultural Psychology, 10:1507. doi:10.3389/fpsyg.2019.01507
Fop, M., & Murphy, T. B. (2018). Variable selection methods for model-based clustering. Statistics Surveys, 12, 18-65. https://doi.org/10.1214/18-SS119
Goodman, L. A. (1970). The multivariate analysis of qualitative data: Interactions among multiple classifications. Journal of the American Statistical Association, 65(329), 226-256. doi:10.1080/01621459.1970.10481076
Lindsay, B., Clogg, C. C., & Grego, J. (1991). Semiparametric estimation in the Rasch model and related exponential response models, including a simple latent class model for item analysis. Journal of the American Statistical Association, 86(413), 96-107. doi:10.1080/01621459.1991.10475008
Rasch, D., Kubinger, K. D., & Yanagida, T. (2011). Statistics in psychology using R and SPSS. New York: Wiley. doi:10.1002/9781119979630
See Also
Some free datasets can be obtained from
Psychological questionnaires: http://personality-testing.info/_rawdata/
PISA 2012: http://pisa2012.acer.edu.au/downloads.php
PIAAC: http://www.oecd.org/site/piaac/publicdataandanalysis.htm
TIMSS 2011: http://timssandpirls.bc.edu/timss2011/international-database.html
ALLBUS: http://www.gesis.org/allbus/allbus-home/
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Nested logit model multiple choice dataset data.si06
#############################################################################
data(data.si06, package="sirt")
dat <- data.si06
#** estimate 2PL nested logit model
library(mirt)
mod1 <- mirt::mirt( dat, model=1, itemtype="2PLNRM", key=rep(0,ncol(dat) ),
verbose=TRUE )
summary(mod1)
cmod1 <- sirt::mirt.wrapper.coef(mod1)$coef
cmod1[,-1] <- round( cmod1[,-1], 3)
#** normalize item parameters according to Suh and Bolt (2010)
cmod2 <- cmod1
# slope parameters
ind <- grep("ak",colnames(cmod2))
h1 <- cmod2[,ind ]
cmod2[,ind] <- t( apply( h1, 1, FUN=function(ll){ ll - mean(ll) } ) )
# item intercepts
ind <- paste0( "d", 0:9 )
ind <- which( colnames(cmod2) %in% ind )
h1 <- cmod2[,ind ]
cmod2[,ind] <- t( apply( h1, 1, FUN=function(ll){ ll - mean(ll) } ) )
cmod2[,-1] <- round( cmod2[,-1], 3)
#############################################################################
# EXAMPLE 2: Item response model based on signal detection theory (IRSDT model)
#############################################################################
data(data.si07, package="sirt")
si07 <- data.si07   # avoid masking the base function data()
#-- simulate data
set.seed(98)
N <- 2000 # define sample size
# generate membership scores
lambda <- sample(size=N, x=si07$trait$x, prob=si07$trait$prob, replace=TRUE)
b <- si07$pars$b
d <- si07$pars$d
items <- si07$pars$item
dat <- si07$sim_fun(lambda=lambda, b=b, d=d, items=items)
#- estimate IRSDT model as a grade of membership model with two classes
problevels <- seq( 0.025, 0.975, length=20 )
mod1 <- sirt::gom.em( dat, K=2, problevels=problevels )
summary(mod1)
## End(Not run)
Dataset TIMSS Mathematics
Description
This dataset contains TIMSS mathematics data from 345 students on 25 items.
Usage
data(data.timss)
Format
This dataset is a list. data is the data frame containing the student ID (idstud), a dummy variable for female students (girl) and student age (age). The following variables (variable names starting with M) are the items; see the short snippet after the listing for extracting them.
The format is:
List of 2
$ data:'data.frame':
..$ idstud : num [1:345] 4e+09 4e+09 4e+09 4e+09 4e+09 ...
..$ girl : int [1:345] 0 0 0 0 0 0 0 0 1 0 ...
..$ age : num [1:345] 10.5 10 10.25 10.25 9.92 ...
..$ M031286 : int [1:345] 0 0 0 1 1 0 1 0 1 0 ...
..$ M031106 : int [1:345] 0 0 0 1 1 0 1 1 0 0 ...
..$ M031282 : int [1:345] 0 0 0 1 1 0 1 1 0 0 ...
..$ M031227 : int [1:345] 0 0 0 0 1 0 0 0 0 0 ...
[...]
..$ M041203 : int [1:345] 0 0 0 1 1 0 0 0 0 1 ...
$ item:'data.frame':
..$ item : Factor w/ 25 levels "M031045","M031068",..: ...
..$ Block : Factor w/ 2 levels "M01","M02": 1 1 1 1 1 1 ..
..$ Format : Factor w/ 2 levels "CR","MC": 1 1 1 1 2 ...
..$ Content.Domain : Factor w/ 3 levels "Data Display",..: 3 3 3 3 ...
..$ Cognitive.Domain: Factor w/ 3 levels "Applying","Knowing",..: 2 3 3 ..
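The item columns can be extracted by their common name prefix (a minimal sketch):
data(data.timss, package="sirt")
dat <- data.timss$data
items <- grep( "^M", colnames(dat), value=TRUE )
resp <- dat[, items ]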
TIMSS 2007 Grade 8 Mathematics and Science Russia
Description
This TIMSS 2007 dataset contains item responses of 4472 eighth-grade Russian students in Mathematics and Science.
Usage
data(data.timss07.G8.RUS)
Format
The dataset contains raw responses (raw), scored responses (scored) and item information (iteminfo).
The format of the dataset is:
List of 3
$ raw :'data.frame':
..$ idstud : num [1:4472] 3010101 3010102 3010104 3010105 3010106 ...
..$ M022043 : atomic [1:4472] NA 1 4 NA NA NA NA NA NA NA ...
.. ..- attr(*, "value.labels")=Named num [1:7] 9 6 5 4 3 2 1
.. .. ..- attr(*, "names")=chr [1:7] "OMITTED" "NOT REACHED" "E" "D*" ...
[...]
..$ M032698 : atomic [1:4472] NA NA NA NA NA NA NA 2 1 NA ...
.. ..- attr(*, "value.labels")=Named num [1:6] 9 6 4 3 2 1
.. .. ..- attr(*, "names")=chr [1:6] "OMITTED" "NOT REACHED" "D" "C" ...
..$ M032097 : atomic [1:4472] NA NA NA NA NA NA NA 2 3 NA ...
.. ..- attr(*, "value.labels")=Named num [1:6] 9 6 4 3 2 1
.. .. ..- attr(*, "names")=chr [1:6] "OMITTED" "NOT REACHED" "D" "C*" ...
.. [list output truncated]
$ scored : num [1:4472, 1:443] 3010101 3010102 3010104 3010105 3010106 ...
..- attr(*, "dimnames")=List of 2
.. ..$ : NULL
.. ..$ : chr [1:443] "idstud" "M022043" "M022046" "M022049" ...
$ iteminfo:'data.frame':
..$ item : Factor w/ 442 levels "M022043","M022046",..: 1 2 3 4 5 6 21 7 8 17 ...
..$ content : Factor w/ 8 levels "Algebra","Biology",..: 7 7 6 1 6 7 4 6 7 7 ...
..$ topic : Factor w/ 49 levels "Algebraic Expression",..: 32 32 41 29 ...
..$ cognitive : Factor w/ 3 levels "Applying","Knowing",..: 2 1 3 2 1 1 1 1 2 1 ...
..$ item.type : Factor w/ 2 levels "CR","MC": 2 1 2 2 1 2 2 2 2 1 ...
..$ N.options : Factor w/ 4 levels "-"," -","4","5": 4 1 3 4 1 4 4 4 3 1 ...
..$ key : Factor w/ 7 levels "-"," -","A","B",..: 6 1 6 7 1 5 5 4 6 1 ...
..$ max.points: int [1:442] 1 1 1 1 1 1 1 1 1 2 ...
..$ item.label: Factor w/ 432 levels "1 teacher for every 12 students ",..: 58 351 ...
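For analyses of the scored responses, the student identifier in the first column is typically removed first (a minimal sketch based on the structure shown above):
data(data.timss07.G8.RUS, package="sirt")
resp <- data.timss07.G8.RUS$scored[, -1]   # drop the idstud column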
Source
TIMSS 2007 8th Grade, Russian Sample
Dataset Used in Stoyan, Pommerening and Wuensche (2018)
Description
Dataset used in Stoyan, Pommerening and Wuensche (2018; see also Pommerening et al., 2018). In the dataset, 15 forest managers classify 387 trees either as trees to be maintained or as trees to be removed. They assign tree marks, either 0 or 1, where mark 1 means remove.
Usage
data(data.trees)
Format
The dataset has the following structure.
'data.frame': 387 obs. of 16 variables:
$ Number: int 142 184 9 300 374 42 382 108 125 201 ...
$ FM1 : int 1 1 1 1 1 1 1 1 1 0 ...
$ FM2 : int 1 1 1 0 1 1 1 1 1 1 ...
$ FM3 : int 1 0 1 1 1 1 1 1 1 1 ...
$ FM4 : int 1 1 1 1 1 1 0 1 1 1 ...
$ FM5 : int 1 1 1 1 1 1 0 0 0 1 ...
$ FM6 : int 1 1 1 1 0 1 1 1 1 0 ...
$ FM7 : int 1 0 1 1 0 0 1 0 1 1 ...
$ FM8 : int 1 1 1 1 1 0 0 1 0 1 ...
$ FM9 : int 1 1 0 1 1 1 1 0 1 1 ...
$ FM10 : int 0 1 1 0 1 1 1 1 0 0 ...
$ FM11 : int 1 1 1 1 0 1 1 0 1 0 ...
$ FM12 : int 1 1 1 1 1 1 0 1 0 0 ...
$ FM13 : int 0 1 0 0 1 1 1 1 1 1 ...
$ FM14 : int 1 1 1 1 1 0 1 1 1 1 ...
$ FM15 : int 1 1 0 1 1 0 1 0 0 1 ...
Source
https://www.pommerening.org/wiki/images/d/dc/CoedyBreninSortedforPublication.txt
References
Pommerening, A., Ramos, C. P., Kedziora, W., Haufe, J., & Stoyan, D. (2018). Rating experiments in forestry: How much agreement is there in tree marking? PloS ONE, 13(3), e0194747. doi:10.1371/journal.pone.0194747
Stoyan, D., Pommerening, A., & Wuensche, A. (2018). Rater classification by means of set-theoretic methods applied to forestry data. Journal of Environmental Statistics, 8(2), 1-17.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Latent class models, latent trait models, mixed membership models
#############################################################################
data(data.trees, package="sirt")
dat <- data.trees[,-1]
I <- ncol(dat)
#** latent class models with 2, 3, and 4 classes
problevels <- seq( 0, 1, len=2 )
mod02 <- sirt::gom.em(dat, K=2, problevels, model="GOM")
mod03 <- sirt::gom.em(dat, K=3, problevels, model="GOM")
mod04 <- sirt::gom.em(dat, K=4, problevels, model="GOM")
#** grade of membership models
mod11 <- sirt::gom.em(dat, K=2, theta0.k=10*seq(-1,1,len=11), model="GOMnormal")
problevels <- seq( 0, 1, len=3 )
mod12 <- sirt::gom.em(dat, K=2, problevels, model="GOM")
mod13 <- sirt::gom.em(dat, K=3, problevels, model="GOM")
mod14 <- sirt::gom.em(dat, K=4, problevels, model="GOM")
problevels <- seq( 0, 1, len=4 )
mod22 <- sirt::gom.em(dat, K=2, problevels, model="GOM")
mod23 <- sirt::gom.em(dat, K=3, problevels, model="GOM")
mod24 <- sirt::gom.em(dat, K=4, problevels, model="GOM")
#** latent trait models
#- 1PL
mod31 <- sirt::rasch.mml2(dat)
#- 2PL
mod32 <- sirt::rasch.mml2(dat, est.a=1:I)
#- model comparison
CDM::IRT.compareModels(mod02, mod03, mod04, mod11, mod12, mod13, mod14,
mod22, mod23, mod24, mod31, mod32)
#-- inspect model results
summary(mod12)
round( cbind( mod12$theta.k, mod12$pi.k ),3)
summary(mod13)
round(cbind( mod13$theta.k, mod13$pi.k ),3)
## End(Not run)
Converting a Data Frame from Wide Format into Long Format
Description
Converts a data frame in wide format into long format.
Usage
data.wide2long(dat, id=NULL, X=NULL, Q=NULL)
Arguments
dat: Data frame with item responses and, optionally, a person identifier (see id).
id: An optional string with the variable name of the person identifier.
X: Data frame with person covariates for inclusion in the data frame in long format.
Q: Data frame with item predictors. Item labels must be included as a column named item.
Value
Data frame in long format
Examples
## Not run:
#############################################################################
# EXAMPLE 1: data.pisaRead
#############################################################################
miceadds::library_install("lme4")
data(data.pisaRead)
dat <- data.pisaRead$data
Q <- data.pisaRead$item # item predictors
# define items
items <- colnames(dat)[ substring( colnames(dat), 1, 1 )=="R" ]
dat1 <- dat[, c( "idstud", items ) ]
# matrix with person predictors
X <- dat[, c("idschool", "hisei", "female", "migra") ]
# create dataset in long format
dat.long <- sirt::data.wide2long( dat=dat1, id="idstud", X=X, Q=Q )
#***
# Model 1: Rasch model
mod1 <- lme4::glmer( resp ~ 0 + ( 1 | idstud ) + as.factor(item), data=dat.long,
family="binomial", verbose=TRUE)
summary(mod1)
#***
# Model 2: Rasch model and inclusion of person predictors
mod2 <- lme4::glmer( resp ~ 0 + ( 1 | idstud ) + as.factor(item) + female + hisei + migra,
data=dat.long, family="binomial", verbose=TRUE)
summary(mod2)
#***
# Model 3: LLTM
mod3 <- lme4::glmer(resp ~ (1|idstud) + as.factor(ItemFormat) + as.factor(TextType),
data=dat.long, family="binomial", verbose=TRUE)
summary(mod3)
#############################################################################
# EXAMPLE 2: Rasch model in lme4
#############################################################################
set.seed(765)
N <- 1000 # number of persons
I <- 10 # number of items
b <- seq(-2,2,length=I)
dat <- sirt::sim.raschtype( stats::rnorm(N,sd=1.2), b=b )
dat.long <- sirt::data.wide2long( dat=dat )
#***
# estimate Rasch model with lmer
library(lme4)
mod1 <- lme4::glmer( resp ~ 0 + as.factor( item ) + ( 1 | id_index), data=dat.long,
verbose=TRUE, family="binomial")
summary(mod1)
## Random effects:
## Groups Name Variance Std.Dev.
## id_index (Intercept) 1.454 1.206
## Number of obs: 10000, groups: id_index, 1000
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## as.factor(item)I0001 2.16365 0.10541 20.527 < 2e-16 ***
## as.factor(item)I0002 1.66437 0.09400 17.706 < 2e-16 ***
## as.factor(item)I0003 1.21816 0.08700 14.002 < 2e-16 ***
## as.factor(item)I0004 0.68611 0.08184 8.383 < 2e-16 ***
## [...]
## End(Not run)
Calculation of the DETECT and polyDETECT Index
Description
This function calculates the DETECT and polyDETECT indices (Stout, Habing, Douglas & Kim, 1996; Zhang & Stout, 1999a; Zhang, 2007). Conditional covariances must first be estimated using the ccov.np function.
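As a rough illustration of how the index aggregates the conditional covariances (a self-contained sketch under assumed inputs, not the internal code of detect.index): conditional covariances of item pairs within the same cluster enter positively, pairs from different clusters enter negatively, and the signed average is multiplied by 100.
# sketch: DETECT as a signed average of pairwise conditional covariances;
# ccov, cluster1 and cluster2 are hypothetical inputs for all item pairs i<j
detect_sketch <- function(ccov, cluster1, cluster2){
    delta <- ifelse( cluster1==cluster2, 1, -1 )
    100 * mean( delta * ccov )
}
# usage with made-up values for three item pairs
detect_sketch( ccov=c(.02,-.01,.015), cluster1=c(1,1,2), cluster2=c(1,2,2) )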
Usage
detect.index(ccovtable, itemcluster)
Arguments
ccovtable |
A value of |
itemcluster |
Item cluster for each item. The order of entries must correspond
to the columns in |
References
Stout, W., Habing, B., Douglas, J., & Kim, H. R. (1996). Conditional covariance-based nonparametric multidimensionality assessment. Applied Psychological Measurement, 20, 331-354.
Zhang, J., & Stout, W. (1999a). Conditional covariance structure of generalized compensatory multidimensional items. Psychometrika, 64, 129-152.
Zhang, J., & Stout, W. (1999b). The theoretical DETECT index of dimensionality and its application to approximate simple structure. Psychometrika, 64, 213-249.
Zhang, J. (2007). Conditional covariance theory and DETECT for polytomous items. Psychometrika, 72, 69-91.
See Also
For examples see conf.detect.
Differential Item Functioning using Logistic Regression Analysis
Description
This function assesses differential item functioning using logistic regression analysis (Zumbo, 1999).
Usage
dif.logistic.regression(dat, group, score,quant=1.645)
Arguments
dat |
Data frame with dichotomous item responses |
group |
Group identifier |
score |
Ability estimate, e.g. the WLE. |
quant |
Used quantile of the normal distribution for assessing statistical significance |
Details
Items are classified into A (negligible DIF), B (moderate DIF) and C (large DIF) levels according to the ETS classification system (Longford, Holland & Thayer, 1993, p. 175). See also Monahan, McHorney, Stump and Perkins (2007) for further DIF effect size classifications.
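As a rough illustration of an ETS-style classification (a hedged sketch: the -2.35 rescaling of a log odds ratio to the ETS delta metric and the 1.0/1.5 cutoffs are the usual conventions, not necessarily the exact rule applied internally):
# sketch: map uniform DIF effects (log odds ratios) to ETS A/B/C labels
ets_class_sketch <- function(uniformDIF){
    delta <- -2.35 * uniformDIF    # rescale to the ETS delta metric
    cut( abs(delta), breaks=c(0, 1, 1.5, Inf), labels=c("A","B","C"),
            include.lowest=TRUE )
}
ets_class_sketch( uniformDIF=c(0.1, -0.5, 0.9) )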
Value
A data frame with following variables:
itemnr |
Numeric index of the item |
sortDIFindex |
Rank of item with respect to the uniform DIF (from negative to positive values) |
item |
Item name |
N |
Sample size per item |
R |
Value of |
F |
Value of |
nR |
Sample size per item in reference group |
nF |
Sample size per item in focal group |
p |
Item |
pR |
Item |
pF |
Item |
pdiff |
Item |
pdiff.adj |
Adjusted |
uniformDIF |
Uniform DIF estimate |
se.uniformDIF |
Standard error of uniform DIF |
t.uniformDIF |
The |
sig.uniformDIF |
Significance label for uniform DIF |
DIF.ETS |
DIF classification according to the ETS classification system (see Details) |
uniform.EBDIF |
Empirical Bayes estimate of uniform DIF (Longford, Holland & Thayer, 1993) which takes degree of DIF standard error into account |
DIF.SD |
Value of the DIF standard deviation |
nonuniformDIF |
Nonuniform DIF estimate |
se.nonuniformDIF |
Standard error of nonuniform DIF |
t.nonuniformDIF |
The |
sig.nonuniformDIF |
Significance label for nonuniform DIF |
References
Longford, N. T., Holland, P. W., & Thayer, D. T. (1993). Stability of the MH D-DIF statistics across populations. In P. W. Holland & H. Wainer (Eds.). Differential Item Functioning (pp. 171-196). Hillsdale, NJ: Erlbaum.
Magis, D., Beland, S., Tuerlinckx, F., & De Boeck, P. (2010). A general framework and an R package for the detection of dichotomous differential item functioning. Behavior Research Methods, 42(3), 847-862. doi:10.3758/BRM.42.3.847
Monahan, P. O., McHorney, C. A., Stump, T. E., & Perkins, A. J. (2007). Odds ratio, delta, ETS classification, and standardization measures of DIF magnitude for binary logistic regression. Journal of Educational and Behavioral Statistics, 32(1), 92-109. doi:10.3102/1076998606298035
Zumbo, B. D. (1999). A handbook on the theory and methods of differential item functioning (DIF): Logistic regression modeling as a unitary framework for binary and Likert-type (ordinal) item scores. Ottawa ON: Directorate of Human Resources Research and Evaluation, Department of National Defense.
See Also
For assessing DIF variance see dif.variance and dif.strata.variance.
See also rasch.evm.pcm for assessing differential item functioning in the partial credit model.
See the difR package for a large collection of DIF detection methods (Magis, Beland, Tuerlinckx, & De Boeck, 2010).
For a download of the free DIF-Pack software (SIBTEST, ...) see http://psychometrictools.measuredprogress.org/home.
Examples
#############################################################################
# EXAMPLE 1: Mathematics data | Gender DIF
#############################################################################
data( data.math )
dat <- data.math$data
items <- grep( "M", colnames(dat))
# estimate item parameters and WLEs
mod <- sirt::rasch.mml2( dat[,items] )
wle <- sirt::wle.rasch( dat[,items], b=mod$item$b )$theta
# assess DIF by logistic regression
mod1 <- sirt::dif.logistic.regression( dat=dat[,items], score=wle, group=dat$female)
# calculate DIF variance
dif1 <- sirt::dif.variance( dif=mod1$uniformDIF, se.dif=mod1$se.uniformDIF )
dif1$unweighted.DIFSD
## > dif1$unweighted.DIFSD
## [1] 0.1963958
# calculate stratified DIF variance
# stratification based on domains
dif2 <- sirt::dif.strata.variance( dif=mod1$uniformDIF, se.dif=mod1$se.uniformDIF,
itemcluster=data.math$item$domain )
## $unweighted.DIFSD
## [1] 0.1455916
## Not run:
#****
# Likelihood ratio test and graphical model test in eRm package
miceadds::library_install("eRm")
# estimate Rasch model
res <- eRm::RM( dat[,items] )
summary(res)
# LR-test with respect to female
lrres <- eRm::LRtest(res, splitcr=dat$female)
summary(lrres)
# graphical model test
eRm::plotGOF(lrres)
#############################################################################
# EXAMPLE 2: Comparison with Mantel-Haenszel test
#############################################################################
library(TAM)
library(difR)
#*** (1) simulate data
set.seed(776)
N <- 1500 # number of persons per group
I <- 12 # number of items
mu2 <- .5 # impact (group difference)
sd2 <- 1.3 # standard deviation group 2
# define item difficulties
b <- seq( -1.5, 1.5, length=I)
# simulate DIF effects
bdif <- scale( stats::rnorm(I, sd=.6 ), scale=FALSE )[,1]
# item difficulties per group
b1 <- b + 1/2 * bdif
b2 <- b - 1/2 * bdif
# simulate item responses
dat1 <- sirt::sim.raschtype( theta=stats::rnorm(N, mean=0, sd=1 ), b=b1 )
dat2 <- sirt::sim.raschtype( theta=stats::rnorm(N, mean=mu2, sd=sd2 ), b=b2 )
dat <- rbind( dat1, dat2 )
group <- rep( c(1,2), each=N ) # define group indicator
#*** (2) scale data
mod <- TAM::tam.mml( dat, group=group )
summary(mod)
#*** (3) extract person parameter estimates
mod_eap <- mod$person$EAP
mod_wle <- tam.wle( mod )$theta
#*********************************
# (4) techniques for assessing differential item functioning
# Model 1: assess DIF by logistic regression and WLEs
dif1 <- sirt::dif.logistic.regression( dat=dat, score=mod_wle, group=group)
# Model 2: assess DIF by logistic regression and EAPs
dif2 <- sirt::dif.logistic.regression( dat=dat, score=mod_eap, group=group)
# Model 3: assess DIF by Mantel-Haenszel statistic
dif3 <- difR::difMH(Data=dat, group=group, focal.name="1", purify=FALSE )
print(dif3)
## Mantel-Haenszel Chi-square statistic:
##
## Stat. P-value
## I0001 14.5655 0.0001 ***
## I0002 300.3225 0.0000 ***
## I0003 2.7160 0.0993 .
## I0004 191.6925 0.0000 ***
## I0005 0.0011 0.9740
## [...]
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## Detection threshold: 3.8415 (significance level: 0.05)
##
## Effect size (ETS Delta scale):
##
## Effect size code:
## 'A': negligible effect
## 'B': moderate effect
## 'C': large effect
##
## alphaMH deltaMH
## I0001 1.3908 -0.7752 A
## I0002 0.2339 3.4147 C
## I0003 1.1407 -0.3093 A
## I0004 2.8515 -2.4625 C
## I0005 1.0050 -0.0118 A
## [...]
##
## Effect size codes: 0 'A' 1.0 'B' 1.5 'C'
## (for absolute values of 'deltaMH')
# recompute DIF parameter from alphaMH
uniformDIF3 <- log(dif3$alphaMH)
# compare different DIF statistics
dfr <- data.frame( "bdif"=bdif, "LR_wle"=dif1$uniformDIF,
"LR_eap"=dif2$uniformDIF, "MH"=uniformDIF3 )
round( dfr, 3 )
## bdif LR_wle LR_eap MH
## 1 0.236 0.319 0.278 0.330
## 2 -1.149 -1.473 -1.523 -1.453
## 3 0.140 0.122 0.038 0.132
## 4 0.957 1.048 0.938 1.048
## [...]
colMeans( abs( dfr[,-1] - bdif ))
## LR_wle LR_eap MH
## 0.07759187 0.19085743 0.07501708
## End(Not run)
Stratified DIF Variance
Description
Calculation of stratified DIF variance
Usage
dif.strata.variance(dif, se.dif, itemcluster)
Arguments
dif |
Vector of uniform DIF effects |
se.dif |
Standard error of uniform DIF effects |
itemcluster |
Vector of item strata |
Value
A list with following entries:
stratadif |
Summary statistics of DIF effects within item strata |
weighted.DIFSD |
Weighted DIF standard deviation |
unweighted.DIFSD |
DIF standard deviation |
References
Longford, N. T., Holland, P. W., & Thayer, D. T. (1993). Stability of the MH D-DIF statistics across populations. In P. W. Holland & H. Wainer (Eds.). Differential Item Functioning (pp. 171-196). Hillsdale, NJ: Erlbaum.
See Also
See dif.logistic.regression for examples.
DIF Variance
Description
This function calculates the variance of DIF effects, the so-called DIF variance (Longford, Holland & Thayer, 1993).
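The computation can be sketched as follows (assumed formulas in the spirit of Longford et al., 1993, not a verbatim reimplementation): the DIF variance is the variance of the DIF effects minus the average squared standard error, and the empirical Bayes estimates shrink each effect toward the mean in proportion to its reliability.
# sketch: DIF variance tau^2 and empirical Bayes shrinkage of DIF effects
dif_variance_sketch <- function(dif, se.dif){
    tau2 <- max( 0, stats::var(dif) - mean(se.dif^2) )  # unweighted DIF variance
    eb.dif <- mean(dif) + ( dif - mean(dif) ) * tau2 / ( tau2 + se.dif^2 )
    list( unweighted.DIFSD=sqrt(tau2), eb.dif=eb.dif )
}
dif_variance_sketch( dif=c(.4,-.3,.1,-.2), se.dif=c(.10,.10,.20,.15) )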
Usage
dif.variance(dif, se.dif, items=paste("item", 1:length(dif), sep="") )
Arguments
dif |
Vector of uniform DIF effects |
se.dif |
Standard error of uniform DIF effects |
items |
Optional vector of item names |
Value
A list with following entries
weighted.DIFSD |
Weighted DIF standard deviation |
unweighted.DIFSD |
DIF standard deviation |
mean.se.dif |
Mean of standard errors of DIF effects |
eb.dif |
Empirical Bayes estimates of DIF effects |
References
Longford, N. T., Holland, P. W., & Thayer, D. T. (1993). Stability of the MH D-DIF statistics across populations. In P. W. Holland & H. Wainer (Eds.). Differential Item Functioning (pp. 171-196). Hillsdale, NJ: Erlbaum.
See Also
See dif.logistic.regression for examples.
Maximum Likelihood Estimation of the Dirichlet Distribution
Description
Maximum likelihood estimation of the parameters of the Dirichlet distribution
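The estimation can be sketched with Minka's (2012) fixed-point iteration, which repeatedly solves psi(alpha_k)=psi(sum(alpha)) + mean(log x_k) via a Newton inversion of the digamma function (an illustrative reimplementation under these assumptions, not the internal code of dirichlet.mle):
# sketch: Minka's fixed-point iteration for the Dirichlet MLE;
# x is a matrix of proportions with strictly positive entries and unit row sums
inv_digamma <- function(y){
    x <- ifelse( y >=-2.22, exp(y) + 0.5, -1/( y - digamma(1) ) )  # starting value
    for (ii in 1:5){ x <- x - ( digamma(x) - y ) / trigamma(x) }   # Newton steps
    x
}
dirichlet_mle_sketch <- function(x, maxit=200, conv=1e-8){
    alpha <- rep( 1, ncol(x) )
    logx <- colMeans( log(x) )
    for (it in 1:maxit){
        alpha.new <- inv_digamma( digamma(sum(alpha)) + logx )
        conv.it <- max( abs( alpha.new - alpha ) )
        alpha <- alpha.new
        if ( conv.it < conv ){ break }
    }
    alpha
}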
Usage
dirichlet.mle(x, weights=NULL, eps=10^(-5), convcrit=1e-05, maxit=1000,
oldfac=.3, progress=FALSE)
Arguments
x |
Data frame with |
weights |
Optional vector of frequency weights |
eps |
Tolerance number which is added to prevent from logarithms of zero |
convcrit |
Convergence criterion |
maxit |
Maximum number of iterations |
oldfac |
Convergence acceleration factor. It must be a parameter between 0 and 1. |
progress |
Display iteration progress? |
Value
A list with following entries
alpha |
Vector of |
alpha0 |
The concentration parameter |
xsi |
Vector of proportions |
References
Minka, T. P. (2012). Estimating a Dirichlet distribution. Technical Report.
See Also
For simulating Dirichlet vectors with matrix-wise \bold{\alpha} parameters see dirichlet.simul.
For a variety of functions concerning the Dirichlet distribution see the DirichletReg package.
Examples
#############################################################################
# EXAMPLE 1: Simulate and estimate Dirichlet distribution
#############################################################################
# (1) simulate data
set.seed(789)
N <- 200
probs <- c(.5, .3, .2 )
alpha0 <- .5
alpha <- alpha0*probs
alpha <- matrix( alpha, nrow=N, ncol=length(alpha), byrow=TRUE )
x <- sirt::dirichlet.simul( alpha )
# (2) estimate Dirichlet parameters
dirichlet.mle(x)
## $alpha
## [1] 0.24507708 0.14470944 0.09590745
## $alpha0
## [1] 0.485694
## $xsi
## [1] 0.5045916 0.2979437 0.1974648
## Not run:
#############################################################################
# EXAMPLE 2: Fitting Dirichlet distribution with frequency weights
#############################################################################
# define observed data
x <- scan( nlines=1)
1 0 0 1 .5 .5
x <- matrix( x, nrow=3, ncol=2, byrow=TRUE)
# transform observations x into (0,1)
eps <- .01
x <- ( x + eps ) / ( 1 + 2 * eps )
# compare results with likelihood fitting package maxLik
miceadds::library_install("maxLik")
# define likelihood function
# note: ddirichlet() is not part of sirt; it is available, e.g.,
# in the gtools package (gtools::ddirichlet)
dirichlet.ll <- function(param) {
    ll <- sum( weights * log( ddirichlet( x, param ) ) )
    ll
}
#*** weights 10-10-1
weights <- c(10, 10, 1 )
mod1a <- sirt::dirichlet.mle( x, weights=weights )
mod1a
# estimation in maxLik
mod1b <- maxLik::maxLik(dirichlet.ll, start=c(.5,.5))
print( mod1b )
coef( mod1b )
#*** weights 10-10-10
weights <- c(10, 10, 10 )
mod2a <- sirt::dirichlet.mle( x, weights=weights )
mod2a
# estimation in maxLik
mod2b <- maxLik::maxLik(dirichlet.ll, start=c(.5,.5))
print( mod2b )
coef( mod2b )
#*** weights 30-10-2
weights <- c(30, 10, 2 )
mod3a <- sirt::dirichlet.mle( x, weights=weights )
mod3a
# estimation in maxLik
mod3b <- maxLik::maxLik(dirichlet.ll, start=c(.25,.25))
print( mod3b )
coef( mod3b )
## End(Not run)
Simulation of Dirichlet Distributed Vectors
Description
This function makes random draws from a Dirichlet distribution.
Usage
dirichlet.simul(alpha)
Arguments
alpha |
A matrix with |
Value
A data frame with Dirichlet distributed responses
Examples
#############################################################################
# EXAMPLE 1: Simulation with two components
#############################################################################
set.seed(789)
N <- 2000
probs <- c(.7, .3) # define (extremal) class probabilities
#*** alpha0=.2 -> nearly crisp latent classes
alpha0 <- .2
alpha <- alpha0*probs
alpha <- matrix( alpha, nrow=N, ncol=length(alpha), byrow=TRUE )
x <- sirt::dirichlet.simul( alpha )
htitle <- expression(paste( alpha[0], "=.2, ", p[1], "=.7" ) )
hist( x[,1], breaks=seq(0,1,len=20), main=htitle)
#*** alpha0=3 -> strong deviation from crisp membership
alpha0 <- 3
alpha <- alpha0*probs
alpha <- matrix( alpha, nrow=N, ncol=length(alpha), byrow=TRUE )
x <- sirt::dirichlet.simul( alpha )
htitle <- expression(paste( alpha[0], "=3, ", p[1], "=.7" ) )
hist( x[,1], breaks=seq(0,1,len=20), main=htitle)
## Not run:
#############################################################################
# EXAMPLE 2: Simulation with three components
#############################################################################
set.seed(986)
N <- 2000
probs <- c( .5, .35, .15 )
#*** alpha0=.2
alpha0 <- .2
alpha <- alpha0*probs
alpha <- matrix( alpha, nrow=N, ncol=length(alpha), byrow=TRUE )
x <- sirt::dirichlet.simul( alpha )
htitle <- expression(paste( alpha[0], "=.2, ", p[1], "=.5" ) )
miceadds::library_install("ade4")
ade4::triangle.plot(x, label=NULL, clabel=1)
#*** alpha0=3
alpha0 <- 3
alpha <- alpha0*probs
alpha <- matrix( alpha, nrow=N, ncol=length(alpha), byrow=TRUE )
x <- sirt::dirichlet.simul( alpha )
htitle <- expression(paste( alpha[0], "=3, ", p[1], "=.5" ) )
ade4::triangle.plot(x, label=NULL, clabel=1)
## End(Not run)
Comparing Regression Parameters of Different lavaan Models Fitted to the Same Dataset
Description
The function dmlavaan compares model parameters from different lavaan models fitted to the same dataset, which leads to dependent coefficients. Statistical inference is conducted either by M-estimation (i.e., the robust sandwich method; method="sandwich") or by bootstrap (method="bootstrap").
See Mize et al. (2019) or Weesie (1999) for more details.
Usage
dmlavaan(fun1, args1, fun2, args2, method="sandwich", R=50)
Arguments
fun1 |
lavaan function of the first model (e.g., |
args1 |
arguments for lavaan function in the first model |
fun2 |
lavaan function of the second model (e.g., |
args2 |
arguments for lavaan function in the second model |
method |
estimation method for standard errors |
R |
Number of bootstrap samples |
Details
In bootstrap estimation, a normal approximation is applied in the computation of confidence intervals. Hence, R can be chosen relatively small; a sketch of this bootstrap logic is given after the list below.
TO DO (not yet implemented):
1) inclusion of sampling weights
2) cluster robust standard errors in hierarchical sampling
3) stratification
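A minimal sketch of the bootstrap logic, assuming two lavaan models as in Example 1 below (the function name and the compared parameter are hypothetical; dmlavaan's internals may differ): both models are refitted on the same resampled rows, so the dependence between their coefficients is preserved.
# sketch: bootstrap a difference of coefficients from two dependent models
library(lavaan)
boot_diff_sketch <- function(dat, model1, model2, R=50){
    stat <- function(d){
        c1 <- coef( lavaan::cfa(model1, data=d, std.lv=TRUE) )
        c2 <- coef( lavaan::cfa(model2, data=d, std.lv=TRUE) )
        as.numeric( c1["F=~X1"] - c2["F=~X1"] )  # difference of a shared loading
    }
    est <- stat(dat)
    boot <- replicate( R, stat( dat[ sample(nrow(dat), replace=TRUE), ] ) )
    # normal approximation: est +/- 1.96 * bootstrap SE
    c( est=est, se=stats::sd(boot) )
}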
Value
A list with following entries
coef |
Model parameters of both models |
vcov |
Covariance matrix of model parameters of both models |
partable |
Parameter table containing all univariate model parameters |
... |
More entries |
References
Mize, T.D., Doan, L., & Long, J.S. (2019). A general framework for comparing predictions and marginal effects across models. Sociological Methodology, 49(1), 152-189. doi:10.1177/0081175019852763
Weesie, J. (1999). Seemingly unrelated estimation and the cluster-adjusted sandwich estimator. Stata Technical Bulletin, 9, 231-248.
Examples
## Not run:
############################################################################
# EXAMPLE 1: Confirmatory factor analysis with and without fourth item
#############################################################################
#**** simulate data
N <- 200 # number of persons
I <- 4 # number of items
# loadings and error correlations
lam <- seq(.7,.4, len=I)
PSI <- diag( 1-lam^2 )
# define some model misspecification
sd_error <- .1
S1 <- matrix( c( -1.84, 0.39,-0.68, 0.13,
0.39,-1.31,-0.07,-0.27,
-0.68,-0.07, 0.90, 1.91,
0.13,-0.27, 1.91,-0.56 ), nrow=4, ncol=4, byrow=TRUE)
S1 <- ( S1 - mean(S1) ) / sd(S1) * sd_error
Sigma <- lam %*% t(lam) + PSI + S1
dat <- MASS::mvrnorm(n=N, mu=rep(0,I), Sigma=Sigma)
colnames(dat) <- paste0("X",1:4)
dat <- as.data.frame(dat)
rownames(Sigma) <- colnames(Sigma) <- colnames(dat)
#*** define two lavaan models
lavmodel1 <- "F=~ X1 + X2 + X3 + X4"
lavmodel2 <- "F=~ X1 + X2 + X3"
#*** define lavaan estimation arguments and functions
fun2 <- fun1 <- "cfa"
args1 <- list( model=lavmodel1, data=dat, std.lv=TRUE, estimator="MLR")
args2 <- args1
args2$model <- lavmodel2
#* run model comparison
res1 <- sirt::dmlavaan( fun1=fun1, args1=args1, fun2=fun2, args2=args2)
# inspect results
sirt:::print_digits(res1$partable, digits=3)
## End(Not run)
Computation of Eigenvalues of Many Symmetric Matrices
Description
This function computes the eigenvalue decomposition of N symmetric positive definite matrices. The eigenvalues are computed by the Rayleigh quotient method (Lange, 2010, p. 120). In addition, the inverse matrices can be calculated.
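The idea of the Rayleigh quotient method can be sketched for a single symmetric matrix as follows (an illustration of the principle, not the exact internal algorithm):
# sketch: power iteration with Rayleigh quotient for the largest eigenpair
rayleigh_eigen_sketch <- function(A, itermax=100, conv=1e-8){
    x <- rep( 1, ncol(A) ); x <- x / sqrt(sum(x^2))
    lambda <- as.numeric( t(x) %*% A %*% x )        # Rayleigh quotient
    for (it in 1:itermax){
        x <- as.numeric( A %*% x ); x <- x / sqrt(sum(x^2))
        lambda.new <- as.numeric( t(x) %*% A %*% x )
        if ( abs(lambda.new - lambda) < conv ){ lambda <- lambda.new ; break }
        lambda <- lambda.new
    }
    list( value=lambda, vector=x )
}
A <- matrix( c(1,.4,.6, .4,1,.8, .6,.8,1), 3, 3 )
rayleigh_eigen_sketch(A)$value   # compare with eigen(A)$values[1]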
Usage
eigenvalues.manymatrices(Sigma.all, itermax=10, maxconv=0.001,
inverse=FALSE )
Arguments
Sigma.all |
An |
itermax |
Maximum number of iterations |
maxconv |
Convergence criterion for convergence of eigenvectors |
inverse |
A logical which indicates if the inverse matrix shall be calculated |
Value
A list with following entries
lambda |
Matrix with eigenvalues |
U |
An |
logdet |
Vector of logarithm of determinants |
det |
Vector of determinants |
Sigma.inv |
Inverse matrix if |
References
Lange, K. (2010). Numerical Analysis for Statisticians. New York: Springer.
Examples
# define matrices
Sigma <- diag(1,3)
Sigma[ lower.tri(Sigma) ] <- Sigma[ upper.tri(Sigma) ] <- c(.4,.6,.8 )
Sigma1 <- Sigma
Sigma <- diag(1,3)
Sigma[ lower.tri(Sigma) ] <- Sigma[ upper.tri(Sigma) ] <- c(.2,.1,.99 )
Sigma2 <- Sigma
# collect matrices in a "super-matrix"
Sigma.all <- rbind( matrix( Sigma1, nrow=1, byrow=TRUE),
matrix( Sigma2, nrow=1, byrow=TRUE) )
Sigma.all <- Sigma.all[ c(1,1,2,2,1 ), ]
# eigenvalue decomposition
m1 <- sirt::eigenvalues.manymatrices( Sigma.all )
m1
# eigenvalue decomposition for Sigma1
s1 <- svd(Sigma1)
s1
Equating in the Generalized Logistic Rasch Model
Description
This function performs linking in the generalized logistic item response model. Only item difficulties (b item parameters) are used. Mean-mean linking and the methods of Haebara and Stocking-Lord are implemented (Kolen & Brennan, 2004).
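What the methods optimize can be sketched for Rasch item difficulties b1 (study 1) and b2 (study 2) as follows (a hedged illustration with an assumed sign convention, b2 + B approximately equal to b1; the function itself returns all constants in B.est): mean-mean takes the average difficulty difference, while Haebara minimizes squared differences between item characteristic curves over a theta grid.
# sketch: mean-mean constant and Haebara criterion for Rasch linking
link_sketch <- function(b1, b2, theta=seq(-4,4,len=100)){
    B.mm <- mean(b1) - mean(b2)        # mean-mean linking constant
    haebara_loss <- function(B){
        sum( sapply( seq_along(b1), function(ii)
                sum( ( stats::plogis( theta - b1[ii] ) -
                            stats::plogis( theta - ( b2[ii] + B ) ) )^2 ) ) )
    }
    B.haeb <- stats::optimize( haebara_loss, interval=c(-3,3) )$minimum
    c( Mean.Mean=B.mm, Haebara=B.haeb )
}
link_sketch( b1=c(-.8,.1,1.2), b2=c(-1.0,-.1,1.1) )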
Usage
equating.rasch(x, y, theta=seq(-4, 4, len=100),
alpha1=0, alpha2=0)
Arguments
x |
Matrix with two columns: First column items, second column item difficulties |
y |
Matrix with two columns: First columns item, second column item difficulties |
theta |
Vector of theta values at which the linking functions
should be evaluated. If a weighting according to a prespecified normal
distribution |
alpha1 |
Fixed |
alpha2 |
Fixed |
Value
B.est |
Estimated linking constants according to the methods
|
descriptives |
Descriptives of the linking. The linking error
( |
anchor |
Original and transformed item parameters of anchor items |
transf.par |
Original and transformed item parameters of all items |
References
Kolen, M. J., & Brennan, R. L. (2004). Test Equating, Scaling, and Linking: Methods and Practices. New York: Springer.
See Also
For estimating standard errors (due to inference with respect to the item domain) of this procedure see equating.rasch.jackknife.
For linking several studies see linking.haberman or invariance.alignment.
A robust alternative to mean-mean linking is implemented in linking.robust.
For linking under more general item response models see the plink package.
Examples
#############################################################################
# EXAMPLE 1: Linking item parameters of the PISA study
#############################################################################
data(data.pisaPars)
pars <- data.pisaPars
# linking the two studies with the Rasch model
mod <- sirt::equating.rasch(x=pars[,c("item","study1")], y=pars[,c("item","study2")])
## Mean.Mean Haebara Stocking.Lord
## 1 0.08828 0.08896269 0.09292838
## Not run:
#*** linking using the plink package
# The plink package is not available on CRAN anymore.
# You can download the package with
# utils::install.packages("plink", repos="http://www2.uaem.mx/r-mirror")
library(plink)
I <- nrow(pars)
pm <- plink::as.poly.mod(I)
# linking parameters
plink.pars1 <- list( "study1"=data.frame( 1, pars$study1, 0 ),
"study2"=data.frame( 1, pars$study2, 0 ) )
# the parameters are arranged in the columns:
# Discrimination, Difficulty, Guessing Parameter
# common items
common.items <- cbind("study1"=1:I,"study2"=1:I)
# number of categories per item
cats.item <- list( "study1"=rep(2,I), "study2"=rep(2,I))
# convert into plink object
x <- plink::as.irt.pars( plink.pars1, common.items, cat=cats.item,
poly.mod=list(pm,pm))
# linking using plink: first group is reference group
out <- plink::plink(x, rescale="MS", base.grp=1, D=1.7)
# summary for linking
summary(out)
## ------- group2/group1* -------
## Linking Constants
##
## A B
## Mean/Mean 1.000000 -0.088280
## Mean/Sigma 1.000000 -0.088280
## Haebara 1.000000 -0.088515
## Stocking-Lord 1.000000 -0.096610
# extract linked parameters
pars.out <- plink::link.pars(out)
## End(Not run)
Jackknife Equating Error in Generalized Logistic Rasch Model
Description
This function estimates the linking error of a linking procedure by means of the jackknife (Monseur & Berezner, 2007).
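The principle can be sketched as follows (one common form of the jackknife variance, assuming a mean-mean shift; the function's internals may differ): recompute the shift with one unit removed at a time and aggregate the squared deviations.
# sketch: jackknife linking error for a mean-mean shift
linkerror_jackknife_sketch <- function(b1, b2, unit){
    shift <- mean( b1 - b2 )                 # shift based on all items
    units <- unique(unit)
    G <- length(units)
    shifts.g <- sapply( units, function(uu) mean( (b1 - b2)[ unit !=uu ] ) )
    sqrt( (G-1)/G * sum( (shifts.g - shift)^2 ) )
}
# usage with made-up difficulties for six items in three testlets
linkerror_jackknife_sketch( b1=c(-1,.2,.5,-.3,.9,.1), b2=c(-1.2,.1,.3,-.2,.8,0),
        unit=c(1,1,2,2,3,3) )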
Usage
equating.rasch.jackknife(pars.data, display=TRUE,
se.linkerror=FALSE, alpha1=0, alpha2=0)
Arguments
pars.data |
Data frame with four columns: jackknife unit (1st column), item parameter study 1 (2nd column), item parameter study 2 (3rd column), item (4th column) |
display |
Display progress? |
se.linkerror |
Compute standard error of the linking error |
alpha1 |
Fixed |
alpha2 |
Fixed |
Value
A list with following entries:
pars.data |
Used item parameters |
itemunits |
Used units for jackknife |
descriptives |
Descriptives for Jackknife.
|
References
Monseur, C., & Berezner, A. (2007). The computation of equating errors in international surveys in education. Journal of Applied Measurement, 8, 323-335.
See Also
For more details on linking methods see equating.rasch.
Examples
#############################################################################
# EXAMPLE 1: Linking errors PISA study
#############################################################################
data(data.pisaPars)
pars <- data.pisaPars
# Linking error: Jackknife unit is the testlet
vars <- c("testlet","study1","study2","item")
res1 <- sirt::equating.rasch.jackknife(pars[, vars])
res1$descriptives
## N.items N.units shift SD linkerror.jackknife SE.SD.jackknife
## 1 25 8 0.09292838 0.1487387 0.04491197 0.03466309
# Linking error: Jackknife unit is the item
res2 <- sirt::equating.rasch.jackknife(pars[, c("item","study1","study2","item") ])
res2$descriptives
## N.items N.units shift SD linkerror.jackknife SE.SD.jackknife
## 1 25 25 0.09292838 0.1487387 0.02682839 0.02533327
Exploratory DETECT Analysis
Description
This function estimates the DETECT index (Stout, Habing, Douglas & Kim, 1996; Zhang & Stout, 1999a, 1999b) in an exploratory way. Conditional covariances of item pairs are transformed into a distance matrix such that items are clustered by the hierarchical Ward algorithm (Roussos, Stout & Marden, 1998). Note that the function will not provide the same output as the original DETECT software.
Usage
expl.detect(data, score, nclusters, N.est=NULL, seed=NULL, bwscale=1.1,
smooth=TRUE, use_sum_score=FALSE, hclust_method="ward.D", estsample=NULL)
Arguments
data |
An |
score |
An ability estimate, e.g. the WLE, sum score or mean score |
nclusters |
Maximum number of clusters used in the exploratory analysis |
N.est |
Number of students in a (possible) validation of the DETECT index.
|
seed |
Random seed |
bwscale |
Bandwidth scale factor |
smooth |
Logical indicating whether smoothing should be applied for conditional covariance estimation |
use_sum_score |
Logical indicating whether sum score should be used. With this option, the bias corrected conditional covariance of Zhang and Stout (1999) is used. |
hclust_method |
Clustering method used as the argument
|
estsample |
Optional vector of subject indices that defines the estimation sample |
Value
A list with following entries
detect.unweighted |
Unweighted DETECT statistics |
detect.weighted |
Weighted DETECT statistics. Weighting is done proportionally to sample sizes of item pairs. |
clusterfit |
Fit of the cluster method |
itemcluster |
Cluster allocations |
References
Roussos, L. A., Stout, W. F., & Marden, J. I. (1998). Using new proximity measures with hierarchical cluster analysis to detect multidimensionality. Journal of Educational Measurement, 35, 1-30.
Stout, W., Habing, B., Douglas, J., & Kim, H. R. (1996). Conditional covariance-based nonparametric multidimensionality assessment. Applied Psychological Measurement, 20, 331-354.
Zhang, J., & Stout, W. (1999a). Conditional covariance structure of generalized compensatory multidimensional items, Psychometrika, 64, 129-152.
Zhang, J., & Stout, W. (1999b). The theoretical DETECT index of dimensionality and its application to approximate simple structure, Psychometrika, 64, 213-249.
See Also
For examples see conf.detect.
Functional Unidimensional Item Response Model
Description
Estimates the functional unidimensional item response model for dichotomous data (Ip, Molenberghs, Chen, Goegebeur & De Boeck, 2013). Either the IRT model is estimated from the data using a probit link (employing tetrachoric correlations), or item discriminations and intercepts of a pre-estimated multidimensional IRT model are provided as input.
Usage
f1d.irt(dat=NULL, nnormal=1000, nfactors=3, A=NULL, intercept=NULL,
mu=NULL, Sigma=NULL, maxiter=100, conv=10^(-5), progress=TRUE)
Arguments
dat |
Data frame with dichotomous item responses |
nnormal |
Number of |
nfactors |
Number of dimensions to be estimated |
A |
Matrix of item discriminations (if the IRT model is already estimated) |
intercept |
Vector of item intercepts (if the IRT model is already estimated) |
mu |
Vector of estimated means. In the default it is assumed that all means are zero. |
Sigma |
Estimated covariance matrix. In the default it is the identity matrix. |
maxiter |
Maximum number of iterations |
conv |
Convergence criterion |
progress |
Display progress? The default is |
Details
The functional unidimensional item response model (F1D model) for dichotomous item responses is based on a multidimensional model with a link function g (probit or logit):
P( X_{pi}=1 | \bold{\theta}_p )=g( \sum_d a_{id} \theta_{pd} - d_i )
It is assumed that \bold{\theta}_p is multivariate normally distributed with a zero mean vector and an identity covariance matrix. The F1D model estimates unidimensional item response functions such that
P( X_{pi}=1 | \theta_p^\ast ) \approx g( a_{i}^\ast \theta_{p}^\ast - d_i^\ast )
The optimization function F minimizes the deviations of the approximation equations
a_{i}^\ast \theta_{p}^\ast - d_i^\ast \approx \sum_d a_{id} \theta_{pd} - d_i
The optimization function F is defined by
F( \{ a_i^\ast, d_i^\ast \}_i, \{ \theta_p^\ast \}_p )=\sum_p \sum_i w_p ( \sum_d a_{id} \theta_{pd} - d_i - a_{i}^\ast \theta_{p}^\ast + d_i^\ast )^2 \rightarrow Min!
All items i are weighted equally, whereas the ability distribution of persons p is weighted according to the multivariate normal distribution (using weights w_p). The estimation is conducted using an alternating least squares algorithm (see Ip et al., 2013, for a different algorithm). The ability distribution \theta_p^\ast of the functional unidimensional model is assumed to be standardized, i.e., it has a mean of zero and a standard deviation of one.
Value
A list with following entries:
item |
Data frame with estimated item parameters: Item intercepts
for the functional unidimensional |
person |
Data frame with estimated |
A |
Estimated or provided item discriminations |
intercept |
Estimated or provided intercepts |
dat |
Used dataset |
tetra |
Object generated by |
References
Ip, E. H., Molenberghs, G., Chen, S. H., Goegebeur, Y., & De Boeck, P. (2013). Functionally unidimensional item response models for multivariate binary data. Multivariate Behavioral Research, 48, 534-562.
See Also
For estimation of bifactor models and Green-Yang reliability based on tetrachoric correlations see greenyang.reliability.
For estimation of bifactor models based on marginal maximum likelihood (i.e., full information maximum likelihood) see the TAM::tam.fa function in the TAM package.
Examples
#############################################################################
# EXAMPLE 1: Dataset Mathematics data.math | Exploratory multidimensional model
#############################################################################
data(data.math)
dat <- ( data.math$data )[, -c(1,2) ] # select Mathematics items
#****
# Model 1: Functional unidimensional model based on original data
#++ (1) estimate model with 3 factors
mod1 <- sirt::f1d.irt( dat=dat, nfactors=3)
#++ (2) plot results
par(mfrow=c(1,2))
# Intercepts
plot( mod1$item$di0, mod1$item$di.ast, pch=16, main="Item Intercepts",
xlab=expression( paste( d[i], " (Unidimensional Model)" )),
ylab=expression( paste( d[i], " (Functional Unidimensional Model)" )))
abline( lm(mod1$item$di.ast ~ mod1$item$di0), col=2, lty=2 )
# Discriminations
plot( mod1$item$ai0, mod1$item$ai.ast, pch=16, main="Item Discriminations",
xlab=expression( paste( a[i], " (Unidimensional Model)" )),
ylab=expression( paste( a[i], " (Functional Unidimensional Model)" )))
abline( lm(mod1$item$ai.ast ~ mod1$item$ai0), col=2, lty=2 )
par(mfrow=c(1,1))
#++ (3) estimate bifactor model and Green-Yang reliability
gy1 <- sirt::greenyang.reliability( mod1$tetra, nfactors=3 )
## Not run:
#****
# Model 2: Functional unidimensional model based on estimated multidimensional
# item response model
#++ (1) estimate 2-dimensional exploratory factor analysis with 'smirt'
I <- ncol(dat)
Q <- matrix( 1, I,2 )
Q[1,2] <- 0
variance.fixed <- cbind( 1,2,0 )
mod2a <- sirt::smirt( dat, Qmatrix=Q, irtmodel="comp", est.a="2PL",
variance.fixed=variance.fixed, maxiter=50)
#++ (2) input estimated discriminations and intercepts for
# functional unidimensional model
mod2b <- sirt::f1d.irt( A=mod2a$a, intercept=mod2a$b )
#############################################################################
# EXAMPLE 2: Dataset Mathematics data.math | Confirmatory multidimensional model
#############################################################################
data(data.math)
library(TAM)
# dataset
dat <- data.math$data
dat <- dat[, grep("M", colnames(dat) ) ]
# extract item informations
iteminfo <- data.math$item
I <- ncol(dat)
# define Q-matrix
Q <- matrix( 0, nrow=I, ncol=3 )
Q[ grep( "arith", iteminfo$domain ), 1 ] <- 1
Q[ grep( "Meas", iteminfo$domain ), 2 ] <- 1
Q[ grep( "geom", iteminfo$domain ), 3 ] <- 1
# fit three-dimensional model in TAM
mod1 <- TAM::tam.mml.2pl( dat, Q=Q, control=list(maxiter=40, snodes=1000) )
summary(mod1)
# specify functional unidimensional model
intercept <- mod1$xsi[, c("xsi") ]
names(intercept) <- rownames(mod1$xsi)
fumod1 <- sirt::f1d.irt( A=mod1$B[,2,], intercept=intercept, Sigma=mod1$variance)
fumod1$item
## End(Not run)
Fitting the ISOP and ADISOP Model for Frequency Tables
Description
Fit the isotonic probabilistic model (ISOP; Scheiblechner, 1995) and the additive isotonic probabilistic model (ADISOP; Scheiblechner, 1999).
Usage
fit.isop(freq.correct, wgt, conv=1e-04, maxit=100,
progress=TRUE, calc.ll=TRUE)
fit.adisop(freq.correct, wgt, conv=1e-04, maxit=100,
epsilon=0.01, progress=TRUE, calc.ll=TRUE)
Arguments
freq.correct |
Frequency table |
wgt |
Weights for frequency table (number of persons in each cell) |
conv |
Convergence criterion |
maxit |
Maximum number of iterations |
epsilon |
Additive constant to handle cell frequencies
of 0 or 1 in |
progress |
Display progress? |
calc.ll |
Calculate log-likelihood values?
The default is |
Details
See isop.dich for more details on the ISOP and ADISOP models.
Value
A list with following entries
fX |
Fitted frequency table |
ResX |
Residual frequency table |
fit |
Fit statistic: weighted least squares of deviations between observed and expected frequencies |
item.sc |
Estimated item parameters |
person.sc |
Estimated person parameters |
ll |
Log-likelihood of the model |
freq.fitted |
Fitted frequencies in a long data frame |
Note
For fitting the ADISOP model it is recommended to first fit the ISOP model and then proceed with the fitted frequency table from ISOP (see Examples).
References
Scheiblechner, H. (1995). Isotonic ordinal probabilistic models (ISOP). Psychometrika, 60, 281-304.
Scheiblechner, H. (1999). Additive conjoint isotonic probabilistic models (ADISOP). Psychometrika, 64, 295-316.
See Also
For fitting the ISOP model to dichotomous and polytomous data see isop.dich.
Examples
#############################################################################
# EXAMPLE 1: Dataset Reading
#############################################################################
data(data.read)
dat <- as.matrix( data.read)
dat.resp <- 1 - is.na(dat) # response indicator matrix
I <- ncol(dat)
#***
# (1) Data preparation
# actually only freq.correct and wgt are needed
# but these matrices must be computed in advance.
# different scores of students
stud.p <- rowMeans( dat, na.rm=TRUE )
# different item p values
item.p <- colMeans( dat, na.rm=TRUE )
item.ps <- sort( item.p, index.return=TRUE)
dat <- dat[, item.ps$ix ]
# define score groups students
scores <- sort( unique( stud.p ) )
SC <- length(scores)
# create table
freq.correct <- matrix( NA, SC, I )
wgt <- freq.correct
# percent correct
a1 <- stats::aggregate( dat==1, list( stud.p ), mean, na.rm=TRUE )
freq.correct <- a1[,-1]
# weights
a1 <- stats::aggregate( dat.resp, list( stud.p ), sum, na.rm=TRUE )
wgt <- a1[,-1]
#***
# (2) Fit ISOP model
res.isop <- sirt::fit.isop( freq.correct, wgt )
# fitted frequency table
res.isop$fX
#***
# (3) Fit ADISOP model
# use monotonely smoothed frequency table from ISOP model
res.adisop <- sirt::fit.adisop( freq.correct=res.isop$fX, wgt )
# fitted frequency table
res.adisop$fX
Clustering for Continuous Fuzzy Data
Description
This function performs clustering for continuous fuzzy data for which the membership functions are assumed to be Gaussian (Denoeux, 2013). The mixture is also assumed to be Gaussian and (conditionally on cluster membership) independent.
Usage
fuzcluster(dat_m, dat_s, K=2, nstarts=7, seed=NULL, maxiter=100,
parmconv=0.001, fac.oldxsi=0.75, progress=TRUE)
## S3 method for class 'fuzcluster'
summary(object,...)
Arguments
dat_m |
Centers for individual item specific membership functions |
dat_s |
Standard deviations for individual item specific membership functions |
K |
Number of latent classes |
nstarts |
Number of random starts. The default is 7 random starts. |
seed |
Simulation seed. If one value is provided, then only one start is performed. |
maxiter |
Maximum number of iterations |
parmconv |
Maximum absolute change in parameters |
fac.oldxsi |
Convergence acceleration factor which should take values between 0 and 1. The default is 0.75. |
progress |
An optional logical indicating whether iteration progress should be displayed. |
object |
Object of class |
... |
Further arguments to be passed |
Value
A list with following entries
deviance |
Deviance |
iter |
Number of iterations |
pi_est |
Estimated class probabilities |
mu_est |
Cluster means |
sd_est |
Cluster standard deviations |
posterior |
Individual posterior distributions of cluster membership |
seed |
Simulation seed for cluster solution |
ic |
Information criteria |
References
Denoeux, T. (2013). Maximum likelihood estimation from uncertain data in the belief function framework. IEEE Transactions on Knowledge and Data Engineering, 25, 119-130.
See Also
See fuzdiscr for estimating discrete distributions for fuzzy data.
See the fclust package for fuzzy clustering.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: 2 classes and 3 items
#############################################################################
#*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
# simulate data (2 classes and 3 items)
set.seed(876)
library(mvtnorm)
Ntot <- 1000 # number of subjects
# define SDs for simulating uncertainty
sd_uncertain <- c( .2, 1, 2 )
dat_m <- NULL # data frame containing mean of membership function
dat_s <- NULL # data frame containing SD of membership function
# *** Class 1
pi_class <- .6
Nclass <- Ntot * pi_class
mu <- c(3,1,0)
Sigma <- diag(3)
# simulate data
dat_m1 <- mvtnorm::rmvnorm( Nclass, mean=mu, sigma=Sigma )
dat_s1 <- matrix( stats::runif( Nclass * 3 ), nrow=Nclass )
for ( ii in 1:3){ dat_s1[,ii] <- dat_s1[,ii] * sd_uncertain[ii] }
dat_m <- rbind( dat_m, dat_m1 )
dat_s <- rbind( dat_s, dat_s1 )
# *** Class 2
pi_class <- .4
Nclass <- Ntot * pi_class
mu <- c(0,-2,0.4)
Sigma <- diag(c(0.5, 2, 2 ) )
# simulate data
dat_m1 <- mvtnorm::rmvnorm( Nclass, mean=mu, sigma=Sigma )
dat_s1 <- matrix( stats::runif( Nclass * 3 ), nrow=Nclass )
for ( ii in 1:3){ dat_s1[,ii] <- dat_s1[,ii] * sd_uncertain[ii] }
dat_m <- rbind( dat_m, dat_m1 )
dat_s <- rbind( dat_s, dat_s1 )
colnames(dat_s) <- colnames(dat_m) <- paste0("I", 1:3 )
#*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-
# estimation
#*** Model 1: Clustering with 8 random starts
res1 <- sirt::fuzcluster(K=2,dat_m, dat_s, nstarts=8, maxiter=25)
summary(res1)
## Number of iterations=22 (Seed=5090 )
## ---------------------------------------------------
## Class probabilities (2 Classes)
## [1] 0.4083 0.5917
##
## Means
## I1 I2 I3
## [1,] 0.0595 -1.9070 0.4011
## [2,] 3.0682 1.0233 0.0359
##
## Standard deviations
## [,1] [,2] [,3]
## [1,] 0.7238 1.3712 1.2647
## [2,] 0.9740 0.8500 0.7523
#*** Model 2: Clustering with one start with seed 4550
res2 <- sirt::fuzcluster(K=2,dat_m, dat_s, nstarts=1, seed=5090 )
summary(res2)
#*** Model 3: Clustering for crisp data
# (assuming no uncertainty, i.e. dat_s=0)
res3 <- sirt::fuzcluster(K=2,dat_m, dat_s=0*dat_s, nstarts=30, maxiter=25)
summary(res3)
## Class probabilities (2 Classes)
## [1] 0.3645 0.6355
##
## Means
## I1 I2 I3
## [1,] 0.0463 -1.9221 0.4481
## [2,] 3.0527 1.0241 -0.0008
##
## Standard deviations
## [,1] [,2] [,3]
## [1,] 0.7261 1.4541 1.4586
## [2,] 0.9933 0.9592 0.9535
#*** Model 4: kmeans cluster analysis
res4 <- stats::kmeans( dat_m, centers=2 )
## K-means clustering with 2 clusters of sizes 607, 393
## Cluster means:
## I1 I2 I3
## 1 3.01550780 1.035848 -0.01662275
## 2 0.03448309 -2.008209 0.48295067
## End(Not run)
Estimation of a Discrete Distribution for Fuzzy Data (Data in Belief Function Framework)
Description
This function estimates a discrete distribution for uncertain data based on the belief function framework (Denoeux, 2013; see Details).
Usage
fuzdiscr(X, theta0=NULL, maxiter=200, conv=1e-04)
Arguments
X |
Matrix with fuzzy data. Rows corresponds to subjects and columns to values of the membership function |
theta0 |
Initial vector of parameter estimates |
maxiter |
Maximum number of iterations |
conv |
Convergence criterion |
Details
For n subjects, membership functions m_n(k) are observed which indicate the belief in data X_n=k. The membership function is interpreted as epistemic uncertainty (Denoeux, 2011). However, the associated parameters in statistical models are crisp, which means that the models are formulated on the basis of the precise (crisp) data that would be observed. In the present estimation problem of a discrete distribution, the parameters of interest are the category probabilities \theta_k=P(X=k). The parameter estimation follows the evidential EM algorithm (Denoeux, 2013).
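For this problem the evidential EM update has a simple closed form (a sketch assuming the description above; at its fixed point it reproduces the estimates shown in Example 1): the E step weights each subject's categories by m_n(k) * theta_k, and the M step averages these weights.
# sketch: evidential EM for a discrete distribution with fuzzy data X
fuzdiscr_sketch <- function(X, maxiter=200, conv=1e-6){
    theta <- rep( 1/ncol(X), ncol(X) )
    for (it in 1:maxiter){
        W <- X * matrix( theta, nrow(X), ncol(X), byrow=TRUE )   # E step
        W <- W / rowSums(W)
        theta.new <- colMeans(W)                                 # M step
        if ( max(abs(theta.new - theta)) < conv ){ theta <- theta.new ; break }
        theta <- theta.new
    }
    theta
}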
Value
Vector of probabilities of the discrete distribution
References
Denoeux, T. (2011). Maximum likelihood estimation from fuzzy data using the EM algorithm. Fuzzy Sets and Systems, 183, 72-91.
Denoeux, T. (2013). Maximum likelihood estimation from uncertain data in the belief function framework. IEEE Transactions on Knowledge and Data Engineering, 25, 119-130.
Examples
#############################################################################
# EXAMPLE 1: Binomial distribution Denoeux Example 4.3 (2013)
#############################################################################
#*** define uncertain data
X_alpha <- function( alpha ){
Q <- matrix( 0, 6, 2 )
Q[5:6,2] <- Q[1:3,1] <- 1
Q[4,] <- c( alpha, 1 - alpha )
return(Q)
}
# define data for alpha=0.5
X <- X_alpha( alpha=.5 )
## > X
## [,1] [,2]
## [1,] 1.0 0.0
## [2,] 1.0 0.0
## [3,] 1.0 0.0
## [4,] 0.5 0.5
## [5,] 0.0 1.0
## [6,] 0.0 1.0
## The fourth observation has equal plausibility for the first and the
## second category.
# parameter estimate uncertain data
fuzdiscr( X )
## > sirt::fuzdiscr( X )
## [1] 0.5999871 0.4000129
# parameter estimate pseudo likelihood
colMeans( X )
## > colMeans( X )
## [1] 0.5833333 0.4166667
##-> Observations are weighted according to belief function values.
#*****
# plot parameter estimates as function of alpha
alpha <- seq( 0, 1, len=100 )
res <- sapply( alpha, FUN=function(aa){
X <- X_alpha( alpha=aa )
c( sirt::fuzdiscr( X )[1], colMeans( X )[1] )
} )
# plot
plot( alpha, res[1,], xlab=expression(alpha), ylab=expression( theta[alpha] ), type="l",
main="Comparison Belief Function and Pseudo-Likelihood (Example 1)")
lines( alpha, res[2,], lty=2, col=2)
legend( 0, .67, c("Belief Function", "Pseudo-Likelihood" ), col=c(1,2), lty=c(1,2) )
#############################################################################
# EXAMPLE 2: Binomial distribution (extends Example 1)
#############################################################################
X_alpha <- function( alpha ){
Q <- matrix( 0, 6, 2 )
Q[6,2] <- Q[1:2,1] <- 1
Q[3:5,] <- matrix( c( alpha, 1 - alpha ), 3, 2, byrow=TRUE)
return(Q)
}
X <- X_alpha( alpha=.5 )
alpha <- seq( 0, 1, len=100 )
res <- sapply( alpha, FUN=function(aa){
X <- X_alpha( alpha=aa )
c( sirt::fuzdiscr( X )[1], colMeans( X )[1] )
} )
# plot
plot( alpha, res[1,], xlab=expression(alpha), ylab=expression( theta[alpha] ), type="l",
main="Comparison Belief Function and Pseudo-Likelihood (Example 2)")
lines( alpha, res[2,], lty=2, col=2)
legend( 0, .67, c("Belief Function", "Pseudo-Likelihood" ), col=c(1,2), lty=c(1,2) )
#############################################################################
# EXAMPLE 3: Multinomial distribution with three categories
#############################################################################
# define uncertain data
X <- matrix( c( 1,0,0, 1,0,0, 0,1,0, 0,0,1, .7, .2, .1,
.4, .6, 0 ), 6, 3, byrow=TRUE )
## > X
## [,1] [,2] [,3]
## [1,] 1.0 0.0 0.0
## [2,] 1.0 0.0 0.0
## [3,] 0.0 1.0 0.0
## [4,] 0.0 0.0 1.0
## [5,] 0.7 0.2 0.1
## [6,] 0.4 0.6 0.0
##-> Only the first four observations are crisp.
#*** estimation for uncertain data
fuzdiscr( X )
## > sirt::fuzdiscr( X )
## [1] 0.5772305 0.2499931 0.1727764
#*** estimation pseudo-likelihood
colMeans(X)
## > colMeans(X)
## [1] 0.5166667 0.3000000 0.1833333
##-> Obviously, the treatment uncertainty is different in belief function
## and in pseudo-likelihood framework.
Discrete (Rasch) Grade of Membership Model
Description
This function estimates the grade of membership model (Erosheva, Fienberg & Joutard, 2007; also called mixed membership model) by the EM algorithm assuming a discrete membership score distribution. The function is restricted to dichotomous item responses.
Usage
gom.em(dat, K=NULL, problevels=NULL, weights=NULL, model="GOM", theta0.k=seq(-5,5,len=15),
xsi0.k=exp(seq(-6, 3, len=15)), max.increment=0.3, numdiff.parm=1e-4,
maxdevchange=1e-6, globconv=1e-4, maxiter=1000, msteps=4, mstepconv=0.001,
theta_adjust=FALSE, lambda.inits=NULL, lambda.index=NULL, pi.k.inits=NULL,
newton_raphson=TRUE, optimizer="nlminb", progress=TRUE)
## S3 method for class 'gom'
summary(object, file=NULL, ...)
## S3 method for class 'gom'
anova(object,...)
## S3 method for class 'gom'
logLik(object,...)
## S3 method for class 'gom'
IRT.irfprob(object,...)
## S3 method for class 'gom'
IRT.likelihood(object,...)
## S3 method for class 'gom'
IRT.posterior(object,...)
## S3 method for class 'gom'
IRT.modelfit(object,...)
## S3 method for class 'IRT.modelfit.gom'
summary(object,...)
Arguments
dat |
Data frame with dichotomous responses |
K |
Number of classes (only applies for |
problevels |
Vector containing probability levels for membership functions
(only applies for |
weights |
Optional vector of sampling weights |
model |
The type of grade of membership model. The default |
theta0.k |
Vector of |
xsi0.k |
Vector of |
max.increment |
Maximum increment |
numdiff.parm |
Numerical differentiation parameter |
maxdevchange |
Convergence criterion for change in relative deviance |
globconv |
Global convergence criterion for parameter change |
maxiter |
Maximum number of iterations |
msteps |
Number of iterations within a M step |
mstepconv |
Convergence criterion within a M step |
theta_adjust |
Logical indicating whether multivariate normal distribution should be adaptively chosen during the EM algorithm. |
lambda.inits |
Initial values for item parameters |
lambda.index |
Optional integer matrix with integers indicating
equality constraints among |
pi.k.inits |
Initial values for distribution parameters |
newton_raphson |
Logical indicating whether Newton-Raphson should be used for final iterations |
optimizer |
Type of optimizer. Can be |
progress |
Display iteration progress? Default is |
object |
Object of class |
file |
Optional file name for summary output |
... |
Further arguments to be passed |
Details
The item response model of the grade of membership model (Erosheva, Fienberg & Junker, 2002; Erosheva, Fienberg & Joutard, 2007) with K classes for dichotomous responses X_{pi} of person p on item i is as follows (model="GOM"):
P(X_{pi}=1 | g_{p1}, \ldots, g_{pK} )=\sum_k \lambda_{ik} g_{pk} \quad, \quad \sum_{k=1}^K g_{pk}=1 \quad, \quad 0 \leq g_{pk} \leq 1
In most applications (e.g., Erosheva et al., 2007), the grade of membership function \{g_{pk}\} is assumed to follow a Dirichlet distribution. In the gom.em implementation, the membership function is assumed to be discretely represented by a grid u=(u_1, \ldots, u_L) with entries between 0 and 1 (e.g., seq(0,1,length=5) with L=5). The values g_{pk} of the membership function can then only take values in \{ u_1, \ldots, u_L \} with the restriction \sum_k g_{pk} \sum_l \bold{1}(g_{pk}=u_l )=1. The grid u is specified by using the argument problevels.
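Which membership vectors the grid induces can be checked with a small self-contained sketch (illustration only, not the internal construction):
# sketch: enumerate admissible discrete membership vectors for K classes
problevels <- seq( 0, 1, length=5 )
K <- 3
grid <- as.matrix( expand.grid( rep( list(problevels), K ) ) )
theta.k <- grid[ abs( rowSums(grid) - 1 ) < 1e-10, ]
nrow(theta.k)    # number of admissible membership vectors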
The Rasch grade of membership model (model="GOMRasch") poses constraints on the probabilities \lambda_{ik} and the membership functions g_{pk}. The membership function of person p is parameterized by a location parameter \theta_p and a variability parameter \xi_p. Each class k is represented by a location parameter \tilde{\theta}_k. The membership function is defined as
g_{pk} \propto \exp\left[ - \frac{ (\theta_p - \tilde{\theta}_k)^2 }{2 \xi_p^2 } \right]
The person parameter \theta_p indicates the usual 'ability', while \xi_p describes the individual tendency to change between the classes 1, \ldots, K and their corresponding locations \tilde{\theta}_1, \ldots, \tilde{\theta}_K.
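A minimal sketch of this membership function for one person (illustration only):
# sketch: normalized membership function of the Rasch GOM
theta.p <- 0.3 ; xi.p <- 1            # person location and variability
theta.tilde <- c(-2, 0, 2)            # class locations
g.p <- exp( - ( theta.p - theta.tilde )^2 / ( 2 * xi.p^2 ) )
g.p <- g.p / sum(g.p)                 # memberships sum to one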
The extremal class probabilities \lambda_{ik} follow the Rasch model
\lambda_{ik}=\mathrm{invlogit}( \tilde{\theta}_k - b_i )=\frac{ \exp( \tilde{\theta}_k - b_i ) }{ 1 + \exp( \tilde{\theta}_k - b_i ) }
Putting these assumptions together leads to the model equation
P(X_{pi}=1 | g_{p1}, \ldots, g_{pK} )=P(X_{pi}=1 | \theta_p, \xi_p )=\sum_k \frac{ \exp( \tilde{\theta}_k - b_i ) }{ 1 + \exp( \tilde{\theta}_k - b_i ) } \cdot \exp\left[ - \frac{ (\theta_p - \tilde{\theta}_k)^2 }{2 \xi_p^2 } \right]
In the extreme case of a very small \xi_p=\varepsilon > 0 and \theta_p=\theta_0, the Rasch model is obtained:
P(X_{pi}=1 | \theta_p, \xi_p )=P(X_{pi}=1 | \theta_0, \varepsilon )=\frac{ \exp( \theta_0 - b_i ) }{ 1 + \exp( \theta_0 - b_i ) }
See Erosheva et al. (2002), Erosheva (2005, 2006) or Galyardt (2015) for a comparison of grade of membership models with latent trait models and latent class models.
The grade of membership model has also been published under the name aspect Bernoulli model; see Bingham, Kaban and Fortelius (2009).
Value
A list with following entries:
deviance |
Deviance |
ic |
Information criteria |
item |
Data frame with item parameters |
person |
Data frame with person parameters |
EAP.rel |
EAP reliability (only applies for |
MAP |
Maximum aposteriori estimate of the membership function |
EAP |
EAP estimate for individual membership scores |
classdesc |
Descriptives for class membership |
lambda |
Estimated response probabilities |
se.lambda |
Standard error for estimated response probabilities
|
mu |
Mean of the distribution of |
Sigma |
Covariance matrix of |
b |
Estimated item difficulties (only applies for |
se.b |
Standard error of estimated difficulties
(only applies for |
f.yi.qk |
Individual likelihood |
f.qk.yi |
Individual posterior |
probs |
Array with response probabilities |
n.ik |
Expected counts |
iter |
Number of iterations |
I |
Number of items |
K |
Number of classes |
TP |
Number of discrete integration points for |
theta.k |
Used grid of membership functions |
... |
Further values |
References
Bingham, E., Kaban, A., & Fortelius, M. (2009). The aspect Bernoulli model: multiple causes of presences and absences. Pattern Analysis and Applications, 12(1), 55-78.
Erosheva, E. A. (2005). Comparing latent structures of the grade of membership, Rasch, and latent class models. Psychometrika, 70, 619-628.
Erosheva, E. A. (2006). Latent class representation of the grade of membership model. Seattle: University of Washington.
Erosheva, E. A., Fienberg, S. E., & Junker, B. W. (2002). Alternative statistical models and representations for large sparse multi-dimensional contingency tables. Annales-Faculte Des Sciences Toulouse Mathematiques, 11, 485-505.
Erosheva, E. A., Fienberg, S. E., & Joutard, C. (2007). Describing disability through individual-level mixture models for multivariate binary data. Annals of Applied Statistics, 1, 502-537.
Galyardt, A. (2015). Interpreting mixed membership models: Implications of Erosheva's representation theorem. In E. M. Airoldi, D. Blei, E. A. Erosheva, & S. E. Fienberg (Eds.). Handbook of Mixed Membership Models (pp. 39-65). Chapman & Hall.
See Also
For joint maximum likelihood estimation of the grade of membership model see gom.jml.
See also the mixedMem package for estimating mixed membership models by a variational EM algorithm.
The C code of Erosheva et al. (2007) can be downloaded from http://projecteuclid.org/euclid.aoas/1196438029#supplemental.
Code from Manrique-Vallier can be downloaded from http://pages.iu.edu/~dmanriqu/software.html.
See http://users.ics.aalto.fi/ella/publications/aspect_bernoulli.m for a Matlab implementation of the algorithm in Bingham, Kaban and Fortelius (2009).
Examples
#############################################################################
# EXAMPLE 1: PISA data mathematics
#############################################################################
data(data.pisaMath)
dat <- data.pisaMath$data
dat <- dat[, grep("M", colnames(dat)) ]
#***
# Model 1: Discrete GOM with 3 classes and 5 probability levels
problevels <- seq( 0, 1, len=5 )
mod1 <- sirt::gom.em( dat, K=3, problevels, model="GOM")
summary(mod1)
## Not run:
#-- some plots
#* multivariate scatterplot
car::scatterplotMatrix(mod1$EAP, regLine=FALSE, smooth=FALSE, pch=16, cex=.4)
#* ternary plot
vcd::ternaryplot(mod1$EAP, pch=16, col=1, cex=.3)
#***
# Model 1a: Multivariate normal distribution
problevels <- seq( 0, 1, len=5 )
mod1a <- sirt::gom.em( dat, K=3, theta0.k=seq(-15,15,len=21), model="GOMnormal" )
summary(mod1a)
#***
# Model 2: Discrete GOM with 4 classes and 5 probability levels
problevels <- seq( 0, 1, len=5 )
mod2 <- sirt::gom.em( dat, K=4, problevels, model="GOM" )
summary(mod2)
# model comparison
smod1 <- IRT.modelfit(mod1)
smod2 <- IRT.modelfit(mod2)
IRT.compareModels(smod1,smod2)
#***
# Model 2a: Estimate discrete GOM with 4 classes and restricted space of probability levels
# the 2nd, 4th and 6th class correspond to "intermediate stages"
problevels <- scan()
1 0 0 0
.5 .5 0 0
0 1 0 0
0 .5 .5 0
0 0 1 0
0 0 .5 .5
0 0 0 1
problevels <- matrix( problevels, ncol=4, byrow=TRUE)
mod2a <- sirt::gom.em( dat, K=4, problevels, model="GOM" )
# probability distribution for latent classes
cbind( mod2a$theta.k, mod2a$pi.k )
## [,1] [,2] [,3] [,4] [,5]
## [1,] 1.0 0.0 0.0 0.0 0.17214630
## [2,] 0.5 0.5 0.0 0.0 0.04965676
## [3,] 0.0 1.0 0.0 0.0 0.09336660
## [4,] 0.0 0.5 0.5 0.0 0.06555719
## [5,] 0.0 0.0 1.0 0.0 0.27523678
## [6,] 0.0 0.0 0.5 0.5 0.08458620
## [7,] 0.0 0.0 0.0 1.0 0.25945016
## End(Not run)
#***
# Model 3: Rasch GOM
mod3 <- sirt::gom.em( dat, model="GOMRasch", maxiter=20 )
summary(mod3)
#***
# Model 4: 'Ordinary' Rasch model
mod4 <- sirt::rasch.mml2( dat )
summary(mod4)
## Not run:
#############################################################################
# EXAMPLE 2: Grade of membership model with 2 classes
#############################################################################
#********* DATASET 1 *************
# define an ordinary 2 latent class model
set.seed(8765)
I <- 10
prob.class1 <- stats::runif( I, 0, .35 )
prob.class2 <- stats::runif( I, .70, .95 )
probs <- cbind( prob.class1, prob.class2 )
# define classes
N <- 1000
latent.class <- c( rep( 1, 1/4*N ), rep( 2,3/4*N ) )
# simulate item responses
dat <- matrix( NA, nrow=N, ncol=I )
for (ii in 1:I){
dat[,ii] <- probs[ ii, latent.class ]
dat[,ii] <- 1 * ( stats::runif(N) < dat[,ii] )
}
colnames(dat) <- paste0( "I", 1:I)
# Model 1: estimate latent class model
mod1 <- sirt::gom.em(dat, K=2, problevels=c(0,1), model="GOM" )
summary(mod1)
# Model 2: estimate GOM
mod2 <- sirt::gom.em(dat, K=2, problevels=seq(0,1,0.5), model="GOM" )
summary(mod2)
# estimated distribution
cbind( mod2$theta.k, mod2$pi.k )
## [,1] [,2] [,3]
## [1,] 1.0 0.0 0.243925644
## [2,] 0.5 0.5 0.006534278
## [3,] 0.0 1.0 0.749540078
#********* DATASET 2 *************
# define a 2-class model with graded membership
set.seed(8765)
I <- 10
prob.class1 <- stats::runif( I, 0, .35 )
prob.class2 <- stats::runif( I, .70, .95 )
prob.class3 <- .5*prob.class1+.5*prob.class2 # probabilities for 'fuzzy class'
probs <- cbind( prob.class1, prob.class2, prob.class3)
# define classes
N <- 1000
latent.class <- c( rep(1,round(1/3*N)),rep(2,round(1/2*N)),rep(3,round(1/6*N)))
# simulate item responses
dat <- matrix( NA, nrow=N, ncol=I )
for (ii in 1:I){
dat[,ii] <- probs[ ii, latent.class ]
dat[,ii] <- 1 * ( stats::runif(N) < dat[,ii] )
}
colnames(dat) <- paste0( "I", 1:I)
#** Model 1: estimate latent class model
mod1 <- sirt::gom.em(dat, K=2, problevels=c(0,1), model="GOM" )
summary(mod1)
#** Model 2: estimate GOM
mod2 <- sirt::gom.em(dat, K=2, problevels=seq(0,1,0.5), model="GOM" )
summary(mod2)
# inspect distribution
cbind( mod2$theta.k, mod2$pi.k )
## [,1] [,2] [,3]
## [1,] 1.0 0.0 0.3335666
## [2,] 0.5 0.5 0.1810114
## [3,] 0.0 1.0 0.4854220
#***
# Model2m: estimate discrete GOM in mirt
# define latent classes
Theta <- scan( nlines=1)
1 0 .5 .5 0 1
Theta <- matrix( Theta, nrow=3, ncol=2,byrow=TRUE)
# define mirt model
I <- ncol(dat)
#*** create customized item response function for mirt model
name <- 'gom'
par <- c("a1"=-1, "a2"=1 )
est <- c(TRUE, TRUE)
P.gom <- function(par,Theta,ncat){
# GOM for two extremal classes
pext1 <- stats::plogis(par[1])
pext2 <- stats::plogis(par[2])
P1 <- Theta[,1]*pext1 + Theta[,2]*pext2
cbind(1-P1, P1)
}
# create item response function
icc_gom <- mirt::createItem(name, par=par, est=est, P=P.gom)
#** define prior for latent class analysis
lca_prior <- function(Theta,Etable){
# number of latent Theta classes
TP <- nrow(Theta)
# prior in initial iteration
if ( is.null(Etable) ){ prior <- rep( 1/TP, TP ) }
# process Etable (this is correct for datasets without missing data)
if ( ! is.null(Etable) ){
# sum over correct and incorrect expected responses
prior <- ( rowSums(Etable[, seq(1,2*I,2)]) + rowSums(Etable[,seq(2,2*I,2)]) )/I
}
prior <- prior / sum(prior)
return(prior)
}
#*** estimate discrete GOM in mirt package
mod2m <- mirt::mirt(dat, 1, rep( "icc_gom",I), customItems=list("icc_gom"=icc_gom),
technical=list( customTheta=Theta, customPriorFun=lca_prior) )
# correct number of estimated parameters: two slope parameters per item
# plus TP-1 free class probabilities (mod.pars was undefined in the original example)
mod2m@nest <- as.integer( 2*I + nrow(Theta) - 1 )
# extract log-likelihood and compute AIC and BIC
mod2m@logLik
( AIC <- -2*mod2m@logLik+2*mod2m@nest )
( BIC <- -2*mod2m@logLik+log(mod2m@Data$N)*mod2m@nest )
# extract coefficients
( cmod2m <- sirt::mirt.wrapper.coef(mod2m) )
# compare estimated distributions
round( cbind( "sirt"=mod2$pi.k, "mirt"=mod2m@Prior[[1]] ), 5 )
## sirt mirt
## [1,] 0.33357 0.33627
## [2,] 0.18101 0.17789
## [3,] 0.48542 0.48584
# compare estimated item parameters
dfr <- data.frame( "sirt"=mod2$item[,4:5] )
dfr$mirt <- apply(cmod2m$coef[, c("a1", "a2") ], 2, stats::plogis )
round(dfr,4)
## sirt.lam.Cl1 sirt.lam.Cl2 mirt.a1 mirt.a2
## 1 0.1157 0.8935 0.1177 0.8934
## 2 0.0790 0.8360 0.0804 0.8360
## 3 0.0743 0.8165 0.0760 0.8164
## 4 0.0398 0.8093 0.0414 0.8094
## 5 0.1273 0.7244 0.1289 0.7243
## [...]
#############################################################################
# EXAMPLE 3: Lung cancer dataset; using sampling weights
#############################################################################
data(data.si08, package="sirt")
dat <- data.si08
#- Latent class model with 3 classes
problevels <- c(0,1)
mod1 <- sirt::gom.em( dat[,1:5], weights=dat$wgt, K=3, problevels=problevels )
summary(mod1)
#- Grade of membership model with discrete distribution
problevels <- seq(0,1,length=5)
mod2 <- sirt::gom.em( dat[,1:5], weights=dat$wgt, K=3, problevels=problevels )
summary(mod2)
#- Grade of membership model with multivariate normal distribution
mod3 <- sirt::gom.em( dat[,1:5], weights=dat$wgt, K=3, theta0.k=10*seq(-1,1,len=11),
model="GOMnormal", optimizer="nlminb" )
summary(mod3)
## End(Not run)
Grade of Membership Model (Joint Maximum Likelihood Estimation)
Description
This function estimates the grade of membership model employing a joint maximum likelihood estimation method (Erosheva, 2002; p. 23ff.).
Usage
gom.jml(dat, K=2, seed=NULL, globconv=0.001, maxdevchange=0.001,
maxiter=600, min.lambda=0.001, min.g=0.001)
Arguments
dat |
Data frame of dichotomous item responses |
K |
Number of classes |
seed |
Seed value of random number generator. Deterministic starting values
are used for the default value |
globconv |
Global parameter convergence criterion |
maxdevchange |
Maximum change in relative deviance |
maxiter |
Maximum number of iterations |
min.lambda |
Minimum \lambda_{ik} parameter to be estimated |
min.g |
Minimum g_{pk} membership parameter to be estimated |
Details
The item response model of the grade of membership model with K classes for dichotomous correct responses X_{pi} of person p on item i is
P(X_{pi}=1 | g_{p1}, \ldots, g_{pK} )=\sum_k \lambda_{ik} g_{pk} \quad, \quad \sum_k g_{pk}=1
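As a minimal numerical illustration of this response equation (assumed values, not output of gom.jml):
# membership-weighted mixture of class-specific item probabilities
lambda_i <- c(.2, .9)      # lambda_ik for one item and K=2 classes
g_p <- c(.3, .7)           # membership scores g_pk of one person, sum to 1
sum( lambda_i * g_p )      # P(X_pi=1)=.2*.3 + .9*.7=.69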
Value
A list with following entries:
lambda |
Data frame of item parameters |
g |
Data frame of individual membership scores |
g.mean |
Mean membership scores |
gcut |
Discretized membership scores |
gcut.distr |
Distribution of discretized membership scores |
K |
Number of classes |
deviance |
Deviance |
ic |
Information criteria |
N |
Number of students |
score |
Person score |
iter |
Number of iterations |
datproc |
List with processed data (recoded data, starting values, ...) |
... |
Further values |
References
Erosheva, E. A. (2002). Grade of membership and latent structure models with application to disability survey data. PhD thesis, Carnegie Mellon University, Department of Statistics.
See Also
S3 method summary.gom
Examples
#############################################################################
# EXAMPLE 1: TIMSS data
#############################################################################
data( data.timss)
dat <- data.timss$data[, grep("M", colnames(data.timss$data) ) ]
# 2 Classes (deterministic starting values)
m2 <- sirt::gom.jml(dat,K=2, maxiter=10 )
summary(m2)
## Not run:
# 3 Classes with fixed seed and maximum number of iterations
m3 <- sirt::gom.jml(dat,K=3, maxiter=50,seed=89)
summary(m3)
## End(Not run)
Reliability for Dichotomous Item Response Data Using the Method of Green and Yang (2009)
Description
This function estimates the model-based reliability of dichotomous data using the Green & Yang (2009) method. The underlying factor model is D-dimensional, where the dimension D is specified by the argument nfactors. The factor solution is subjected to a Schmid-Leiman transformation (see Reise, 2012; Reise, Bonifay, & Haviland, 2013; Reise, Moore, & Haviland, 2010).
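As a hedged sketch (assuming the data.read dataset from sirt), the Schmid-Leiman solution underlying this method can also be inspected directly with the psych package:
# Schmid-Leiman transformation of a tetrachoric correlation matrix
data(data.read, package="sirt")
R <- psych::tetrachoric(data.read)$rho          # tetrachoric correlations
sl <- psych::schmid(R, nfactors=4, fm="minres")
round(sl$sl, 3)                                 # general + group factor loadings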
Usage
greenyang.reliability(object.tetra, nfactors)
Arguments
object.tetra |
Object as the output of the function psych::tetrachoric, sirt::tetrachoric2 or psych::fa.parallel.poly (see Example 1) |
nfactors |
Number of factors (dimensions) |
Value
A data frame with columns:
coefficient |
Name of the reliability measure. |
dimensions |
Number of dimensions |
estimate |
Reliability estimate |
Note
This function needs the psych package.
References
Green, S. B., & Yang, Y. (2009). Reliability of summed item scores using structural equation modeling: An alternative to coefficient alpha. Psychometrika, 74, 155-167.
Reise, S. P. (2012). The rediscovery of bifactor measurement models. Multivariate Behavioral Research, 47, 667-696.
Reise, S. P., Bonifay, W. E., & Haviland, M. G. (2013). Scoring and modeling psychological measures in the presence of multidimensionality. Journal of Personality Assessment, 95, 129-140.
Reise, S. P., Moore, T. M., & Haviland, M. G. (2010). Bifactor models and rotations: Exploring the extent to which multidimensional data yield univocal scale scores, Journal of Personality Assessment, 92, 544-559.
See Also
See f1d.irt for estimating the functional unidimensional item response model. This function uses reliability.nonlinearSEM. See also the MBESS::ci.reliability function for estimating the reliability of polytomous item responses.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Reliability estimation of Reading dataset data.read
#############################################################################
miceadds::library_install("psych")
set.seed(789)
data( data.read )
dat <- data.read
# calculate matrix of tetrachoric correlations
dat.tetra <- psych::tetrachoric(dat) # using tetrachoric from psych package
dat.tetra2 <- sirt::tetrachoric2(dat) # using tetrachoric2 from sirt package
# perform parallel factor analysis
fap <- psych::fa.parallel.poly(dat, n.iter=1 )
## Parallel analysis suggests that the number of factors=3
## and the number of components=2
# parallel factor analysis based on tetrachoric correlation matrix
## (tetrachoric2)
fap2 <- psych::fa.parallel(dat.tetra2$rho, n.obs=nrow(dat), n.iter=1 )
## Parallel analysis suggests that the number of factors=6
## and the number of components=2
## Note that in this analysis, uncertainty with respect to thresholds is ignored.
# calculate reliability using a model with 4 factors
greenyang.reliability( object.tetra=dat.tetra, nfactors=4 )
## coefficient dimensions estimate
## Omega Total (1D) omega_1 1 0.771
## Omega Total (4D) omega_t 4 0.844
## Omega Hierarchical (4D) omega_h 4 0.360
## Omega Hierarchical Asymptotic (4D) omega_ha 4 0.427
## Explained Common Variance (4D) ECV 4 0.489
## Explained Variance (First Eigenvalue) ExplVar NA 35.145
## Eigenvalue Ratio (1st to 2nd Eigenvalue) EigenvalRatio NA 2.121
# calculation of Green-Yang-Reliability based on tetrachoric correlations
# obtained by tetrachoric2
greenyang.reliability( object.tetra=dat.tetra2, nfactors=4 )
# The same result will be obtained by using fap as the input
greenyang.reliability( object.tetra=fap, nfactors=4 )
## End(Not run)
Alignment Procedure for Linking under Approximate Invariance
Description
The function invariance.alignment performs alignment under approximate invariance for G groups and I items (Asparouhov & Muthen, 2014; Byrne & van de Vijver, 2017; DeMars, 2020; Finch, 2016; Fischer & Karl, 2019; Flake & McCoach, 2018; Kim et al., 2017; Marsh et al., 2018; Muthen & Asparouhov, 2014, 2018; Pokropek, Davidov & Schmidt, 2019). It is assumed that item loadings and intercepts have previously been estimated in a unidimensional factor model under the assumption of a factor with zero mean and a variance of one.
The function invariance_alignment_constraints postprocesses the output of the invariance.alignment function and estimates item parameters under equality constraints for prespecified absolute values of parameter tolerance.
The function invariance_alignment_simulate simulates a one-factor model for multiple groups for given matrices of \nu and \lambda parameters of item intercepts and item slopes (see Example 7).
The function invariance_alignment_cfa_config estimates one-factor models separately for each group as a preliminary step for invariance alignment (see Example 7). Sampling weights are accommodated by the argument weights. The variance matrix vcov computed by this function can be used to obtain standard errors in the invariance.alignment function if it is supplied as the argument vcov.
Usage
invariance.alignment(lambda, nu, wgt=NULL, align.scale=c(1, 1),
align.pow=c(.5, .5), eps=1e-3, psi0.init=NULL, alpha0.init=NULL, center=FALSE,
optimizer="optim", fixed=NULL, meth=1, vcov=NULL, eps_grid=seq(0,-10, by=-.5),
num_deriv=FALSE, ...)
## S3 method for class 'invariance.alignment'
summary(object, digits=3, file=NULL, ...)
invariance_alignment_constraints(model, lambda_parm_tol, nu_parm_tol )
## S3 method for class 'invariance_alignment_constraints'
summary(object, digits=3, file=NULL, ...)
invariance_alignment_simulate(nu, lambda, err_var, mu, sigma, N, output="data",
groupwise=FALSE, exact=FALSE)
invariance_alignment_cfa_config(dat, group, weights=NULL, model="2PM", verbose=FALSE, ...)
Arguments
lambda |
A G \times I matrix of item loadings \lambda_{ig0} |
nu |
A G \times I matrix of item intercepts \nu_{ig0} |
wgt |
A G \times I matrix of weights w_{ig} for groups and items |
align.scale |
A vector of length two containing the scale parameters a_\lambda and a_\nu |
align.pow |
A vector of length two containing the power parameters p_\lambda and p_\nu |
eps |
The parameter \varepsilon used to obtain a differentiable optimization function |
psi0.init |
An optional vector of initial \psi_0 parameters (group standard deviations) |
alpha0.init |
An optional vector of initial \alpha_0 parameters (group means) |
center |
Logical indicating whether estimated means and standard deviations should be centered. |
optimizer |
Name of the optimizer chosen for alignment. Options are
|
fixed |
Logical indicating whether SD of first group should
be fixed to one. If |
meth |
Type of method used for optimization function. |
vcov |
Variance matrix produced by invariance_alignment_cfa_config |
eps_grid |
Grid of logarithmized epsilon values in optimization |
num_deriv |
Logical indicating whether numerical derivatives should be used |
object |
Object of class invariance.alignment or invariance_alignment_constraints |
digits |
Number of digits used for rounding |
file |
Optional file name in which summary should be sunk |
... |
Further optional arguments to be passed |
model |
Model of class invariance.alignment |
lambda_parm_tol |
Parameter tolerance for \lambda parameters |
nu_parm_tol |
Parameter tolerance for \nu parameters |
err_var |
Error variance |
mu |
Vector of means |
sigma |
Vector of standard deviations |
N |
Vector of sample sizes per group |
output |
Specifies output type: |
groupwise |
Logical indicating whether group-wise output is requested |
exact |
Logical indicating whether distributions should be exactly preserved in simulated data |
dat |
Dataset with items or a list containing sufficient statistics |
group |
Vector containing group indicators |
weights |
Optional vector of sampling weights |
verbose |
Logical indicating whether progress should be printed |
Details
For G groups and I items, item loadings \lambda_{ig0} and intercepts \nu_{ig0} are available and have been estimated in a 1-dimensional factor analysis assuming a standardized factor.
The alignment procedure searches for means \alpha_{g0} and standard deviations \psi_{g0} using an alignment optimization function F. This function is defined as
F=\sum_i \sum_{g_1 < g_2} w_{i,g_1} w_{i,g_2} f_\lambda( \lambda_{i g_1,1} - \lambda_{i g_2,1} ) + \sum_i \sum_{g_1 < g_2} w_{i,g_1} w_{i,g_2} f_\nu( \nu_{i g_1,1} - \nu_{i g_2,1} )
where the aligned item parameters \lambda_{i g,1} and \nu_{i g,1} are defined such that
\lambda_{i g,1}=\lambda_{i g 0} / \psi_{g0} \qquad \mbox{and} \qquad \nu_{i g,1}=\nu_{i g 0} - \alpha_{g0} \lambda_{ig0} / \psi_{g0}
and the optimization functions are defined as
f_\lambda (x)=| x/ a_\lambda | ^{p_\lambda} \approx [ ( x/ a_\lambda )^2 + \varepsilon ]^{p_\lambda / 2} \qquad \mbox{and} \qquad f_\nu (x)=| x/ a_\nu | ^{p_\nu} \approx [ ( x/ a_\nu )^2 + \varepsilon ]^{p_\nu / 2}
using a small \varepsilon > 0 (e.g., .001) to obtain a differentiable optimization function. For p_\lambda=0 or p_\nu=0, the optimization function essentially counts the number of different parameters and mimics an L_0 penalty which is zero iff the argument is zero and one otherwise. It is approximated by f(x)=x^2 (x^2 + \varepsilon )^{-1} (O'Neill & Burke, 2023).
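A minimal sketch of this smoothed penalty (f_align is a hypothetical helper, not part of the sirt API):
# smoothed penalty f(x)=[ (x/a)^2 + eps ]^(p/2)
f_align <- function(x, a=1, p=.5, eps=1e-3){ ( (x/a)^2 + eps )^(p/2) }
f_align(0)                     # small but nonzero due to eps; the exact |x/a|^p is 0
c( f_align(.5), sqrt(.5) )     # p=.5 approximates the square root penalty
c( f_align(.5, p=2), .5^2 )    # p=2 approximates the quadratic penalty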
For identification reasons, the product \Pi_g \psi_{g0} of all group standard deviations (meth=0, 0.5) or \psi_1 (meth=1, 2) is set to one. The mean \alpha_{g0} of the first group is set to zero (meth=0.5, 1, 2), or a penalty function is added to the linking function (meth=0).
Note that Asparouhov and Muthen (2014) use a_\lambda=a_\nu=1 (which can be modified in align.scale) and p_\lambda=p_\nu=0.5 (which can be modified in align.pow). In case of p_\lambda=2, the penalty is approximately f_\lambda(x)=x^2; in case of p_\lambda=0.5, it is approximately f_\lambda(x)=\sqrt{|x|}. Note that sirt used a different parametrization in versions up to 3.5. The p parameters have to be halved for consistency with previous versions (e.g., the Asparouhov & Muthen parametrization corresponds to p=.25; see also Fischer & Karl, 2019, for an application of the previous parametrization).
Effect sizes of approximate invariance based on R^2 have been proposed by Asparouhov and Muthen (2014). These are calculated separately for item loadings and intercepts, resulting in R^2_\lambda and R^2_\nu measures, which are included in the output es.invariance. In addition, the average correlation of aligned item parameters among groups (rbar) is reported.
Metric invariance means that all aligned item loadings \lambda_{ig,1} are equal across groups and therefore R^2_\lambda=1. Scalar invariance means that all aligned item loadings \lambda_{ig,1} and aligned item intercepts \nu_{ig,1} are equal across groups and therefore R^2_\lambda=1 and R^2_\nu=1 (see Vandenberg & Lance, 2000).
Value
A list with following entries
pars |
Aligned distribution parameters |
itempars.aligned |
Aligned item parameters for all groups |
es.invariance |
Effect sizes of approximate invariance |
lambda.aligned |
Aligned \lambda_{ig,1} parameters |
lambda.resid |
Residuals of \lambda parameters |
nu.aligned |
Aligned \nu_{ig,1} parameters |
nu.resid |
Residuals of \nu parameters |
Niter |
Number of iterations for |
fopt |
Minimum optimization value |
align.scale |
Used alignment scale parameters |
align.pow |
Used alignment power parameters |
vcov |
Estimated variance matrix of aligned means and standard deviations |
References
Asparouhov, T., & Muthen, B. (2014). Multiple-group factor analysis alignment. Structural Equation Modeling, 21(4), 1-14. doi:10.1080/10705511.2014.919210
Byrne, B. M., & van de Vijver, F. J. R. (2017). The maximum likelihood alignment approach to testing for approximate measurement invariance: A paradigmatic cross-cultural application. Psicothema, 29(4), 539-551. doi:10.7334/psicothema2017.178
DeMars, C. E. (2020). Alignment as an alternative to anchor purification in DIF analyses. Structural Equation Modeling, 27(1), 56-72. doi:10.1080/10705511.2019.1617151
Finch, W. H. (2016). Detection of differential item functioning for more than two groups: A Monte Carlo comparison of methods. Applied Measurement in Education, 29(1), 30-45. doi:10.1080/08957347.2015.1102916
Fischer, R., & Karl, J. A. (2019). A primer to (cross-cultural) multi-group invariance testing possibilities in R. Frontiers in Psychology | Cultural Psychology, 10:1507. doi:10.3389/fpsyg.2019.01507
Flake, J. K., & McCoach, D. B. (2018). An investigation of the alignment method with polytomous indicators under conditions of partial measurement invariance. Structural Equation Modeling, 25(1), 56-70. doi:10.1080/10705511.2017.1374187
Kim, E. S., Cao, C., Wang, Y., & Nguyen, D. T. (2017). Measurement invariance testing with many groups: A comparison of five approaches. Structural Equation Modeling, 24(4), 524-544. doi:10.1080/10705511.2017.1304822
Marsh, H. W., Guo, J., Parker, P. D., Nagengast, B., Asparouhov, T., Muthen, B., & Dicke, T. (2018). What to do when scalar invariance fails: The extended alignment method for multi-group factor analysis comparison of latent means across many groups. Psychological Methods, 23(3), 524-545. doi:10.1037/met0000113
Muthen, B., & Asparouhov, T. (2014). IRT studies of many groups: The alignment method. Frontiers in Psychology | Quantitative Psychology and Measurement, 5:978. doi:10.3389/fpsyg.2014.00978
Muthen, B., & Asparouhov, T. (2018). Recent methods for the study of measurement invariance with many groups: Alignment and random effects. Sociological Methods & Research, 47(4), 637-664. doi:10.1177/0049124117701488
O'Neill, M., & Burke, K. (2023). Variable selection using a smooth information criterion for distributional regression models. Statistics and Computing, 33(3), 71. doi:10.1007/s11222-023-10204-8
Pokropek, A., Davidov, E., & Schmidt, P. (2019). A Monte Carlo simulation study to assess the appropriateness of traditional and newer approaches to test for measurement invariance. Structural Equation Modeling, 26(5), 724-744. doi:10.1080/10705511.2018.1561293
Vandenberg, R. J., & Lance, C. E. (2000). A review and synthesis of the measurement invariance literature: Suggestions, practices, and recommendations for organizational research. Organizational Research Methods, 3, 4-70. doi:10.1177/109442810031002s
See Also
For IRT linking see also linking.haberman or TAM::tam.linking. For modeling random item effects for loadings and intercepts see mcmc.2pno.ml.
Examples
#############################################################################
# EXAMPLE 1: Item parameters cultural activities
#############################################################################
data(data.activity.itempars, package="sirt")
lambda <- data.activity.itempars$lambda
nu <- data.activity.itempars$nu
Ng <- data.activity.itempars$N
wgt <- matrix( sqrt(Ng), length(Ng), ncol(nu) )
#***
# Model 1: Alignment using a quadratic loss function
mod1 <- sirt::invariance.alignment( lambda, nu, wgt, align.pow=c(2,2) )
summary(mod1)
#****
# Model 2: Different powers for alignment
mod2 <- sirt::invariance.alignment( lambda, nu, wgt, align.pow=c(.5,1),
align.scale=c(.95,.95))
summary(mod2)
# compare means from Models 1 and 2
plot( mod1$pars$alpha0, mod2$pars$alpha0, pch=16,
xlab="M (Model 1)", ylab="M (Model 2)", xlim=c(-.3,.3), ylim=c(-.3,.3) )
lines( c(-1,1), c(-1,1), col="gray")
round( cbind( mod1$pars$alpha0, mod2$pars$alpha0 ), 3 )
round( mod1$nu.resid, 3)
round( mod2$nu.resid,3 )
# L0 penalty
mod2b <- sirt::invariance.alignment( lambda, nu, wgt, align.pow=c(0,0),
align.scale=c(.3,.3))
summary(mod2b)
#****
# Model 3: Low powers for alignment of scale and power
# Note that setting increment.factor larger than 1 seems necessary
mod3 <- sirt::invariance.alignment( lambda, nu, wgt, align.pow=c(.5,.75),
align.scale=c(.55,.55), psi0.init=mod1$psi0, alpha0.init=mod1$alpha0 )
summary(mod3)
# compare mean and SD estimates of Models 1 and 3
plot( mod1$pars$alpha0, mod3$pars$alpha0, pch=16)
plot( mod1$pars$psi0, mod3$pars$psi0, pch=16)
# compare residuals for Models 1 and 3
# plot lambda
plot( abs(as.vector(mod1$lambda.resid)), abs(as.vector(mod3$lambda.resid)),
pch=16, xlab="Residuals lambda (Model 1)",
ylab="Residuals lambda (Model 3)", xlim=c(0,.1), ylim=c(0,.1))
lines( c(-3,3),c(-3,3), col="gray")
# plot nu
plot( abs(as.vector(mod1$nu.resid)), abs(as.vector(mod3$nu.resid)),
pch=16, xlab="Residuals nu (Model 1)", ylab="Residuals nu (Model 3)",
xlim=c(0,.4),ylim=c(0,.4))
lines( c(-3,3),c(-3,3), col="gray")
## Not run:
#############################################################################
# EXAMPLE 2: Comparison 4 groups | data.inv4gr
#############################################################################
data(data.inv4gr)
dat <- data.inv4gr
miceadds::library_install("semTools")
model1 <- "
F=~ I01 + I02 + I03 + I04 + I05 + I06 + I07 + I08 + I09 + I10 + I11
F ~~ 1*F
"
res <- semTools::measurementInvariance(model1, std.lv=TRUE, data=dat, group="group")
## Measurement invariance tests:
##
## Model 1: configural invariance:
## chisq df pvalue cfi rmsea bic
## 162.084 176.000 0.766 1.000 0.000 95428.025
##
## Model 2: weak invariance (equal loadings):
## chisq df pvalue cfi rmsea bic
## 519.598 209.000 0.000 0.973 0.039 95511.835
##
## [Model 1 versus model 2]
## delta.chisq delta.df delta.p.value delta.cfi
## 357.514 33.000 0.000 0.027
##
## Model 3: strong invariance (equal loadings + intercepts):
## chisq df pvalue cfi rmsea bic
## 2197.260 239.000 0.000 0.828 0.091 96940.676
##
## [Model 1 versus model 3]
## delta.chisq delta.df delta.p.value delta.cfi
## 2035.176 63.000 0.000 0.172
##
## [Model 2 versus model 3]
## delta.chisq delta.df delta.p.value delta.cfi
## 1677.662 30.000 0.000 0.144
##
# extract item parameters from the separate (configural) group analyses
ipars <- lavaan::parameterEstimates(res$fit.configural)
# extract lambda's: groups are in rows, items in columns
lambda <- matrix( ipars[ ipars$op=="=~", "est"], nrow=4, byrow=TRUE)
colnames(lambda) <- colnames(dat)[-1]
# extract nu's
nu <- matrix( ipars[ ipars$op=="~1" & ipars$se !=0, "est" ], nrow=4, byrow=TRUE)
colnames(nu) <- colnames(dat)[-1]
# Model 1: least squares optimization
mod1 <- sirt::invariance.alignment( lambda=lambda, nu=nu )
summary(mod1)
## Effect Sizes of Approximate Invariance
## loadings intercepts
## R2 0.9826 0.9972
## sqrtU2 0.1319 0.0526
## rbar 0.6237 0.7821
## -----------------------------------------------------------------
## Group Means and Standard Deviations
## alpha0 psi0
## 1 0.000 0.965
## 2 -0.105 1.098
## 3 -0.081 1.011
## 4 0.171 0.935
# Model 2: sparse target function
mod2 <- sirt::invariance.alignment( lambda=lambda, nu=nu, align.pow=c(.5,.5) )
summary(mod2)
## Effect Sizes of Approximate Invariance
## loadings intercepts
## R2 0.9824 0.9972
## sqrtU2 0.1327 0.0529
## rbar 0.6237 0.7856
## -----------------------------------------------------------------
## Group Means and Standard Deviations
## alpha0 psi0
## 1 -0.002 0.965
## 2 -0.107 1.098
## 3 -0.083 1.011
## 4 0.170 0.935
#############################################################################
# EXAMPLE 3: European Social Survey data.ess2005
#############################################################################
data(data.ess2005)
lambda <- data.ess2005$lambda
nu <- data.ess2005$nu
# Model 1: least squares optimization
mod1 <- sirt::invariance.alignment( lambda=lambda, nu=nu, align.pow=c(2,2) )
summary(mod1)
# Model 2: sparse target function and definition of scales
mod2 <- sirt::invariance.alignment( lambda=lambda, nu=nu, control=list(trace=2) )
summary(mod2)
#############################################################################
# EXAMPLE 4: Linking with item parameters containing outliers
#############################################################################
# see Help file in linking.robust
# simulate some item difficulties in the Rasch model
I <- 38
set.seed(18785)
itempars <- data.frame("item"=paste0("I",1:I) )
itempars$study1 <- stats::rnorm( I, mean=.3, sd=1.4 )
# simulate DIF effects plus some outliers
bdif <- stats::rnorm(I, mean=.4, sd=.09) +
(stats::runif(I)>.9 )*rep( 1*c(-1,1)+.4, each=I/2 )
itempars$study2 <- itempars$study1 + bdif
# create input for function invariance.alignment
nu <- t( itempars[,2:3] )
colnames(nu) <- itempars$item
lambda <- 1+0*nu
# linking using least squares optimization
mod1 <- sirt::invariance.alignment( lambda=lambda, nu=nu )
summary(mod1)
## Group Means and Standard Deviations
## alpha0 psi0
## study1 -0.286 1
## study2 0.286 1
# linking using powers of .5
mod2 <- sirt::invariance.alignment( lambda=lambda, nu=nu, align.pow=c(1,1) )
summary(mod2)
## Group Means and Standard Deviations
## alpha0 psi0
## study1 -0.213 1
## study2 0.213 1
# linking using powers of .25
mod3 <- sirt::invariance.alignment( lambda=lambda, nu=nu, align.pow=c(.5,.5) )
summary(mod3)
## Group Means and Standard Deviations
## alpha0 psi0
## study1 -0.207 1
## study2 0.207 1
#############################################################################
# EXAMPLE 5: Linking gender groups with data.math
#############################################################################
data(data.math)
dat <- data.math$data
dat.male <- dat[ dat$female==0, substring( colnames(dat),1,1)=="M" ]
dat.female <- dat[ dat$female==1, substring( colnames(dat),1,1)=="M" ]
#*************************
# Model 1: Linking using the Rasch model
mod1m <- sirt::rasch.mml2( dat.male )
mod1f <- sirt::rasch.mml2( dat.female )
# create objects for invariance.alignment
nu <- rbind( mod1m$item$thresh, mod1f$item$thresh )
colnames(nu) <- mod1m$item$item
rownames(nu) <- c("male", "female")
lambda <- 1+0*nu
# mean of item difficulties
round( rowMeans(nu), 3 )
# Linking using least squares optimization
res1a <- sirt::invariance.alignment( lambda, nu, align.scale=c( .3, .5 ) )
summary(res1a)
# Linking using optimization with absolute value function (pow=.5)
res1b <- sirt::invariance.alignment( lambda, nu, align.scale=c( .3, .5 ),
align.pow=c(1,1) )
summary(res1b)
#-- compare results with Haberman linking
I <- ncol(dat.male)
itempartable <- data.frame( "study"=rep( c("male", "female"), each=I ) )
itempartable$item <- c( paste0(mod1m$item$item), paste0(mod1f$item$item) )
itempartable$a <- 1
itempartable$b <- c( mod1m$item$b, mod1f$item$b )
# estimate linking parameters
res1c <- sirt::linking.haberman( itempars=itempartable )
#-- results of sirt::equating.rasch
x <- itempartable[ 1:I, c("item", "b") ]
y <- itempartable[ I + 1:I, c("item", "b") ]
res1d <- sirt::equating.rasch( x, y )
round( res1d$B.est, 3 )
## Mean.Mean Haebara Stocking.Lord
## 1 0.032 0.032 0.029
#*************************
# Model 2: Linking using the 2PL model
I <- ncol(dat.male)
mod2m <- sirt::rasch.mml2( dat.male, est.a=1:I)
mod2f <- sirt::rasch.mml2( dat.female, est.a=1:I)
# create objects for invariance.alignment
nu <- rbind( mod2m$item$thresh, mod2f$item$thresh )
colnames(nu) <- mod2m$item$item
rownames(nu) <- c("male", "female")
lambda <- rbind( mod2m$item$a, mod2f$item$a )
colnames(lambda) <- mod2m$item$item
rownames(lambda) <- c("male", "female")
res2a <- sirt::invariance.alignment( lambda, nu, align.scale=c( .3, .5 ) )
summary(res2a)
res2b <- sirt::invariance.alignment( lambda, nu, align.scale=c( .3, .5 ),
align.pow=c(1,1) )
summary(res2b)
# compare results with Haberman linking
I <- ncol(dat.male)
itempartable <- data.frame( "study"=rep( c("male", "female"), each=I ) )
itempartable$item <- c( paste0(mod2m$item$item), paste0(mod2f$item$item ) )
itempartable$a <- c( mod2m$item$a, mod2f$item$a )
itempartable$b <- c( mod2m$item$b, mod2f$item$b )
# estimate linking parameters
res2c <- sirt::linking.haberman( itempars=itempartable )
#############################################################################
# EXAMPLE 6: Data from Asparouhov & Muthen (2014) simulation study
#############################################################################
G <- 3 # number of groups
I <- 5 # number of items
# define lambda and nu parameters
lambda <- matrix(1, nrow=G, ncol=I)
nu <- matrix(0, nrow=G, ncol=I)
# define size of noninvariance
dif <- 1
#- 1st group: N(0,1)
lambda[1,3] <- 1+dif*.4; nu[1,5] <- dif*.5
#- 2nd group: N(0.3,1.5)
gg <- 2 ; mu <- .3; sigma <- sqrt(1.5)
lambda[gg,5] <- 1-.5*dif; nu[gg,1] <- -.5*dif
nu[gg,] <- nu[gg,] + mu*lambda[gg,]
lambda[gg,] <- lambda[gg,] * sigma
#- 3rd group: N(.8,1.2)
gg <- 3 ; mu <- .8; sigma <- sqrt(1.2)
lambda[gg,4] <- 1-.7*dif; nu[gg,2] <- -.5*dif
nu[gg,] <- nu[gg,] + mu*lambda[gg,]
lambda[gg,] <- lambda[gg,] * sigma
# define alignment scale
align.scale <- c(.2,.4) # Asparouhov and Muthen use c(1,1)
# define alignment powers
align.pow <- c(.5,.5) # as in Asparouhov and Muthen
#*** estimate alignment parameters
mod1 <- sirt::invariance.alignment( lambda, nu, eps=.01, optimizer="optim",
align.scale=align.scale, align.pow=align.pow, center=FALSE )
summary(mod1)
#--- find parameter constraints for prespecified tolerance
cmod1 <- sirt::invariance_alignment_constraints(model=mod1, nu_parm_tol=.4,
lambda_parm_tol=.2 )
summary(cmod1)
#############################################################################
# EXAMPLE 7: Similar to Example 6, but with data simulation and CFA estimation
#############################################################################
#--- data simulation
set.seed(65)
G <- 3 # number of groups
I <- 5 # number of items
# define lambda and nu parameters
lambda <- matrix(1, nrow=G, ncol=I)
nu <- matrix(0, nrow=G, ncol=I)
err_var <- matrix(1, nrow=G, ncol=I)
# define size of noninvariance
dif <- 1
#- 1st group: N(0,1)
lambda[1,3] <- 1+dif*.4; nu[1,5] <- dif*.5
#- 2nd group: N(0.3,1.5)
gg <- 2 ;
lambda[gg,5] <- 1-.5*dif; nu[gg,1] <- -.5*dif
#- 3rd group: N(.8,1.2)
gg <- 3
lambda[gg,4] <- 1-.7*dif; nu[gg,2] <- -.5*dif
#- define distributions of groups
mu <- c(0,.3,.8)
sigma <- sqrt(c(1,1.5,1.2))
N <- rep(1000,3) # sample sizes per group
#* simulate data
dat <- sirt::invariance_alignment_simulate(nu, lambda, err_var, mu, sigma, N)
head(dat)
#--- estimate CFA models
pars <- sirt::invariance_alignment_cfa_config(dat[,-1], group=dat$group)
print(pars)
#--- invariance alignment
# define alignment scale
align.scale <- c(.2,.4)
# define alignment powers
align.pow <- c(.5,.5)
mod1 <- sirt::invariance.alignment( lambda=pars$lambda, nu=pars$nu, eps=.01,
optimizer="optim", align.scale=align.scale, align.pow=align.pow, center=FALSE)
#* find parameter constraints for prespecified tolerance
cmod1 <- sirt::invariance_alignment_constraints(model=mod1, nu_parm_tol=.4,
lambda_parm_tol=.2 )
summary(cmod1)
#--- estimate CFA models with sampling weights
#* simulate weights
weights <- stats::runif(sum(N), 0, 2)
#* estimate models
pars2 <- sirt::invariance_alignment_cfa_config(dat[,-1], group=dat$group, weights=weights)
print(pars2$nu)
print(pars$nu)
#--- estimate one-parameter model
pars <- sirt::invariance_alignment_cfa_config(dat[,-1], group=dat$group, model="1PM")
print(pars)
#############################################################################
# EXAMPLE 8: Computation of standard errors
#############################################################################
G <- 3 # number of groups
I <- 5 # number of items
# define lambda and nu parameters
lambda <- matrix(1, nrow=G, ncol=I)
nu <- matrix(0, nrow=G, ncol=I)
# define size of noninvariance
dif <- 1
mu1 <- c(0,.3,.8)
sigma1 <- c(1,1.25,1.1)
#- 1st group
lambda[1,3] <- 1+dif*.4; nu[1,5] <- dif*.5
#- 2nd group
gg <- 2
lambda[gg,5] <- 1-.5*dif; nu[gg,1] <- -.5*dif
#- 3rd group
gg <- 3
lambda[gg,4] <- 1-.7*dif; nu[gg,2] <- -.5*dif
dat <- sirt::invariance_alignment_simulate(nu=nu, lambda=lambda, err_var=1+0*lambda,
mu=mu1, sigma=sigma1, N=500, output="data", exact=TRUE)
#* estimate CFA
res <- sirt::invariance_alignment_cfa_config(dat=dat[,-1], group=dat$group )
#- perform invariance alignment
eps <- .001
align.pow <- 0.5*rep(1,2)
meth <- 1   # identification method (see Details); undefined in the original example
lambda <- res$lambda
nu <- res$nu
mod1 <- sirt::invariance.alignment( lambda=lambda, nu=nu, eps=eps, optimizer="optim",
align.pow=align.pow, meth=meth, vcov=res$vcov)
# variance matrix and standard errors
mod1$vcov
sqrt(diag(mod1$vcov))
## End(Not run)
Person Parameter Estimation
Description
Computes the maximum likelihood estimate (MLE), weighted likelihood estimate (WLE) and maximum a posteriori (MAP) estimate of ability in unidimensional item response models (Penfield & Bergeron, 2005; Warm, 1989). Item response functions can be defined by the user.
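As a minimal sketch of the underlying idea, the following Newton-Raphson iteration computes the MLE of theta for a single person under the Rasch model with assumed item parameters; IRT.mle generalizes this to user-defined response functions and adds the WLE correction and the MAP prior:
# Newton-Raphson MLE of theta for one person (Rasch model, assumed values)
b <- c(-1, 0, 1)                            # item difficulties
x <- c(1, 1, 0)                             # observed responses
theta <- 0                                  # starting value
for (iter in 1:20){
    p <- stats::plogis( theta - b )         # response probabilities
    incr <- sum( x - p ) / sum( p*(1-p) )   # score divided by information
    theta <- theta + incr
    if ( abs(incr) < 1e-4 ) break
}
theta                                       # MLE; SE is 1/sqrt(information)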
Usage
IRT.mle(data, irffct, arg.list, theta=rep(0,nrow(data)), type="MLE",
mu=0, sigma=1, maxiter=20, maxincr=3, h=0.001, convP=1e-04,
maxval=9, progress=TRUE)
Arguments
data |
Data frame with item responses |
irffct |
User defined item response function (see Examples). Arguments must be
specified in arg.list |
theta |
Initial ability estimate |
arg.list |
List of arguments for |
type |
Type of ability estimate. It can be "MLE", "WLE" or "MAP" |
mu |
Mean of the normal prior distribution (for type="MAP") |
sigma |
Standard deviation of the normal prior distribution (for type="MAP") |
maxiter |
Maximum number of iterations |
maxincr |
Maximum increment |
h |
Numerical differentiation parameter |
convP |
Convergence criterion |
maxval |
Maximum ability value to be estimated |
progress |
Logical indicating whether iteration progress should be displayed |
Value
Data frame with estimated abilities (est) and their standard errors (se).
References
Penfield, R. D., & Bergeron, J. M. (2005). Applying a weighted maximum likelihood latent trait estimator to the generalized partial credit model. Applied Psychological Measurement, 29, 218-233.
Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54, 427-450.
See Also
See also the PP package for further person parameter estimation methods.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Generalized partial credit model
#############################################################################
data(data.ratings1)
dat <- data.ratings1
# estimate model
mod1 <- sirt::rm.facets( dat[, paste0( "k",1:5) ], rater=dat$rater,
pid=dat$idstud, maxiter=15)
# extract dataset and item parameters
data <- mod1$procdata$dat2.NA
a <- mod1$ipars.dat2$a
b <- mod1$ipars.dat2$b
theta0 <- mod1$person$EAP
# define item response function for item ii
calc.pcm <- function( theta, a, b, ii ){
K <- ncol(b)
N <- length(theta)
matrK <- matrix( 0:K, nrow=N, ncol=K+1, byrow=TRUE)
eta <- a[ii] * theta * matrK - matrix( c(0,b[ii,]), nrow=N, ncol=K+1, byrow=TRUE)
eta <- exp(eta)
probs <- eta / rowSums(eta, na.rm=TRUE)
return(probs)
}
arg.list <- list("a"=a, "b"=b )
# MLE
abil1 <- sirt::IRT.mle( data, irffct=calc.pcm, theta=theta0, arg.list=arg.list )
str(abil1)
# WLE
abil2 <- sirt::IRT.mle( data, irffct=calc.pcm, theta=theta0, arg.list=arg.list, type="WLE")
str(abil2)
# MAP with prior distribution N(.2, 1.3)
abil3 <- sirt::IRT.mle( data, irffct=calc.pcm, theta=theta0, arg.list=arg.list,
type="MAP", mu=.2, sigma=1.3 )
str(abil3)
#############################################################################
# EXAMPLE 2: Rasch model
#############################################################################
data(data.read)
dat <- data.read
I <- ncol(dat)
# estimate Rasch model
mod1 <- sirt::rasch.mml2( dat )
summary(mod1)
# define item response function
irffct <- function( theta, b, ii){
eta <- exp( theta - b[ii] )
probs <- eta / ( 1 + eta )
probs <- cbind( 1 - probs, probs )
return(probs)
}
# initial person parameters and item parameters
theta0 <- mod1$person$EAP
arg.list <- list( "b"=mod1$item$b )
# estimate WLE
abil <- sirt::IRT.mle( data=dat, irffct=irffct, arg.list=arg.list,
theta=theta0, type="WLE")
# compare with wle.rasch function
theta <- sirt::wle.rasch( dat, b=mod1$item$b )
cbind( abil[,1], theta$theta, abil[,2], theta$se.theta )
#############################################################################
# EXAMPLE 3: Ramsay quotient model
#############################################################################
data(data.read)
dat <- data.read
I <- ncol(dat)
# estimate Ramsay model
mod1 <- sirt::rasch.mml2( dat, irtmodel="ramsay.qm" )
summary(mod1)
# define item response function
irffct <- function( theta, b, K, ii){
eta <- exp( theta / b[ii] )
probs <- eta / ( K[ii] + eta )
probs <- cbind( 1 - probs, probs )
return(probs)
}
# initial person parameters and item parameters
theta0 <- exp( mod1$person$EAP )
arg.list <- list( "b"=mod1$item2$b, "K"=mod1$item2$K )
# estimate MLE
res <- sirt::IRT.mle( data=dat, irffct=irffct, arg.list=arg.list, theta=theta0,
maxval=20, maxiter=50)
## End(Not run)
Fit Unidimensional ISOP and ADISOP Model to Dichotomous and Polytomous Item Responses
Description
Fit the unidimensional isotonic probabilistic model (ISOP; Scheiblechner, 1995, 2007) and the additive isotonic probabilistic model (ADISOP; Scheiblechner, 1999). The isop.dich function can be used for dichotomous data, while the isop.poly function can be applied to polytomous data. Note that applying the ISOP model to polytomous data requires that all items have the same number of categories.
Usage
isop.dich(dat, score.breaks=NULL, merge.extreme=TRUE,
conv=.0001, maxit=1000, epsilon=.025, progress=TRUE)
isop.poly( dat, score.breaks=seq(0,1,len=10 ),
conv=.0001, maxit=1000, epsilon=.025, progress=TRUE )
## S3 method for class 'isop'
summary(object,...)
## S3 method for class 'isop'
plot(x,ask=TRUE,...)
Arguments
dat |
Data frame with dichotomous or polytomous item responses |
score.breaks |
Vector with breaks to define score groups. For dichotomous data, the person score grouping is applied for the mean person score, for polytomous data it is applied to the modified percentile score. |
merge.extreme |
Merge extreme groups with zero and maximum score
with succeeding score categories? The default is TRUE |
conv |
Convergence criterion |
maxit |
Maximum number of iterations |
epsilon |
Additive constant to handle cell frequencies
of 0 or 1 in |
progress |
Display progress? |
object |
Object of class isop |
x |
Object of class isop |
ask |
Ask for a new plot? |
... |
Further arguments to be passed |
Details
The ISOP model for dichotomous data was first proposed by Irtel and Schmalhofer (1982). Consider person groups p (ordered from low to high scores) and items i (ordered from difficult to easy items). Here, F(p,i) denotes the proportion correct for item i in score group p, while n_{pi} denotes the number of persons in group p and on item i. The isotonic probabilistic model (Scheiblechner, 1995) monotonically smooths this distribution function F such that
P( X_{pi}=1 | p, i )=F^\ast( p, i )
where the two-dimensional distribution function F^\ast is isotonic in p and i. Model fit is assessed by the square root of weighted squares of deviations
Fit=\sqrt{ \frac{1}{I} \sum_{p,i} w_{pi} \left( F(p,i) - F^\ast(p,i) \right)^2 }
with frequency weights w_{pi} and \sum_p w_{pi}=1 for every item i.
The additive isotonic model (ADISOP; Scheiblechner, 1999) assumes the existence of person parameters \theta_p and item parameters \delta_i such that
P( X_{pi}=1 | p )=g( \theta_p + \delta_i )
where g is a nonparametrically estimated isotonic function. The functions isop.dich and isop.poly use F^\ast from the ISOP model and estimate person and item parameters of the ADISOP model. For comparison, isop.dich also fits a model with the logistic function g, which results in the Rasch model.
For polytomous data, the starting point is the empirical distribution function
P( X_i \le k | p )=F( k ; p, i )
which is increasing in the argument k (the item categories). The ISOP model is defined to be antitonic in p and i, while items are ordered with respect to item P-scores and persons are ordered according to modified percentile scores (Scheiblechner, 2007). The estimated ISOP model results in a distribution function F^\ast. Using this function, the additive isotonic probabilistic model (ADISOP) aims at estimating a distribution function
P( X_i \le k ; p )=F^{\ast \ast}( k ; p, i )=F^{\ast \ast}( k, \theta_p + \delta_i )
which is antitonic in k and in \theta_p + \delta_i. Due to this additive relation, the ADISOP scale values are claimed to be measured at interval scale level (Scheiblechner, 1999). The ADISOP model is compared to the graded response model, which is defined by the response equation
P( X_i \le k ; p )=g( \theta_p + \delta_i + \gamma_k )
where g denotes the logistic function. Estimated parameters are in the value fit.grm: person parameters \theta_p (person.sc), item parameters \delta_i (item.sc) and category parameters \gamma_k (cat.sc). The calculation of person and item scores is explained in isop.scoring.
For an application of the ISOP and ADISOP model see Scheiblechner and Lutz (2009).
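The monotone smoothing idea can be illustrated with one-dimensional isotonic regression; note that the actual ISOP fit smooths in persons and items simultaneously, so this sketch only shows the monotonicity constraint across score groups for a single item:
# isotonic smoothing of proportions correct across ordered score groups
p_correct <- c(.35, .30, .52, .60, .58, .75)     # assumed proportions for one item
F_ast <- stats::isoreg( seq_along(p_correct), p_correct )$yf
cbind( observed=p_correct, smoothed=F_ast )      # smoothed column is nondecreasing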
Value
A list with following entries:
freq.correct |
Used frequency table (distribution function) for dichotomous and polytomous data |
wgt |
Used weights (frequencies) |
prob.saturated |
Frequencies of the saturated model |
prob.isop |
Fitted frequencies of the ISOP model |
prob.adisop |
Fitted frequencies of the ADISOP model |
prob.logistic |
Fitted frequencies of the logistic model
(only for isop.dich) |
prob.grm |
Fitted frequencies of the graded response model
(only for isop.poly) |
ll |
List with log-likelihood values |
fit |
Vector of fit statistics |
person |
Data frame of person parameters |
item |
Data frame of item parameters |
p.itemcat |
Frequencies for every item category |
score.itemcat |
Scoring points for every item category |
fit.isop |
Values of fitting the ISOP model
(see fit.isop) |
fit.adisop |
Values of fitting the ADISOP model
(see fit.adisop) |
fit.logistic |
Values of fitting the logistic model
(only for isop.dich) |
fit.grm |
Values of fitting the graded response model
(only for isop.poly) |
... |
Further values |
References
Irtel, H., & Schmalhofer, F. (1982). Psychodiagnostik auf Ordinalskalenniveau: Messtheoretische Grundlagen, Modelltest und Parameterschaetzung. Archiv fuer Psychologie, 134, 197-218.
Scheiblechner, H. (1995). Isotonic ordinal probabilistic models (ISOP). Psychometrika, 60, 281-304.
Scheiblechner, H. (1999). Additive conjoint isotonic probabilistic models (ADISOP). Psychometrika, 64, 295-316.
Scheiblechner, H. (2007). A unified nonparametric IRT model for d-dimensional psychological test data (d-ISOP). Psychometrika, 72, 43-67.
Scheiblechner, H., & Lutz, R. (2009). Die Konstruktion eines optimalen eindimensionalen Tests mittels nichtparametrischer Testtheorie (NIRT) am Beispiel des MR SOC. Diagnostica, 55, 41-54.
See Also
This function uses isop.scoring, fit.isop and fit.adisop. Tests of the W1 axiom of the ISOP model (Scheiblechner, 1995) can be performed with isop.test.
See also the ISOP package at Rforge: http://www.rforge.net/ISOP/.
Install this package using
install.packages("ISOP",repos="http://www.rforge.net/")
Examples
#############################################################################
# EXAMPLE 1: Dataset Reading (dichotomous items)
#############################################################################
data(data.read)
dat <- as.matrix( data.read)
I <- ncol(dat)
# Model 1: ISOP Model (11 score groups)
mod1 <- sirt::isop.dich( dat )
summary(mod1)
plot(mod1)
## Not run:
# Model 2: ISOP Model (5 score groups)
score.breaks <- seq( -.005, 1.005, len=5+1 )
mod2 <- sirt::isop.dich( dat, score.breaks=score.breaks)
summary(mod2)
#############################################################################
# EXAMPLE 2: Dataset PISA mathematics (dichotomous items)
#############################################################################
data(data.pisaMath)
dat <- data.pisaMath$data
dat <- dat[, grep("M", colnames(dat) ) ]
# fit ISOP model
# Note that for this model many iterations are needed
# to reach convergence for ADISOP
mod1 <- sirt::isop.dich( dat, maxit=4000)
summary(mod1)
## End(Not run)
#############################################################################
# EXAMPLE 3: Dataset Students (polytomous items)
#############################################################################
# Dataset students: scale cultural activities
library(CDM)
data(data.Students, package="CDM")
dat <- stats::na.omit( data.Students[, paste0("act",1:4) ] )
# fit models
mod1 <- sirt::isop.poly( dat )
summary(mod1)
plot(mod1)
Scoring Persons and Items in the ISOP Model
Description
This function does the scoring in the isotonic probabilistic model (Scheiblechner, 1995, 2003, 2007). Person parameters are ordinally scaled but the ISOP model also allows specific objective (ordinal) comparisons for persons (Scheiblechner, 1995).
Usage
isop.scoring(dat, score.itemcat=NULL)
Arguments
dat |
Data frame with dichotomous or polytomous item responses |
score.itemcat |
Optional data frame with scoring points for every item and every category (see Example 2). |
Details
This function extracts the scoring rule of the ISOP model (if score.itemcat != NULL) and calculates the modified percentile score for every person. The score s_{ik} for item i and category k is calculated as
s_{ik}=\sum_{j=0}^{k-1} f_{ij} - \sum_{j=k+1}^K f_{ij}=P( X_i < k ) - P( X_i > k )
where f_{ik} is the relative frequency of item i in category k and K is the maximum category. The modified percentile score \rho_p for subject p (mpsc in person) is defined by
\rho_p=\frac{1}{I} \sum_{i=1}^I \sum_{k=0}^K s_{ik} \mathbf{1}( X_{pi}=k )
Note that for dichotomous items, the sum score is a sufficient statistic for \rho_p, but this is not the case for polytomous items. The modified percentile score \rho_p ranges between -1 and 1.
The modified item P-score \rho_i (Scheiblechner, 2007, p. 52) is defined by
\rho_i=\frac{1}{I-1} \cdot \sum_j \left[ P( X_j < X_i ) - P( X_j > X_i ) \right]
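A minimal sketch of this scoring rule for a single item with assumed category frequencies (not output of isop.scoring):
# scoring points s_ik for one item with categories 0,...,K
f_i <- c(.2, .5, .3)         # assumed relative frequencies f_ik
K <- length(f_i) - 1
s_i <- sapply( 0:K, function(k){
    P_below <- if (k > 0) sum( f_i[ seq_len(k) ] ) else 0     # P(X_i < k)
    P_above <- if (k < K) sum( f_i[ (k+2):(K+1) ] ) else 0    # P(X_i > k)
    P_below - P_above
} )
s_i                          # -0.8 -0.1 0.7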
Value
A list with following entries:
person |
A data frame with person parameters. The modified
percentile score is contained in the column mpsc |
item |
Item statistics and scoring parameters, including
the item P-scores \rho_i |
p.itemcat |
Frequencies for every item category |
score.itemcat |
Scoring points for every item category |
distr.fct |
Empirical distribution function |
References
Scheiblechner, H. (1995). Isotonic ordinal probabilistic models (ISOP). Psychometrika, 60, 281-304.
Scheiblechner, H. (2003). Nonparametric IRT: Scoring functions and ordinal parameter estimation of isotonic probabilistic models (ISOP). Technical Report, Philipps-Universitaet Marburg.
Scheiblechner, H. (2007). A unified nonparametric IRT model for d-dimensional psychological test data (d-ISOP). Psychometrika, 72, 43-67.
See Also
For fitting the ISOP and ADISOP models see isop.dich or fit.isop.
Examples
#############################################################################
# EXAMPLE 1: Dataset Reading
#############################################################################
data( data.read )
dat <- data.read
# Scoring according to the ISOP model
msc <- sirt::isop.scoring( dat )
# plot student scores
boxplot( msc$person$mpsc ~ msc$person$score )
#############################################################################
# EXAMPLE 2: Dataset students from CDM package | polytomous items
#############################################################################
library("CDM")
data( data.Students, package="CDM")
dat <- stats::na.omit(data.Students[, -c(1:2) ])
# Scoring according to the ISOP model
msc <- sirt::isop.scoring( dat )
# plot student scores
boxplot( msc$person$mpsc ~ msc$person$score )
# scoring with known scoring rule for activity items
items <- paste0( "act", 1:5 )
score.itemcat <- msc$score.itemcat
score.itemcat <- score.itemcat[ items, ]
msc2 <- sirt::isop.scoring( dat[,items], score.itemcat=score.itemcat )
Testing the ISOP Model
Description
This function performs tests of the W1 axiom of the ISOP model (Scheiblechner, 2003). Standard errors of the corresponding W1_i statistics are obtained by the jackknife.
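A minimal generic sketch of a delete-one-unit jackknife standard error over predefined units (an illustration of the principle, not sirt's internal implementation):
# jackknife standard error of a statistic over J units
jack_se <- function(stat_fun, data, units){
    u <- unique(units)
    J <- length(u)
    est_j <- sapply( u, function(g) stat_fun( data[ units !=g, , drop=FALSE ] ) )
    sqrt( (J-1)/J * sum( ( est_j - mean(est_j) )^2 ) )
}
# usage sketch: SE of the mean sum score with 20 jackknife units
# jack_se( function(d) mean(rowSums(d)), dat, units=rep(1:20, length.out=nrow(dat)) )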
Usage
isop.test(data, jackunits=20, weights=rep(1, nrow(data)))
## S3 method for class 'isop.test'
summary(object,...)
Arguments
data |
Data frame with item responses |
jackunits |
A number of Jackknife units (if an integer is provided as the argument value) or a vector in which the Jackknife units are already defined. |
weights |
Optional vector of sampling weights |
object |
Object of class isop.test |
... |
Further arguments to be passed |
Value
A list with following entries
itemstat |
Data frame with test and item statistics for the W1 axiom.
The |
Es |
Number of concordances per item |
Ed |
Number of disconcordances per item |
The W1_i
statistics are printed by the summary
method.
References
Scheiblechner, H. (2003). Nonparametric IRT: Testing the bi-isotonicity of isotonic probabilistic models (ISOP). Psychometrika, 68, 79-96.
See Also
Fit the ISOP model with isop.dich or isop.poly.
See also the ISOP package at Rforge: http://www.rforge.net/ISOP/.
Examples
#############################################################################
# EXAMPLE 1: ISOP model data.Students
#############################################################################
data(data.Students, package="CDM")
dat <- data.Students[, paste0("act",1:5) ]
dat <- dat[1:300, ] # select first 300 students
# perform the ISOP test
mod <- sirt::isop.test(dat)
summary(mod)
## -> W1i statistics
## parm N M est se t
## 1 test 300 NA 0.430 0.036 11.869
## 2 act1 278 0.601 0.451 0.048 9.384
## 3 act2 275 0.473 0.473 0.035 13.571
## 4 act3 274 0.277 0.352 0.098 3.596
## 5 act4 291 1.320 0.381 0.054 7.103
## 6 act5 276 0.460 0.475 0.042 11.184
Latent Regression Model for the Generalized Logistic Item Response Model and the Linear Model for Normal Responses
Description
This function estimates a unidimensional latent regression model if a likelihood is specified, if item parameters from the generalized logistic item response model (Stukel, 1988) are given, or if mean and standard error estimates for individual scores are provided as input. Item parameters are treated as fixed in the estimation.
Usage
latent.regression.em.raschtype(data=NULL, f.yi.qk=NULL, X,
weights=rep(1, nrow(X)), beta.init=rep(0,ncol(X)),
sigma.init=1, b=rep(0,ncol(X)), a=rep(1,length(b)),
c=rep(0, length(b)), d=rep(1, length(b)), alpha1=0, alpha2=0,
max.parchange=1e-04, theta.list=seq(-5, 5, len=20),
maxiter=300, progress=TRUE )
latent.regression.em.normal(y, X, sig.e, weights=rep(1, nrow(X)),
beta.init=rep(0, ncol(X)), sigma.init=1, max.parchange=1e-04,
maxiter=300, progress=TRUE)
## S3 method for class 'latent.regression'
summary(object,...)
Arguments
data |
An N \times I data frame of dichotomous item responses |
f.yi.qk |
An optional matrix which contains the individual likelihood.
This matrix is produced by, e.g., sirt::rasch.mml2 or sirt::rasch.copula2 (see Examples). |
X |
An N \times K matrix of K covariates (see Examples) |
weights |
Student weights (optional). |
beta.init |
Initial regression coefficients (optional). |
sigma.init |
Initial residual standard deviation (optional). |
b |
Item difficulties (optional). They must only be provided
if the likelihood f.yi.qk is not given as input |
a |
Item discriminations (optional). |
c |
Guessing parameters (lower asymptotes) (optional). |
d |
One minus slipping parameters (upper asymptotes) (optional). |
alpha1 |
Upper tail parameter \alpha_1 of the generalized logistic link function (Stukel, 1988) |
alpha2 |
Lower tail parameter \alpha_2 of the generalized logistic link function |
max.parchange |
Maximum change in regression parameters |
theta.list |
Grid of person ability values at which the likelihood is evaluated |
maxiter |
Maximum number of iterations |
progress |
An optional logical indicating whether computation progress should be displayed. |
y |
Individual scores |
sig.e |
Standard errors for individual scores |
object |
Object of class latent.regression |
... |
Further arguments to be passed |
Details
In the output Regression Parameters, the fraction of missing information (fmi) is reported, which is the increase of variance in regression parameter estimates that arises because ability is defined as a latent variable. The effective sample size pseudoN.latent corresponds to the sample size that would be obtained if ability were available with a reliability of one.
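A hedged numerical reading of these quantities, consistent up to rounding with the output shown in Example 2 below (this reconstruction is an assumption, not code from the package):
# relation between se.simple, se, fmi and pseudoN.latent (assumed relation)
se.simple <- 0.0208; se <- 0.0248; N <- 2000    # values from Example 2 output
fmi <- 1 - se.simple^2 / se^2                   # ~0.30, fraction of missing information
N * (1 - fmi)                                   # ~1407, close to reported pseudoN.latent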
Value
A list with following entries
iterations |
Number of iterations needed |
maxiter |
Maximal number of iterations |
max.parchange |
Maximum change in parameter estimates |
coef |
Coefficients |
summary.coef |
Summary of regression coefficients |
sigma |
Estimate of residual standard deviation |
vcov.simple |
Covariance parameters of estimated parameters (simplified version) |
vcov.latent |
Covariance parameters of estimated parameters which accounts for latent ability |
post |
Individual posterior distribution |
EAP |
Individual EAP estimates |
SE.EAP |
Standard error estimates of EAP |
explvar |
Explained variance in latent regression |
totalvar |
Total variance in latent regression |
rsquared |
Explained variance |
Note
Using the defaults in a
, c
, d
,
alpha1
and alpha2
corresponds to the
Rasch model.
References
Adams, R., & Wu. M. (2007). The mixed-coefficients multinomial logit model: A generalized form of the Rasch model. In M. von Davier & C. H. Carstensen (Eds.). Multivariate and mixture distribution Rasch models: Extensions and applications (pp. 57-76). New York: Springer. doi:10.1007/978-0-387-49839-3_4
Mislevy, R. J. (1991). Randomization-based inference about latent variables from complex samples. Psychometrika, 56(2), 177-196. doi:10.1007/BF02294457
Stukel, T. A. (1988). Generalized logistic models. Journal of the American Statistical Association, 83(402), 426-431. doi:10.1080/01621459.1988.10478613
See Also
See also plausible.value.imputation.raschtype
for plausible value imputation of generalized logistic
item type models.
Examples
#############################################################################
# EXAMPLE 1: PISA Reading | Rasch model for dichotomous data
#############################################################################
data(data.pisaRead, package="sirt")
dat <- data.pisaRead$data
items <- grep("R", colnames(dat))
# define matrix of covariates
X <- cbind( 1, dat[, c("female","hisei","migra" ) ] )
#***
# Model 1: Latent regression model in the Rasch model
# estimate Rasch model
mod1 <- sirt::rasch.mml2( dat[,items] )
# latent regression model
lm1 <- sirt::latent.regression.em.raschtype( data=dat[,items ], X=X, b=mod1$item$b )
## Not run:
#***
# Model 2: Latent regression with generalized link function
# estimate alpha parameters for link function
mod2 <- sirt::rasch.mml2( dat[,items], est.alpha=TRUE)
# use model estimated likelihood for latent regression model
lm2 <- sirt::latent.regression.em.raschtype( f.yi.qk=mod2$f.yi.qk,
X=X, theta.list=mod2$theta.k)
#***
# Model 3: Latent regression model based on Rasch copula model
testlets <- paste( data.pisaRead$item$testlet)
itemclusters <- match( testlets, unique(testlets) )
# estimate Rasch copula model
mod3 <- sirt::rasch.copula2( dat[,items], itemcluster=itemclusters )
# use model estimated likelihood for latent regression model
lm3 <- sirt::latent.regression.em.raschtype( f.yi.qk=mod3$f.yi.qk,
X=X, theta.list=mod3$theta.k)
#############################################################################
# EXAMPLE 2: Simulated data according to the Rasch model
#############################################################################
set.seed(899)
I <- 21 # number of items
b <- seq(-2,2, len=I) # item difficulties
n <- 2000 # number of students
# simulate theta and covariates
theta <- stats::rnorm( n )
x <- .7 * theta + stats::rnorm( n, .5 )
y <- .2*x + .3*theta + stats::rnorm( n, .4 )
dfr <- data.frame( theta, 1, x, y )
# simulate Rasch model
dat1 <- sirt::sim.raschtype( theta=theta, b=b )
# estimate latent regression
mod <- sirt::latent.regression.em.raschtype( data=dat1, X=dfr[,-1], b=b )
## Regression Parameters
##
## est se.simple se t p beta fmi N.simple pseudoN.latent
## X1 -0.2554 0.0208 0.0248 -10.2853 0 0.0000 0.2972 2000 1411.322
## x 0.4113 0.0161 0.0193 21.3037 0 0.4956 0.3052 2000 1411.322
## y 0.1715 0.0179 0.0213 8.0438 0 0.1860 0.2972 2000 1411.322
##
## Residual Variance=0.685
## Explained Variance=0.3639
## Total Variance=1.049
## R2=0.3469
# compare with linear model (based on true scores)
summary( stats::lm( theta ~ x + y, data=dfr ) )
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.27821 0.01984 -14.02 <2e-16 ***
## x 0.40747 0.01534 26.56 <2e-16 ***
## y 0.18189 0.01704 10.67 <2e-16 ***
## ---
##
## Residual standard error: 0.789 on 1997 degrees of freedom
## Multiple R-squared: 0.3713, Adjusted R-squared: 0.3707
#***********
# define guessing parameters (lower asymptotes) and
# upper asymptotes ( 1 minus slipping parameters)
cI <- rep(.2, I) # all items get a guessing parameter of .2
cI[ c(7,9) ] <- .25 # 7th and 9th get a guessing parameter of .25
dI <- rep( .95, I ) # upper asymptote of .95
dI[ c(7,11) ] <- 1      # 7th and 11th item have an upper asymptote of 1
# latent regression model
mod1 <- sirt::latent.regression.em.raschtype( data=dat1, X=dfr[,-1],
b=b, c=cI, d=dI )
## Regression Parameters
##
## est se.simple se t p beta fmi N.simple pseudoN.latent
## X1 -0.7929 0.0243 0.0315 -25.1818 0 0.0000 0.4044 2000 1247.306
## x 0.5025 0.0188 0.0241 20.8273 0 0.5093 0.3936 2000 1247.306
## y 0.2149 0.0209 0.0266 8.0850 0 0.1960 0.3831 2000 1247.306
##
## Residual Variance=0.9338
## Explained Variance=0.5487
## Total Variance=1.4825
## R2=0.3701
#############################################################################
# EXAMPLE 3: Measurement error in dependent variable
#############################################################################
set.seed(8766)
N <- 4000 # number of persons
X <- stats::rnorm(N) # independent variable
Z <- stats::rnorm(N) # independent variable
y <- .45 * X + .25 * Z + stats::rnorm(N) # dependent variable true score
sig.e <- stats::runif( N, .5, .6 ) # measurement error standard deviation
yast <- y + stats::rnorm( N, sd=sig.e ) # dependent variable measured with error
#****
# Model 1: Estimation with latent.regression.em.raschtype using
# individual likelihood
# define theta grid for evaluation of density
theta.list <- mean(yast) + stats::sd(yast) * seq( - 5, 5, length=21)
# compute individual likelihood
f.yi.qk <- stats::dnorm( outer( yast, theta.list, "-" ) / sig.e )
f.yi.qk <- f.yi.qk / rowSums(f.yi.qk)
# define predictor matrix
X1 <- as.matrix(data.frame( "intercept"=1, "X"=X, "Z"=Z ))
# latent regression model
res <- sirt::latent.regression.em.raschtype( f.yi.qk=f.yi.qk,
X=X1, theta.list=theta.list)
## Regression Parameters
##
## est se.simple se t p beta fmi N.simple pseudoN.latent
## intercept 0.0112 0.0157 0.0180 0.6225 0.5336 0.0000 0.2345 4000 3061.998
## X 0.4275 0.0157 0.0180 23.7926 0.0000 0.3868 0.2350 4000 3061.998
## Z 0.2314 0.0156 0.0178 12.9868 0.0000 0.2111 0.2349 4000 3061.998
##
## Residual Variance=0.9877
## Explained Variance=0.2343
## Total Variance=1.222
## R2=0.1917
#****
# Model 2: Estimation with latent.regression.em.normal
res2 <- sirt::latent.regression.em.normal( y=yast, sig.e=sig.e, X=X1)
## Regression Parameters
##
## est se.simple se t p beta fmi N.simple pseudoN.latent
## intercept 0.0112 0.0157 0.0180 0.6225 0.5336 0.0000 0.2345 4000 3062.041
## X 0.4275 0.0157 0.0180 23.7927 0.0000 0.3868 0.2350 4000 3062.041
## Z 0.2314 0.0156 0.0178 12.9870 0.0000 0.2111 0.2349 4000 3062.041
##
## Residual Variance=0.9877
## Explained Variance=0.2343
## Total Variance=1.222
## R2=0.1917
## -> Results between Model 1 and Model 2 are identical because they use
## the same input.
#***
# Model 3: Regression model based on true scores y
mod3 <- stats::lm( y ~ X + Z )
summary(mod3)
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.02364 0.01569 1.506 0.132
## X 0.42401 0.01570 27.016 <2e-16 ***
## Z 0.23804 0.01556 15.294 <2e-16 ***
## Residual standard error: 0.9925 on 3997 degrees of freedom
## Multiple R-squared: 0.1923, Adjusted R-squared: 0.1919
## F-statistic: 475.9 on 2 and 3997 DF, p-value: < 2.2e-16
#***
# Model 4: Regression model based on observed scores yast
mod4 <- stats::lm( yast ~ X + Z )
summary(mod4)
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.01101 0.01797 0.613 0.54
## X 0.42716 0.01797 23.764 <2e-16 ***
## Z 0.23174 0.01783 13.001 <2e-16 ***
## Residual standard error: 1.137 on 3997 degrees of freedom
## Multiple R-squared: 0.1535, Adjusted R-squared: 0.1531
## F-statistic: 362.4 on 2 and 3997 DF, p-value: < 2.2e-16
## End(Not run)
Converting a lavaan Model into a mirt Model
Description
Converts a lavaan model into a mirt model. Optionally, the model can be estimated with the mirt::mirt function (est.mirt=TRUE), or only mirt syntax is generated (est.mirt=FALSE). Extensions of the lavaan syntax include guessing and slipping parameters (operators ?=g1 and ?=s1) and a shorthand operator __ for item groups. See TAM::lavaanify.IRT for more details.
Usage
lavaan2mirt(dat, lavmodel, est.mirt=TRUE, poly.itemtype="gpcm", ...)
Arguments
dat |
Dataset with item responses |
lavmodel |
Model specified in |
est.mirt |
An optional logical indicating whether the model
should be estimated with |
poly.itemtype |
Item type for polytomous data. This can
be |
... |
Further arguments to be passed for estimation in
|
Details
This function uses the lavaan::lavaanify function from the lavaan package. Only single group models are supported (for now).
Value
A list with following entries
mirt |
Object generated by |
mirt.model |
Generated |
mirt.syntax |
Generated |
mirt.pars |
Generated parameter specifications
in |
lavaan.model |
Used |
dat |
Used dataset. If necessary, only items used in the model are included in the dataset. |
See Also
See https://lavaan.ugent.be/ for lavaan resources.
See https://groups.google.com/forum/#!forum/lavaan
for discussion about the lavaan package.
See mirt.wrapper
for convenience wrapper functions
for mirt::mirt
objects.
See TAM::lavaanify.IRT
for extensions of lavaanify
.
See tam2mirt
for converting fitted objects in the TAM
package into fitted mirt::mirt
objects.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Convert some lavaan syntax to mirt syntax for data.read
#############################################################################
library(mirt)
data(data.read)
dat <- data.read
#******************
#*** Model 1: Single factor model
lavmodel <- "
# omit item C3
F=~ A1+A2+A3+A4 + C1+C2+C4 + B1+B2+B3+B4
F ~~ 1*F
"
# convert syntax and estimate model
res <- sirt::lavaan2mirt( dat, lavmodel, verbose=TRUE, technical=list(NCYCLES=3) )
# inspect coefficients
coef(res$mirt)
mirt.wrapper.coef(res$mirt)
# converted mirt model and parameter table
cat(res$mirt.syntax)
res$mirt.pars
#******************
#*** Model 2: Rasch Model with first six items
lavmodel <- "
F=~ a*A1+a*A2+a*A3+a*A4+a*B1+a*B2
F ~~ 1*F
"
# convert syntax and estimate model
res <- sirt::lavaan2mirt( dat, lavmodel, est.mirt=FALSE)
# converted mirt model
cat(res$mirt.syntax)
# mirt parameter table
res$mirt.pars
# estimate model using generated objects
res2 <- mirt::mirt( res$dat, res$mirt.model, pars=res$mirt.pars )
mirt.wrapper.coef(res2) # parameter estimates
#******************
#*** Model 3: Bifactor model
lavmodel <- "
G=~ A1+A2+A3+A4 + B1+B2+B3+B4 + C1+C2+C3+C4
A=~ A1+A2+A3+A4
B=~ B1+B2+B3+B4
C=~ C1+C2+C3+C4
G ~~ 1*G
A ~~ 1*A
B ~~ 1*B
C ~~ 1*C
"
res <- sirt::lavaan2mirt( dat, lavmodel, est.mirt=FALSE )
# mirt syntax and mirt model
cat(res$mirt.syntax)
res$mirt.model
res$mirt.pars
#******************
#*** Model 4: 3-dimensional model with some parameter constraints
lavmodel <- "
# some equality constraints among loadings
A=~ a*A1+a*A2+a2*A3+a2*A4
B=~ B1+B2+b3*B3+B4
C=~ c*C1+c*C2+c*C3+c*C4
# some equality constraints among thresholds
A1 | da*t1
A3 | da*t1
B3 | da*t1
C3 | dg*t1
C4 | dg*t1
# standardized latent variables
A ~~ 1*A
B ~~ 1*B
C ~~ 1*C
# estimate Cov(A,B) and Cov(A,C)
A ~~ B
A ~~ C
# estimate mean of B
B ~ 1
"
res <- sirt::lavaan2mirt( dat, lavmodel, verbose=TRUE, technical=list(NCYCLES=3) )
# estimated parameters
mirt.wrapper.coef(res$mirt)
# generated mirt syntax
cat(res$mirt.syntax)
# mirt parameter table
mirt::mod2values(res$mirt)
#******************
#*** Model 5: 3-dimensional model with some parameter constraints and
# parameter fixings
lavmodel <- "
A=~ a*A1+a*A2+1.3*A3+A4 # set loading of A3 to 1.3
B=~ B1+1*B2+b3*B3+B4
C=~ c*C1+C2+c*C3+C4
A1 | da*t1
A3 | da*t1
C4 | dg*t1
B1 | 0*t1
B3 | -1.4*t1 # fix item threshold of B3 to -1.4
A ~~ 1*A
B ~~ B # estimate variance of B freely
C ~~ 1*C
A ~~ B # estimate covariance between A and B
A ~~ .6 * C # fix covariance to .6
A ~ .5*1 # set mean of A to .5
B ~ 1 # estimate mean of B
"
res <- sirt::lavaan2mirt( dat, lavmodel, verbose=TRUE, technical=list(NCYCLES=3) )
mirt.wrapper.coef(res$mirt)
#******************
#*** Model 6: 1-dimensional model with guessing and slipping parameters
#******************
lavmodel <- "
F=~ c*A1+c*A2+1*A3+1.3*A4 + C1__C4 + a*B1+b*B2+b*B3+B4
# guessing parameters
A1+A2 ?=guess1*g1
A3 ?=.25*g1
B1+C1 ?=g1
B2__B4 ?=0.10*g1
# slipping parameters
A1+A2+C3 ?=slip1*s1
A3 ?=.02*s1
# fix item intercepts
A1 | 0*t1
A2 | -.4*t1
F ~ 1 # estimate mean of F
F ~~ 1*F # fix variance of F
"
# convert syntax and estimate model
res <- sirt::lavaan2mirt( dat, lavmodel, verbose=TRUE, technical=list(NCYCLES=3) )
# coefficients
mirt.wrapper.coef(res$mirt)
# converted mirt model
cat(res$mirt.syntax)
#############################################################################
# EXAMPLE 2: Convert some lavaan syntax to mirt syntax for
# longitudinal data data.long
#############################################################################
data(data.long)
dat <- data.long[,-1]
#******************
#*** Model 1: Rasch model for T1
lavmodel <- "
F=~ 1*I1T1 +1*I2T1+1*I3T1+1*I4T1+1*I5T1+1*I6T1
F ~~ F
"
# convert syntax and estimate model
res <- sirt::lavaan2mirt( dat, lavmodel, verbose=TRUE, technical=list(NCYCLES=20) )
# inspect coefficients
mirt.wrapper.coef(res$mirt)
# converted mirt model
cat(res$mirt.syntax)
#******************
#*** Model 2: Rasch model for two time points
lavmodel <- "
F1=~ 1*I1T1 +1*I2T1+1*I3T1+1*I4T1+1*I5T1+1*I6T1
F2=~ 1*I3T2 +1*I4T2+1*I5T2+1*I6T2+1*I7T2+1*I8T2
F1 ~~ F1
F1 ~~ F2
F2 ~~ F2
# equal item difficulties of same items
I3T1 | i3*t1
I3T2 | i3*t1
I4T1 | i4*t1
I4T2 | i4*t1
I5T1 | i5*t1
I5T2 | i5*t1
I6T1 | i6*t1
I6T2 | i6*t1
# estimate mean of F1, but fix mean of F2
F1 ~ 1
F2 ~ 0*1
"
# convert syntax and estimate model
res <- sirt::lavaan2mirt( dat, lavmodel, verbose=TRUE, technical=list(NCYCLES=20) )
# inspect coefficients
mirt.wrapper.coef(res$mirt)
# converted mirt model
cat(res$mirt.syntax)
#-- compare estimation with smirt function
# define Q-matrix
I <- ncol(dat)
Q <- matrix(0,I,2)
Q[1:6,1] <- 1
Q[7:12,2] <- 1
rownames(Q) <- colnames(dat)
colnames(Q) <- c("T1","T2")
# vector with same items
itemnr <- as.numeric( substring( colnames(dat),2,2) )
# fix mean at T2 to zero
mu.fixed <- cbind( 2,0 )
# estimate model in smirt
mod1 <- sirt::smirt(dat, Qmatrix=Q, irtmodel="comp", est.b=itemnr, mu.fixed=mu.fixed )
summary(mod1)
#############################################################################
# EXAMPLE 3: Converting lavaan syntax for polytomous data
#############################################################################
data(data.big5)
# select some items
items <- c( grep( "O", colnames(data.big5), value=TRUE )[1:6],
grep( "N", colnames(data.big5), value=TRUE )[1:4] )
# O3 O8 O13 O18 O23 O28 N1 N6 N11 N16
dat <- data.big5[, items ]
library(psych)
psych::describe(dat)
#******************
#*** Model 1: Partial credit model
lavmodel <- "
O=~ 1*O3+1*O8+1*O13+1*O18+1*O23+1*O28
O ~~ O
"
# estimate model in mirt
res <- sirt::lavaan2mirt( dat, lavmodel, technical=list(NCYCLES=20), verbose=TRUE)
# estimated mirt model
mres <- res$mirt
# mirt syntax
cat(res$mirt.syntax)
## O=1,2,3,4,5,6
## COV=O*O
# estimated parameters
mirt.wrapper.coef(mres)
# some plots
mirt::itemplot( mres, 3 ) # third item
plot(mres) # item information
plot(mres,type="trace") # item category functions
# graded response model with equal slopes
res1 <- sirt::lavaan2mirt( dat, lavmodel, poly.itemtype="graded", technical=list(NCYCLES=20),
verbose=TRUE )
mirt.wrapper.coef(res1$mirt)
#******************
#*** Model 2: Generalized partial credit model with some constraints
lavmodel <- "
O=~ O3+O8+O13+a*O18+a*O23+1.2*O28
O ~ 1 # estimate mean
O ~~ O # estimate variance
# some constraints among thresholds
O3 | d1*t1
O13 | d1*t1
O3 | d2*t2
O8 | d3*t2
O28 | (-0.5)*t1
"
# estimate model in mirt
res <- sirt::lavaan2mirt( dat, lavmodel, technical=list(NCYCLES=5), verbose=TRUE)
# estimated mirt model
mres <- res$mirt
# estimated parameters
mirt.wrapper.coef(mres)
#*** generate syntax for mirt for this model and estimate it in mirt package
# Items: O3 O8 O13 O18 O23 O28
mirtmodel <- mirt::mirt.model( "
O=1-6
# a(O18)=a(O23), t1(O3)=t1(O18), t2(O3)=t2(O8)
CONSTRAIN=(4,5,a1), (1,3,d1), (1,2,d2)
MEAN=O
COV=O*O
")
# initial table of parameters in mirt
mirt.pars <- mirt::mirt( dat[,1:6], mirtmodel, itemtype="gpcm", pars="values")
# fix slope of item O28 to 1.2
ind <- which( ( mirt.pars$item=="O28" ) & ( mirt.pars$name=="a1") )
mirt.pars[ ind, "est"] <- FALSE
mirt.pars[ ind, "value"] <- 1.2
# fix d1 of item O28 to -0.5
ind <- which( ( mirt.pars$item=="O28" ) & ( mirt.pars$name=="d1") )
mirt.pars[ ind, "est"] <- FALSE
mirt.pars[ ind, "value"] <- -0.5
# estimate model
res2 <- mirt::mirt( dat[,1:6], mirtmodel, pars=mirt.pars,
verbose=TRUE, technical=list(NCYCLES=4) )
mirt.wrapper.coef(res2)
plot(res2, type="trace")
## End(Not run)
Latent Class Model for Two Exchangeable Raters and One Item
Description
This function computes a latent class model for ratings on an item based on exchangeable raters (Uebersax & Grove, 1990). Additionally, several measures of rater agreement are computed (see e.g. Gwet, 2010).
Usage
lc.2raters(data, conv=0.001, maxiter=1000, progress=TRUE)
## S3 method for class 'lc.2raters'
summary(object,...)
Arguments
data |
Data frame with item responses (must be ordered from 0 to |
conv |
Convergence criterion |
maxiter |
Maximum number of iterations |
progress |
An optional logical indicating whether iteration progress should be displayed. |
object |
Object of class |
... |
Further arguments to be passed |
Details
For two exchangeable raters who provide ratings on an item, a latent class model with K+1 classes is defined (if there are K+1 item categories 0,...,K). Let P(X=x, Y=y | c) denote the probability that the first rating is x and the second rating is y, given the true but unknown item category (class) c. Ratings are assumed to be locally independent, i.e.,
P(X=x, Y=y | c)=P(X=x | c) \cdot P(Y=y | c)=p_{x|c} \cdot p_{y|c}
Note that P(X=x | c)=P(Y=x | c)=p_{x|c} holds due to the exchangeability of raters. The latent class model estimates the true class proportions \pi_c and the conditional item probabilities p_{x|c}.
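As a minimal illustration of local independence (hypothetical values, not output of the package), the model-implied joint distribution of two exchangeable ratings can be computed as follows:
# minimal sketch: joint rating table P(X=x, Y=y)=sum_c pi_c * p_{x|c} * p_{y|c}
pi.c <- c(.5, .3, .2)                             # true class proportions
p.xc <- cbind( c(.80,.15,.05), c(.10,.75,.15), c(.05,.10,.85) )
                        # p_{x|c}: rows=categories 0..2, columns=classes
joint <- matrix( 0, nrow=3, ncol=3 )
for (cc in 1:3){
    joint <- joint + pi.c[cc] * outer( p.xc[,cc], p.xc[,cc] )
}
round( joint, 3 )       # symmetric table due to exchangeability of the raters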
Value
A list with following entries
classprob.1rater.like |
Classification probability |
classprob.1rater.post |
Classification probability |
classprob.2rater.like |
Classification probability |
classprob.2rater.post |
Classification probability |
f.yi.qk |
Likelihood of each pair of ratings |
f.qk.yi |
Posterior of each pair of ratings |
probs |
Item response probabilities |
pi.k |
Estimated class proportions |
pi.k.obs |
Observed manifest class proportions |
freq.long |
Frequency table of ratings in long format |
freq.table |
Symmetrized frequency table of ratings |
agree.stats |
Measures of rater agreement. These measures include
percentage agreement ( |
data |
Used dataset |
N.categ |
Number of categories |
References
Aickin, M. (1990). Maximum likelihood estimation of agreement in the constant predictive probability model, and its relation to Cohen's kappa. Biometrics, 46, 293-302.
Gwet, K. L. (2008). Computing inter-rater reliability and its variance in the presence of high agreement. British Journal of Mathematical and Statistical Psychology, 61, 29-48.
Gwet, K. L. (2010). Handbook of Inter-Rater Reliability. Advanced Analytics, Gaithersburg. http://www.agreestat.com/
Uebersax, J. S., & Grove, W. M. (1990). Latent class analysis of diagnostic agreement. Statistics in Medicine, 9, 559-572.
See Also
See also rm.facets
and rm.sdt
for
specifying rater models.
See also the irr package for measures of rater agreement.
Examples
#############################################################################
# EXAMPLE 1: Latent class models for rating datasets data.si05
#############################################################################
data(data.si05)
#*** Model 1: one item with two categories
mod1 <- sirt::lc.2raters( data.si05$Ex1)
summary(mod1)
#*** Model 2: one item with five categories
mod2 <- sirt::lc.2raters( data.si05$Ex2)
summary(mod2)
#*** Model 3: one item with eight categories
mod3 <- sirt::lc.2raters( data.si05$Ex3)
summary(mod3)
Adjustment and Approximation of Individual Likelihood Functions
Description
Approximates individual likelihood functions L(\bold{X}_p | \theta) by normal distributions (see Mislevy, 1990). Extreme response patterns are handled by adding pseudo-observations of items with extreme item difficulties (see argument extreme.item). The individual standard deviations of the likelihood, which are used in the normal approximation, can be modified by individual adjustment factors specified in adjfac. In addition, a target reliability of the adjusted likelihood can be specified in target.EAP.rel.
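A minimal sketch of the normal approximation (an illustration, not the package's internal implementation): the mean and standard deviation of the likelihood are computed on the theta grid, the standard deviation is rescaled by the adjustment factor, and a normal density is evaluated:
# minimal sketch, assuming a likelihood evaluated on a theta grid
normal_approx <- function(like, theta, adjfac=1){
    w <- like / sum(like)                   # normalize likelihood to weights
    m <- sum( w * theta )                   # likelihood mean
    s <- sqrt( sum( w * (theta - m)^2 ) )   # likelihood SD
    stats::dnorm( theta, mean=m, sd=adjfac*s )   # adjusted normal approximation
}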
Usage
likelihood.adjustment(likelihood, theta=NULL, prob.theta=NULL,
adjfac=rep(1, nrow(likelihood)), extreme.item=5, target.EAP.rel=NULL,
min_tuning=0.2, max_tuning=3, maxiter=100, conv=1e-04,
trait.normal=TRUE)
Arguments
likelihood |
A matrix containing the individual likelihood |
theta |
Optional vector of (unidimensional) |
prob.theta |
Optional vector of probabilities of |
adjfac |
Vector with individual adjustment factors of the standard deviations of the likelihood |
extreme.item |
Item difficulties of two extreme pseudo items which are added as additional
observed data to the likelihood. A large number (e.g. |
target.EAP.rel |
Target EAP reliability. An additional tuning parameter is estimated which adjusts the likelihood to obtain a pre-specified reliability. |
min_tuning |
Minimum value of tuning parameter (if |
max_tuning |
Maximum value of tuning parameter (if |
maxiter |
Maximum number of iterations (if |
conv |
Convergence criterion (if |
trait.normal |
Optional logical indicating whether the trait distribution should be
normally distributed (if |
Value
Object of class IRT.likelihood
.
References
Mislevy, R. (1990). Scaling procedures. In E. Johnson & R. Zwick (Eds.), Focusing the new design: The NAEP 1988 technical report (ETS RR 19-20). Princeton, NJ: Educational Testing Service.
See Also
CDM::IRT.likelihood
,
TAM::tam.latreg
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Adjustment of the likelihood | data.read
#############################################################################
library(CDM)
library(TAM)
data(data.read)
dat <- data.read
# define theta grid
theta.k <- seq(-6,6,len=41)
#*** Model 1: fit Rasch model in TAM
mod1 <- TAM::tam.mml( dat, control=list( nodes=theta.k) )
summary(mod1)
#*** Model 2: fit Rasch copula model
testlets <- substring( colnames(dat), 1, 1 )
mod2 <- sirt::rasch.copula2( dat, itemcluster=testlets, theta.k=theta.k)
summary(mod2)
# model comparison
IRT.compareModels( mod1, mod2 )
# extract EAP reliabilities
rel1 <- mod1$EAP.rel
rel2 <- mod2$EAP.Rel
# variance inflation factor
vif <- (1-rel2) / (1-rel1)
## > vif
## [1] 1.211644
# extract individual likelihood
like1 <- IRT.likelihood( mod1 )
# adjust likelihood from Model 1 to obtain a target EAP reliability of .599
like1b <- sirt::likelihood.adjustment( like1, target.EAP.rel=.599 )
# compare estimated latent regressions
lmod1a <- TAM::tam.latreg( like1, Y=NULL )
lmod1b <- TAM::tam.latreg( like1b, Y=NULL )
summary(lmod1a)
summary(lmod1b)
## End(Not run)
Linking in the 2PL/Generalized Partial Credit Model
Description
This function performs the linking of several studies that were calibrated using the 2PL or the generalized partial credit model, following Haberman (2009). This method generalizes log-mean-mean linking from two studies to several studies. With the default a_log=TRUE, item slopes are logarithmized for linking; otherwise, an additive regression model is assumed for the original item loadings (see Details; Battauz, 2017).
Usage
linking.haberman(itempars, personpars, estimation="OLS", a_trim=Inf, b_trim=Inf,
lts_prop=.5, a_log=TRUE, conv=1e-05, maxiter=1000, progress=TRUE,
adjust_main_effects=TRUE, vcov=TRUE)
## S3 method for class 'linking.haberman'
summary(object, digits=3, file=NULL, ...)
linking.haberman.lq(itempars, pow=2, eps=1e-3, a_log=TRUE, use_nu=FALSE,
est_pow=FALSE, lower_pow=.1, upper_pow=3)
## S3 method for class 'linking.haberman.lq'
summary(object, digits=3, file=NULL, ...)
## prepare 'itempars' argument for linking.haberman()
linking_haberman_itempars_prepare(b, a=NULL, wgt=NULL)
## conversion of different parameterizations of item parameters
linking_haberman_itempars_convert(itempars=NULL, lambda=NULL, nu=NULL, a=NULL, b=NULL)
## L0 polish procedure minimizing the number of interactions in a two-way table
L0_polish(x, tol, conv=0.01, maxiter=30, type=1, verbose=TRUE)
Arguments
itempars |
A data frame with four or five columns. The first four columns contain
in the order: study name, item name, |
personpars |
A list with vectors (e.g. EAPs or WLEs) or data frames
(e.g. plausible values) containing person parameters which
should be transformed.
If a data frame in each list entry has |
estimation |
Estimation method. Can be |
a_trim |
Trimming parameter for item slopes |
b_trim |
Trimming parameter for item intercepts |
lts_prop |
Proportion of retained observations in |
a_log |
Logical indicating whether item slopes should be logarithmized for linking. |
conv |
Convergence criterion. |
maxiter |
Maximum number of iterations. |
progress |
An optional logical indicating whether computational progress should be displayed. |
adjust_main_effects |
Logical indicating whether all elements in the vector of main effects should be simultaneously adjusted |
vcov |
Optional indicating whether covariance matrix for linking errors should be computed |
pow |
Power |
eps |
Epsilon value used in differentiable approximating function |
use_nu |
Logical indicating whether item intercepts instead of item difficulties are used in linking |
est_pow |
Logical indicating whether power values should be estimated |
lower_pow |
Lower bound for estimated power |
upper_pow |
Upper bound for estimated power |
lambda |
Matrix containing item loadings |
nu |
Matrix containing item intercepts |
object |
Object of class |
digits |
Number of digits after decimals for rounding in |
file |
Optional file name if |
... |
Further arguments to be passed |
b |
Matrix of item intercepts (items |
a |
Matrix of item slopes |
wgt |
Matrix of weights |
x |
Matrix |
tol |
Tolerance value |
type |
Can be |
verbose |
Logical indicating whether iteration progress should be displayed |
Details
For t=1,\ldots,T studies, item difficulties b_{it} and item slopes a_{it} are available. For dichotomous responses, these parameters are defined by the 2PL response equation
logit P(X_{pi}=1 | \theta_p )=a_i ( \theta_p - b_i )
while for polytomous responses the generalized partial credit model holds
\log \frac{P(X_{pi}=k | \theta_p )}{P(X_{pi}=k-1 | \theta_p )}=a_i ( \theta_p - b_i + d_{ik} )
The parameters \{ a_{it}, b_{it} \} of all items and studies are linearly transformed using the equations a_{it} \approx a_i / A_t (if a_log=TRUE) or a_{it} \approx a_i + A_t (if a_log=FALSE) and b_{it} \cdot A_t \approx B_t + b_i. For identification reasons, we define A_1=1 and B_1=0.
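Based on these equations, a minimal sketch (using the conventions above, not the conventions of the summary output) maps study-specific parameters onto the joint scale:
# minimal sketch: joint-scale parameters from study-t parameters
# (a_log=TRUE case for the slopes; A_t and B_t as defined above)
a_joint <- function(a_it, A_t){ a_it * A_t }              # from a_it ~ a_i / A_t
b_joint <- function(b_it, A_t, B_t){ A_t * b_it - B_t }   # from b_it*A_t ~ B_t + b_i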
The optimization function (a least squares criterion; see Haberman, 2009) seeks the transformation parameters A_t and B_t with an alternating least squares method (estimation="OLS"). Note that every item i and every study t can be weighted (specified in the fifth column of itempars). Alternatively, a robust regression method based on bisquare weighting (Fox, 2015) can be employed for linking using the argument estimation="BSQ". For example, in the case of item loadings, bisquare weighting is applied to the residuals e_{it}=a_{it} - a_i - A_t (where logarithmized or non-logarithmized item loadings are employed), forming the weights
w_{it}=[ 1 - ( e_{it} / k )^2 ]^2 for | e_{it} | < k and 0 for | e_{it} | \ge k
where k is the trimming constant which can be estimated or fixed during estimation using the arguments a_trim or b_trim. Items in studies with large residuals (i.e., in the presence of differential item functioning) are effectively given zero weight in the linking procedure. Alternatively, Huber weights (estimation="HUB") downweight large residuals by applying w_{it}=k / | e_{it} | for residuals | e_{it} | > k. The method estimation="LTS" employs trimmed least squares where the proportion of retained observations is specified in lts_prop, with default set to .50. The two robust weighting schemes are sketched below.
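# minimal sketch (not the package's internal code): bisquare and Huber
# weights for a residual e and trimming constant k, as defined above
w_bisquare <- function(e, k){ ifelse( abs(e) < k, ( 1 - (e/k)^2 )^2, 0 ) }
w_huber    <- function(e, k){ ifelse( abs(e) > k, k / abs(e), 1 ) }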
The method estimation="MED" estimates item parameters and linking constants based on alternating median regression. A similar approach is the median polish procedure of Tukey (Tukey, 1977, p. 362ff.; Maronna, Martin & Yohai, 2006, p. 104; see also stats::medpolish), implemented in estimation="L1", which aims to minimize \sum_{i,t} | e_{it} |. For a pre-specified tolerance value t (in a_trim or b_trim), the approach estimation="L0" minimizes the number of interactions (i.e., DIF effects) among the e_{it} effects. In more detail, it minimizes \sum_{i,t} \# \{ | e_{it} | > t \}, which is computationally conducted by repeatedly applying the median polish procedure with one cell omitted (Davies, 2012; Terbeck & Davies, 1998). The correspondence with median polish is illustrated below.
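# minimal sketch with hypothetical item difficulties b_it (items x studies):
# the residuals of stats::medpolish() correspond to the DIF effects e_it
pars <- cbind( c(0.1, -0.3, 0.7), c(0.2, -0.2, 0.8), c(0.9, -0.1, 0.9) )
mp <- stats::medpolish( pars )
mp$residuals   # large entries flag item-by-study interactions (DIF effects)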
Effect sizes of invariance are calculated as R-squared measures of the variance in item slopes and intercepts that is explained after linking, relative to the unlinked item parameters across groups (Asparouhov & Muthen, 2014).
The function linking.haberman.lq
uses the loss function \rho(x)=|x|^q
.
The originally proposed Haberman linking can be obtained with pow=2
(q=2
).
The powers can also be estimated (argument est_pow=TRUE
).
Value
A list with following entries
transf.pars |
Data frame with transformation parameters
|
transf.personpars |
Data frame with linear transformation functions for person parameters |
joint.itempars |
Estimated joint item parameters |
a.trans |
Transformed |
b.trans |
Transformed |
a.orig |
Original |
b.orig |
Original |
a.resid |
Residual |
b.resid |
Residual |
personpars |
Transformed person parameters |
es.invariance |
Effect size measures of invariance,
separately for item slopes and intercepts.
In the rows, |
es.robust |
Effect size measures of invariance based on robust estimation (if used). |
selitems |
Indices of items which are present in more than one study. |
References
Asparouhov, T., & Muthen, B. (2014). Multiple-group factor analysis alignment. Structural Equation Modeling, 21(4), 1-14. doi:10.1080/10705511.2014.919210
Battauz, M. (2017). Multiple equating of separate IRT calibrations. Psychometrika, 82(3), 610-636. doi:10.1007/s11336-016-9517-x
Davies, P. L. (2012). Interactions in the analysis of variance. Journal of the American Statistical Association, 107(500), 1502-1509. doi:10.1080/01621459.2012.726895
Fox, J. (2015). Applied regression analysis and generalized linear models. Thousand Oaks: Sage.
Haberman, S. J. (2009). Linking parameter estimates derived from an item response model through separate calibrations. ETS Research Report ETS RR-09-40. Princeton, ETS. doi:10.1002/j.2333-8504.2009.tb02197.x
Kolen, M. J., & Brennan, R. L. (2014). Test equating, scaling, and linking: Methods and practices. New York: Springer. doi:10.1007/978-1-4939-0317-7
Magis, D., & De Boeck, P. (2012). A robust outlier approach to prevent type I error inflation in differential item functioning. Educational and Psychological Measurement, 72(2), 291-311. doi:10.1177/0013164411416975
Maronna, R. A., Martin, R. D., & Yohai, V. J. (2006). Robust statistics. West Sussex: Wiley. doi:10.1002/0470010940
Terbeck, W., & Davies, P. L. (1998). Interactions and outliers in the two-way analysis of variance. Annals of Statistics, 26(4), 1279-1305. doi:10.1214/aos/1024691243
Tukey, J. W. (1977). Exploratory data analysis. Addison-Wesley.
Weeks, J. P. (2010). plink: An R package for linking mixed-format tests using IRT-based methods. Journal of Statistical Software, 35(12), 1-33. doi:10.18637/jss.v035.i12
See Also
See the plink package (Weeks, 2010) for a diversity of linking methods.
Mean-mean linking, Stocking-Lord and Haebara linking (see Kolen & Brennan, 2014,
for an overview) in the generalized logistic item response model can be conducted with
equating.rasch
. See also TAM::tam.linking
in the TAM package. Haebara linking and a robustified version of it can be
found in linking.haebara
.
The invariance alignment method employs an optimization function based on
pairwise loss functions of item parameters (Asparouhov & Muthen, 2014),
see invariance.alignment
.
Examples
#############################################################################
# EXAMPLE 1: Item parameters data.pars1.rasch and data.pars1.2pl
#############################################################################
# Model 1: Linking three studies calibrated by the Rasch model
data(data.pars1.rasch)
mod1 <- sirt::linking.haberman( itempars=data.pars1.rasch )
summary(mod1)
# Model 1b: Linking the same studies, but weighting them by
# proportion weights 3 : 0.5 : 1 (see below).
# All weights are the same for each item but they could also
# be item specific.
itempars <- data.pars1.rasch
itempars$wgt <- 1
itempars[ itempars$study=="study1","wgt"] <- 3
itempars[ itempars$study=="study2","wgt"] <- .5
mod1b <- sirt::linking.haberman( itempars=itempars )
summary(mod1b)
# Model 2: Linking three studies calibrated by the 2PL model
data(data.pars1.2pl)
mod2 <- sirt::linking.haberman( itempars=data.pars1.2pl )
summary(mod2)
# additive model instead of logarithmic model for item slopes
mod2b <- sirt::linking.haberman( itempars=data.pars1.2pl, a_log=FALSE )
summary(mod2b)
## Not run:
#############################################################################
# EXAMPLE 2: Linking longitudinal data
#############################################################################
data(data.long)
#******
# Model 1: Scaling with the 1PL model
# scaling at T1
dat1 <- data.long[, grep("T1", colnames(data.long) ) ]
resT1 <- sirt::rasch.mml2( dat1 )
itempartable1 <- data.frame( "study"="T1", resT1$item[, c("item", "a", "b" ) ] )
# scaling at T2
dat2 <- data.long[, grep("T2", colnames(data.long) ) ]
resT2 <- sirt::rasch.mml2( dat2 )
summary(resT2)
itempartable2 <- data.frame( "study"="T2", resT2$item[, c("item", "a", "b" ) ] )
itempartable <- rbind( itempartable1, itempartable2 )
itempartable[,2] <- substring( itempartable[,2], 1, 2 )
# estimate linking parameters
mod1 <- sirt::linking.haberman( itempars=itempartable )
#******
# Model 2: Scaling with the 2PL model
# scaling at T1
dat1 <- data.long[, grep("T1", colnames(data.long) ) ]
resT1 <- sirt::rasch.mml2( dat1, est.a=1:6)
itempartable1 <- data.frame( "study"="T1", resT1$item[, c("item", "a", "b" ) ] )
# scaling at T2
dat2 <- data.long[, grep("T2", colnames(data.long) ) ]
resT2 <- sirt::rasch.mml2( dat2, est.a=1:6)
summary(resT2)
itempartable2 <- data.frame( "study"="T2", resT2$item[, c("item", "a", "b" ) ] )
itempartable <- rbind( itempartable1, itempartable2 )
itempartable[,2] <- substring( itempartable[,2], 1, 2 )
# estimate linking parameters
mod2 <- sirt::linking.haberman( itempars=itempartable )
#############################################################################
# EXAMPLE 3: 2 Studies - 1PL and 2PL linking
#############################################################################
set.seed(789)
I <- 20 # number of items
N <- 2000 # number of persons
# define item parameters
b <- seq( -1.5, 1.5, length=I )
# simulate data
dat1 <- sirt::sim.raschtype( stats::rnorm( N, mean=0,sd=1 ), b=b )
dat2 <- sirt::sim.raschtype( stats::rnorm( N, mean=0.5,sd=1.50 ), b=b )
#*** Model 1: 1PL
# 1PL Study 1
mod1 <- sirt::rasch.mml2( dat1, est.a=rep(1,I) )
summary(mod1)
# 1PL Study 2
mod2 <- sirt::rasch.mml2( dat2, est.a=rep(1,I) )
summary(mod2)
# collect item parameters
dfr1 <- data.frame( "study1", mod1$item$item, mod1$item$a, mod1$item$b )
dfr2 <- data.frame( "study2", mod2$item$item, mod2$item$a, mod2$item$b )
colnames(dfr2) <- colnames(dfr1) <- c("study", "item", "a", "b" )
itempars <- rbind( dfr1, dfr2 )
# Haberman linking
linkhab1 <- sirt::linking.haberman(itempars=itempars)
## Transformation parameters (Haberman linking)
## study At Bt
## 1 study1 1.000 0.000
## 2 study2 1.465 -0.512
##
## Linear transformation for item parameters a and b
## study A_a A_b B_b
## 1 study1 1.000 1.000 0.000
## 2 study2 0.682 1.465 -0.512
##
## Linear transformation for person parameters theta
## study A_theta B_theta
## 1 study1 1.000 0.000
## 2 study2 1.465 0.512
##
## R-Squared Measures of Invariance
## slopes intercepts
## R2 1 0.9979
## sqrtU2 0 0.0456
#*** Model 2: 2PL
# 2PL Study 1
mod1 <- sirt::rasch.mml2( dat1, est.a=1:I )
summary(mod1)
# 2PL Study 2
mod2 <- sirt::rasch.mml2( dat2, est.a=1:I )
summary(mod2)
# collect item parameters
dfr1 <- data.frame( "study1", mod1$item$item, mod1$item$a, mod1$item$b )
dfr2 <- data.frame( "study2", mod2$item$item, mod2$item$a, mod2$item$b )
colnames(dfr2) <- colnames(dfr1) <- c("study", "item", "a", "b" )
itempars <- rbind( dfr1, dfr2 )
# Haberman linking
linkhab2 <- sirt::linking.haberman(itempars=itempars)
## Transformation parameters (Haberman linking)
## study At Bt
## 1 study1 1.000 0.000
## 2 study2 1.468 -0.515
##
## Linear transformation for item parameters a and b
## study A_a A_b B_b
## 1 study1 1.000 1.000 0.000
## 2 study2 0.681 1.468 -0.515
##
## Linear transformation for person parameters theta
## study A_theta B_theta
## 1 study1 1.000 0.000
## 2 study2 1.468 0.515
##
## R-Squared Measures of Invariance
## slopes intercepts
## R2 0.9984 0.9980
## sqrtU2 0.0397 0.0443
#############################################################################
# EXAMPLE 4: 3 Studies - 1PL and 2PL linking
#############################################################################
set.seed(789)
I <- 20 # number of items
N <- 1500 # number of persons
# define item parameters
b <- seq( -1.5, 1.5, length=I )
# simulate data
dat1 <- sirt::sim.raschtype( stats::rnorm( N, mean=0, sd=1), b=b )
dat2 <- sirt::sim.raschtype( stats::rnorm( N, mean=0.5, sd=1.50), b=b )
dat3 <- sirt::sim.raschtype( stats::rnorm( N, mean=-0.2, sd=0.8), b=b )
# set some items to non-administered
dat3 <- dat3[, -c(1,4) ]
dat2 <- dat2[, -c(1,2,3) ]
#*** Model 1: 1PL in sirt
# 1PL Study 1
mod1 <- sirt::rasch.mml2( dat1, est.a=rep(1,ncol(dat1)) )
summary(mod1)
# 1PL Study 2
mod2 <- sirt::rasch.mml2( dat2, est.a=rep(1,ncol(dat2)) )
summary(mod2)
# 1PL Study 3
mod3 <- sirt::rasch.mml2( dat3, est.a=rep(1,ncol(dat3)) )
summary(mod3)
# collect item parameters
dfr1 <- data.frame( "study1", mod1$item$item, mod1$item$a, mod1$item$b )
dfr2 <- data.frame( "study2", mod2$item$item, mod2$item$a, mod2$item$b )
dfr3 <- data.frame( "study3", mod3$item$item, mod3$item$a, mod3$item$b )
colnames(dfr3) <- colnames(dfr2) <- colnames(dfr1) <- c("study", "item", "a", "b" )
itempars <- rbind( dfr1, dfr2, dfr3 )
# use person parameters
personpars <- list( mod1$person[, c("EAP","SE.EAP") ], mod2$person[, c("EAP","SE.EAP") ],
mod3$person[, c("EAP","SE.EAP") ] )
# Haberman linking
linkhab1 <- sirt::linking.haberman(itempars=itempars, personpars=personpars)
# compare item parameters
round( cbind( linkhab1$joint.itempars[,-1], linkhab1$b.trans )[1:5,], 3 )
## aj bj study1 study2 study3
## I0001 0.998 -1.427 -1.427 NA NA
## I0002 0.998 -1.290 -1.324 NA -1.256
## I0003 0.998 -1.140 -1.068 NA -1.212
## I0004 0.998 -0.986 -1.003 -0.969 NA
## I0005 0.998 -0.869 -0.809 -0.872 -0.926
# summary of person parameters of second study
round( psych::describe( linkhab1$personpars[[2]] ), 2 )
## var n mean sd median trimmed mad min max range skew kurtosis
## EAP 1 1500 0.45 1.36 0.41 0.47 1.52 -2.61 3.25 5.86 -0.08 -0.62
## SE.EAP 2 1500 0.57 0.09 0.53 0.56 0.04 0.49 0.84 0.35 1.47 1.56
## se
## EAP 0.04
## SE.EAP 0.00
#*** Model 2: 2PL in TAM
library(TAM)
# 2PL Study 1
mod1 <- TAM::tam.mml.2pl( resp=dat1, irtmodel="2PL" )
pvmod1 <- TAM::tam.pv(mod1, ntheta=300, normal.approx=TRUE) # draw plausible values
summary(mod1)
# 2PL Study 2
mod2 <- TAM::tam.mml.2pl( resp=dat2, irtmodel="2PL" )
pvmod2 <- TAM::tam.pv(mod2, ntheta=300, normal.approx=TRUE)
summary(mod2)
# 2PL Study 3
mod3 <- TAM::tam.mml.2pl( resp=dat3, irtmodel="2PL" )
pvmod3 <- TAM::tam.pv(mod3, ntheta=300, normal.approx=TRUE)
summary(mod3)
# collect item parameters
#!! Note that in TAM the parametrization is a*theta - b while linking.haberman
#!! needs the parametrization a*(theta-b)
dfr1 <- data.frame( "study1", mod1$item$item, mod1$B[,2,1], mod1$xsi$xsi / mod1$B[,2,1] )
dfr2 <- data.frame( "study2", mod2$item$item, mod2$B[,2,1], mod2$xsi$xsi / mod2$B[,2,1] )
dfr3 <- data.frame( "study3", mod3$item$item, mod3$B[,2,1], mod3$xsi$xsi / mod3$B[,2,1] )
colnames(dfr3) <- colnames(dfr2) <- colnames(dfr1) <- c("study", "item", "a", "b" )
itempars <- rbind( dfr1, dfr2, dfr3 )
# define list containing person parameters
personpars <- list( pvmod1$pv[,-1], pvmod2$pv[,-1], pvmod3$pv[,-1] )
# Haberman linking
linkhab2 <- sirt::linking.haberman(itempars=itempars,personpars=personpars)
## Linear transformation for person parameters theta
## study A_theta B_theta
## 1 study1 1.000 0.000
## 2 study2 1.485 0.465
## 3 study3 0.786 -0.192
# extract transformed person parameters
personpars.trans <- linkhab2$personpars
#############################################################################
# EXAMPLE 5: Linking with simulated item parameters containing outliers
#############################################################################
# simulate some parameters
I <- 38
set.seed(18785)
b <- stats::rnorm( I, mean=.3, sd=1.4 )
# simulate DIF effects plus some outliers
bdif <- stats::rnorm(I,mean=.4,sd=.09)+( stats::runif(I)>.9 )* rep( 1*c(-1,1)+.4, each=I/2 )
# create item parameter table
itempars <- data.frame( "study"=paste0("study",rep(1:2, each=I)),
"item"=paste0( "I", 100 + rep(1:I,2) ), "a"=1,
"b"=c( b, b + bdif ) )
#*** Model 1: Haberman linking with least squares regression
mod1 <- sirt::linking.haberman( itempars=itempars )
summary(mod1)
#*** Model 2: Haberman linking with robust bisquare regression with fixed trimming value
mod2 <- sirt::linking.haberman( itempars=itempars, estimation="BSQ", b_trim=.4)
summary(mod2)
#*** Model 3: Haberman linking with robust bisquare regression with estimated trimming value
mod3 <- sirt::linking.haberman( itempars=itempars, estimation="BSQ")
summary(mod3)
## see also Example 3 of ?sirt::robust.linking
#############################################################################
# EXAMPLE 6: Toy example of Magis and De Boeck (2012)
#############################################################################
# define item parameters from Magis & De Boeck (2012, p. 293)
b1 <- c(1,1,1,1)
b2 <- c(1,1,1,2)
itempars <- data.frame(study=rep(1:2, each=4), item=rep(1:4,2), a=1, b=c(b1,b2) )
#- Least squares regression
mod1 <- sirt::linking.haberman( itempars=itempars, estimation="OLS")
summary(mod1)
#- Bisquare regression with estimated and fixed trimming factors
mod2 <- sirt::linking.haberman( itempars=itempars, estimation="BSQ")
mod2a <- sirt::linking.haberman( itempars=itempars, estimation="BSQ", b_trim=.4)
mod2b <- sirt::linking.haberman( itempars=itempars, estimation="BSQ", b_trim=1.2)
summary(mod2)
summary(mod2a)
summary(mod2b)
#- Least squares trimmed regression
mod3 <- sirt::linking.haberman( itempars=itempars, estimation="LTS")
summary(mod3)
#- median regression
mod4 <- sirt::linking.haberman( itempars=itempars, estimation="MED")
summary(mod4)
#############################################################################
# EXAMPLE 7: Simulated example with directional DIF
#############################################################################
set.seed(98)
I <- 8
mu <- c(-.5, 0, .5)
b <- sample(seq(-1.5,1.5, len=I))
sd_dif <- 0.001
pars <- outer(b, mu, "+") + stats::rnorm(I*3, sd=sd_dif)
ind <- c(1,2); pars[ind,1] <- pars[ind,1] + c(.5,.5)
ind <- c(3,4); pars[ind,2] <- pars[ind,2] + (-1)*c(.6,.6)
ind <- c(5,6); pars[ind,3] <- pars[ind,3] + (-1)*c(1,1)
# median polish (=stats::medpolish())
tmod1 <- sirt:::L1_polish(x=pars)
# L0 polish with tolerance criterion of .3
tmod2 <- sirt::L0_polish(x=pars, tol=.3)
#- prepare itempars input
itempars <- sirt::linking_haberman_itempars_prepare(b=pars)
#- compare different estimation functions for Haberman linking
mod01 <- sirt::linking.haberman(itempars, estimation="L1")
mod02 <- sirt::linking.haberman(itempars, estimation="L0", b_trim=.3)
mod1 <- sirt::linking.haberman(itempars, estimation="OLS")
mod2 <- sirt::linking.haberman(itempars, estimation="BSQ")
mod2a <- sirt::linking.haberman(itempars, estimation="BSQ", b_trim=.4)
mod3 <- sirt::linking.haberman(itempars, estimation="MED")
mod4 <- sirt::linking.haberman(itempars, estimation="LTS")
mod5 <- sirt::linking.haberman(itempars, estimation="HUB")
mod01$transf.pars
mod02$transf.pars
mod1$transf.pars
mod2$transf.pars
mod2a$transf.pars
mod3$transf.pars
mod4$transf.pars
mod5$transf.pars
#############################################################################
# EXAMPLE 8: Many studies and directional DIF
#############################################################################
## dataset 2
set.seed(98)
I <- 10 # number of items
S <- 7 # number of studies
mu <- round( seq(0, 1, len=S))
b <- sample(seq(-1.5,1.5, len=I))
sd_dif <- 0.001
pars0 <- pars <- outer(b, mu, "+") + stats::rnorm(I*S, sd=sd_dif)
# select n_dif items at random per group and set it to dif or -dif
n_dif <- 2
dif <- .6
for (ss in 1:S){
ind <- sample( 1:I, n_dif )
pars[ind,ss] <- pars[ind,ss] + dif*sign( runif(1) - .5 )
}
# check DIF
pars - pars0
#* estimate models
itempars <- sirt::linking_haberman_itempars_prepare(b=pars)
mod0 <- sirt::linking.haberman(itempars, estimation="L0", b_trim=.2)
mod1 <- sirt::linking.haberman(itempars, estimation="OLS")
mod2 <- sirt::linking.haberman(itempars, estimation="BSQ")
mod2a <- sirt::linking.haberman(itempars, estimation="BSQ", b_trim=.4)
mod3 <- sirt::linking.haberman(itempars, estimation="MED")
mod3a <- sirt::linking.haberman(itempars, estimation="L1")
mod4 <- sirt::linking.haberman(itempars, estimation="LTS")
mod5 <- sirt::linking.haberman(itempars, estimation="HUB")
mod0$transf.pars
mod1$transf.pars
mod2$transf.pars
mod2a$transf.pars
mod3$transf.pars
mod3a$transf.pars
mod4$transf.pars
mod5$transf.pars
#* compare results with Haebara linking
mod11 <- sirt::linking.haebara(itempars, dist="L2")
mod12 <- sirt::linking.haebara(itempars, dist="L1")
summary(mod11)
summary(mod12)
## End(Not run)
Haebara Linking of the 2PL Model for Multiple Studies
Description
The function linking.haebara
is a generalization of Haebara linking
of the 2PL model to multiple groups (or multiple studies; see Battauz, 2017,
for a similar approach). The optimization estimates transformation parameters for
means and standard deviations of the groups and joint item parameters.
The function allows two different distance functions dist="L2"
and
dist="L1"
where the latter is a robustified version of
Haebara linking (see Details; He, Cui, & Osterlind, 2015; He & Cui, 2020;
Hu, Rogers, & Vukmirovic, 2008).
Usage
linking.haebara(itempars, dist="L2", theta=seq(-4,4, length=61),
optimizer="optim", center=FALSE, eps=1e-3, par_init=NULL, use_rcpp=TRUE,
pow=2, use_der=TRUE, ...)
## S3 method for class 'linking.haebara'
summary(object, digits=3, file=NULL, ...)
Arguments
itempars |
A data frame with four or five columns. The first four columns contain
in the order: study name, item name, |
dist |
Distance function. Options are |
theta |
Grid of theta points for 2PL item response functions |
optimizer |
Name of the optimizer used for the minimization. Options are
|
center |
Logical indicating whether means and standard deviations should be centered after estimation |
eps |
Small value for smooth approximation of the absolute value function |
par_init |
Optional vector of initial parameter estimates |
use_rcpp |
Logical indicating whether Rcpp is used for computation |
pow |
Power for method |
use_der |
Logical indicating whether analytical derivative should be used |
object |
Object of class |
digits |
Number of digits after decimals for rounding in |
file |
Optional file name if |
... |
Further arguments to be passed |
Details
For t=1,\ldots,T studies, item difficulties b_{it} and item slopes a_{it} are available. The 2PL item response functions are given by
logit P(X_{pi}=1 | \theta_p )=a_i ( \theta_p - b_i )
Haebara linking compares the observed item response functions P_{it}, based on the logits a_{it} ( \theta - b_{it} ), with the expected item response functions P_{it}^\ast, based on the logits a_i^\ast \sigma_t ( \theta - ( b_i - \mu_t ) / \sigma_t ), where the joint item parameters a_i and b_i, the means \mu_t, and the standard deviations \sigma_t are estimated.
Two loss functions are implemented. The quadratic loss of Haebara linking (dist="L2"), as originally proposed by Haebara, minimizes
f_{opt, L2}=\sum_t \sum_i \int ( P_{it} (\theta ) - P_{it}^\ast (\theta ) )^2 w(\theta)
A robustified version (dist="L1") uses the optimization function (He et al., 2015)
f_{opt, L1}=\sum_t \sum_i \int | P_{it} (\theta ) - P_{it}^\ast (\theta ) | w(\theta)
As a further generalization, the following distance function (dist="Lp") can be minimized:
f_{opt, Lp}=\sum_t \sum_i \int | P_{it} (\theta ) - P_{it}^\ast (\theta ) |^p w(\theta)
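A minimal sketch of the L2 discrepancy for a single item in one study (an illustration, not the package's internal code; the grid and weights are placeholders):
# minimal sketch: Haebara L2 discrepancy for one item in study t
haebara_L2 <- function(a_it, b_it, a_i, b_i, mu_t, sigma_t,
        theta=seq(-4, 4, length=61), w=rep(1, length(theta)) ){
    P_obs <- stats::plogis( a_it * ( theta - b_it ) )
    P_exp <- stats::plogis( a_i * sigma_t * ( theta - ( b_i - mu_t ) / sigma_t ) )
    sum( w * ( P_obs - P_exp )^2 )
}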
Value
A list with following entries
pars |
Estimated means and standard deviations (transformation parameters) |
item |
Estimated joint item parameters |
a.orig |
Original |
b.orig |
Original |
a.resid |
Residual |
b.resid |
Residual |
res_optim |
Value of optimization routine |
References
Battauz, M. (2017). Multiple equating of separate IRT calibrations. Psychometrika, 82, 610-636. doi:10.1007/s11336-016-9517-x
He, Y., Cui, Z., & Osterlind, S. J. (2015). New robust scale transformation methods in the presence of outlying common items. Applied Psychological Measurement, 39(8), 613-626. doi:10.1177/0146621615587003
He, Y., & Cui, Z. (2020). Evaluating robust scale transformation methods with multiple outlying common items under IRT true score equating. Applied Psychological Measurement, 44(4), 296-310. doi:10.1177/0146621619886050
Hu, H., Rogers, W. T., & Vukmirovic, Z. (2008). Investigation of IRT-based equating methods in the presence of outlier common items. Applied Psychological Measurement, 32(4), 311-333. doi:10.1177/0146621606292215
See Also
See invariance.alignment
and linking.haberman
for alternative linking methods in the sirt package. See also
TAM::tam.linking
in the TAM package for more linking functionality
and the R packages plink, equateIRT, equateMultiple and
SNSequate.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Robust linking methods in the presence of outliers
#############################################################################
#** simulate data
I <- 10
a <- seq(.9, 1.1, len=I)
b <- seq(-2, 2, len=I)
#- define item parameters
item_names <- paste0("I",100+1:I)
# th=SIG*TH+MU=> logit(p)=a*(SIG*TH+MU-b)=a*SIG*(TH-(-MU)/SIG-b/SIG)
d1 <- data.frame( study="S1", item=item_names, a=a, b=b )
mu <- .5; sigma <- 1.3
d2 <- data.frame( study="S2", item=item_names, a=a*sigma, b=(b-mu)/sigma )
mu <- -.3; sigma <- .7
d3 <- data.frame( study="S3", item=item_names, a=a*sigma, b=(b-mu)/sigma )
#- define DIF effect
# dif <- 0 # no DIF effects
dif <- 1
d2[4,"a"] <- d2[4,"a"] * (1-.8*dif)
d3[5,"b"] <- d3[5,"b"] - 2*dif
itempars <- rbind(d1, d2, d3)
#* Haebara linking non-robust
mod1 <- sirt::linking.haebara( itempars, dist="L2", control=list(trace=2) )
summary(mod1)
#* Haebara linking robust
mod2 <- sirt::linking.haebara( itempars, dist="L1", control=list(trace=2) )
summary(mod2)
#* using initial parameter estimates
par_init <- mod1$res_optim$par
mod2b <- sirt::linking.haebara( itempars, dist="L1", par_init=par_init)
summary(mod2b)
#* power p=.25
mod2c <- sirt::linking.haebara( itempars, dist="Lp", pow=.25, par_init=par_init)
summary(mod2c)
#* Haberman linking non-robust
mod3 <- sirt::linking.haberman(itempars)
summary(mod3)
#* Haberman linking robust
mod4 <- sirt::linking.haberman(itempars, estimation="BSQ", a_trim=.25, b_trim=.5)
summary(mod4)
#* compare transformation parameters (means and standard deviations)
mod1$pars
mod2$pars
mod3$transf.personpars
mod4$transf.personpars
## End(Not run)
Robust Linking of Item Intercepts
Description
This function implements a robust alternative to mean-mean linking which employs trimmed means instead of means. The linking constant is calculated for varying trimming parameters k. The treatment of differential item functioning as outliers and the application of robust statistics are discussed in Magis and De Boeck (2011, 2012).
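The core computation can be sketched as follows (a minimal illustration; note that linking.robust itself reports results for a whole sequence of trimming parameters):
# minimal sketch: linking constant as a trimmed mean of itemwise differences
# (trim is a fraction of observations removed from each tail)
link_constant <- function(pars1, pars2, trim=0){
    d <- pars2 - pars1       # itemwise parameter differences
    mean( d, trim=trim )     # trimming downweights outlying DIF items
}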
Usage
linking.robust(itempars)
## S3 method for class 'linking.robust'
summary(object,...)
## S3 method for class 'linking.robust'
plot(x, ...)
Arguments
itempars |
Data frame of item parameters (item intercepts). The first column contains the item label, the 2nd and 3rd columns item parameters of two studies. |
object |
Object of class |
x |
Object of class |
... |
Further arguments to be passed |
Value
A list with following entries
ind.kopt |
Index for optimal scale parameter |
kopt |
Optimal scale parameter |
meanpars.kopt |
Linking constant for optimal scale parameter |
se.kopt |
Standard error for linking constant obtained with optimal scale parameter |
meanpars |
Linking constant dependent on the scale parameter |
se |
Standard error of the linking constant dependent on the scale parameter |
sd |
DIF standard deviation (non-robust estimate) |
mad |
DIF standard deviation (robust estimate using the MAD measure) |
pars |
Original item parameters |
k.robust |
Used vector of scale parameters |
I |
Number of items |
itempars |
Used data frame of item parameters |
References
Magis, D., & De Boeck, P. (2011). Identification of differential item functioning in multiple-group settings: A multivariate outlier detection approach. Multivariate Behavioral Research, 46(5), 733-755. doi:10.1080/00273171.2011.606757
Magis, D., & De Boeck, P. (2012). A robust outlier approach to prevent type I error inflation in differential item functioning. Educational and Psychological Measurement, 72(2), 291-311. doi:10.1177/0013164411416975
See Also
Other functions for linking: linking.haberman
,
equating.rasch
See also the plink package.
Examples
#############################################################################
# EXAMPLE 1: Linking data.si03
#############################################################################
data(data.si03)
res1 <- sirt::linking.robust( itempars=data.si03 )
summary(res1)
## Number of items=27
## Optimal trimming parameter k=8 | non-robust parameter k=0
## Linking constant=-0.0345 | non-robust estimate=-0.056
## Standard error=0.0186 | non-robust estimate=0.027
## DIF SD: MAD=0.0771 (robust) | SD=0.1405 (non-robust)
plot(res1)
## Not run:
#############################################################################
# EXAMPLE 2: Linking PISA item parameters data.pisaPars
#############################################################################
data(data.pisaPars)
# Linking with items
res2 <- sirt::linking.robust( data.pisaPars[, c(1,3,4)] )
summary(res2)
## Optimal trimming parameter k=0 | non-robust parameter k=0
## Linking constant=-0.0883 | non-robust estimate=-0.0883
## Standard error=0.0297 | non-robust estimate=0.0297
## DIF SD: MAD=0.1824 (robust) | SD=0.1487 (non-robust)
## -> no trimming is necessary for reducing the standard error
plot(res2)
#############################################################################
# EXAMPLE 3: Linking with simulated item parameters containing outliers
#############################################################################
# simulate some parameters
I <- 38
set.seed(18785)
itempars <- data.frame("item"=paste0("I",1:I) )
itempars$study1 <- stats::rnorm( I, mean=.3, sd=1.4 )
# simulate DIF effects plus some outliers
bdif <- stats::rnorm(I,mean=.4,sd=.09)+( stats::runif(I)>.9 )* rep( 1*c(-1,1)+.4, each=I/2 )
itempars$study2 <- itempars$study1 + bdif
# robust linking
res <- sirt::linking.robust( itempars )
summary(res)
## Number of items=38
## Optimal trimming parameter k=12 | non-robust parameter k=0
## Linking constant=-0.4285 | non-robust estimate=-0.5727
## Standard error=0.0218 | non-robust estimate=0.0913
## DIF SD: MAD=0.1186 (robust) | SD=0.5628 (non-robust)
## -> substantial differences of estimated linking constants in this case of
## deviations from normality of item parameters
plot(res)
## End(Not run)
Fit of an L_q Regression Model
Description
Fits a regression model in the L_q norm (also labeled as the L_p norm). In more detail, the optimization function \sum_i | y_i - x_i \beta | ^p is minimized. The nondifferentiable absolute value function is replaced by a differentiable approximation, i.e., |x| \approx \sqrt{x^2 + \varepsilon }. The power p can also be estimated by using est_pow=TRUE; see Giacalone, Panarello and Mattera (2018). In this case, the algorithm iterates between the estimation of regression coefficients and the estimation of the power value. The estimation of the power based on a vector of residuals e can be conducted using the function lq_fit_estimate_power.
Using the L_q norm in the regression is equivalent to assuming an exponential power distribution for the residuals (Giacalone et al., 2018). The density function and a simulation function are provided by dexppow and rexppow, respectively. See also the normalp package.
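The smoothed criterion can be written down directly (a minimal sketch, not the package's internal code):
# minimal sketch: smoothed L_q criterion with |e| ~ sqrt(e^2 + eps)
lq_crit <- function(beta, y, X, pow=2, eps=1e-3){
    e <- y - X %*% beta
    sum( ( sqrt( e^2 + eps ) )^pow )
}
# e.g., stats::optim( par=rep(0, ncol(X)), fn=lq_crit, y=y, X=X, pow=1.5 )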
Usage
lq_fit(y, X, w=NULL, pow=2, eps=0.001, beta_init=NULL, est_pow=FALSE, optimizer="optim",
eps_vec=10^seq(0,-10, by=-.5), conv=1e-4, miter=20, lower_pow=.1, upper_pow=5)
lq_fit_estimate_power(e, pow_init=2, lower_pow=.1, upper_pow=10)
dexppow(x, mu=0, sigmap=1, pow=2, log=FALSE)
rexppow(n, mu=0, sigmap=1, pow=2, xbound=100, xdiff=.01)
Arguments
y |
Dependent variable |
X |
Design matrix |
w |
Optional vector of weights |
pow |
Power |
est_pow |
Logical indicating whether power should be estimated |
eps |
Parameter governing the differentiable approximation |
e |
Vector of residuals |
pow_init |
Initial value of power |
beta_init |
Initial vector |
optimizer |
Can be |
eps_vec |
Vector with decreasing |
conv |
Convergence criterion |
miter |
Maximum number of iterations |
lower_pow |
Lower bound for estimated power |
upper_pow |
Upper bound for estimated power |
x |
Vector |
mu |
Location parameter |
sigmap |
Scale parameter |
log |
Logical indicating whether the logarithm should be provided |
n |
Sample size |
xbound |
Lower and upper bound for density approximation |
xdiff |
Grid width for density approximation |
Value
A list with following entries
coefficients |
Vector of coefficients |
res_optim |
Results of optimization |
... |
More values |
References
Giacalone, M., Panarello, D., & Mattera, R. (2018). Multicollinearity in regression: an efficiency comparison between $L_p$-norm and least squares estimators. Quality & Quantity, 52(4), 1831-1859. doi:10.1007/s11135-017-0571-y
Examples
#############################################################################
# EXAMPLE 1: Small simulated example with fixed power
#############################################################################
set.seed(98)
N <- 300
x1 <- stats::rnorm(N)
x2 <- stats::rnorm(N)
par1 <- c(1,.5,-.7)
y <- par1[1]+par1[2]*x1+par1[3]*x2 + stats::rnorm(N)
X <- cbind(1,x1,x2)
#- lm function in stats
mod1 <- stats::lm.fit(y=y, x=X)
#- use lq_fit function
mod2 <- sirt::lq_fit( y=y, X=X, pow=2, eps=1e-4)
mod1$coefficients
mod2$coefficients
## Not run:
#############################################################################
# EXAMPLE 2: Example with estimated power values
#############################################################################
#*** simulate regression model with residuals from the exponential power distribution
#*** using a power of .30
set.seed(918)
N <- 2000
X <- cbind( 1, c(rep(1,N), rep(0,N)) )
e <- sirt::rexppow(n=2*N, pow=.3, xdiff=.01, xbound=200)
y <- X %*% c(1,.5) + e
#*** estimate model
mod <- sirt::lq_fit( y=y, X=X, est_pow=TRUE, lower_pow=.1)
mod1 <- stats::lm( y ~ 0 + X )
mod$coefficients
mod$pow
mod1$coefficients
## End(Not run)
Least Squares Distance Method of Cognitive Validation
Description
This function estimates the least squares distance method of cognitive validation (Dimitrov, 2007; Dimitrov & Atanasov, 2012), which assumes a multiplicative relationship of attribute response probabilities to explain item response probabilities. The argument distance allows the estimation of a squared loss function (distance="L2") and an absolute value loss function (distance="L1").
The function also estimates the classical linear logistic test model (LLTM; Fischer, 1973), which assumes a linear relationship for item difficulties in the Rasch model.
Usage
lsdm(data, Qmatrix, theta=seq(-3,3,by=.5), wgt_theta=rep(1, length(theta)), distance="L2",
quant.list=c(0.5,0.65,0.8), b=NULL, a=rep(1,nrow(Qmatrix)), c=rep(0,nrow(Qmatrix)) )
## S3 method for class 'lsdm'
summary(object, file=NULL, digits=3, ...)
## S3 method for class 'lsdm'
plot(x, ...)
Arguments
data |
An |
Qmatrix |
An |
theta |
The discrete grid points |
wgt_theta |
Optional vector for weights of discrete |
quant.list |
A vector of quantiles where attribute response functions are evaluated. |
distance |
Type of distance function for minimizing the discrepancy between
observed and expected item response functions. Options are |
b |
An optional vector of item difficulties. If it is specified,
then no |
a |
An optional vector of item discriminations. |
c |
An optional vector of guessing parameters. |
object |
Object of class |
file |
Optional file name for |
digits |
Number of digits after decimal in summary output |
... |
Further arguments to be passed |
x |
Object of class |
Details
The least squares distance method (LSDM; Dimitrov, 2007) is based on the assumption that estimated item response functions P(X_i=1 | \theta) can be decomposed in a multiplicative way (in the implemented conjunctive model):
P( X_i=1 | \theta ) \approx \prod_{k=1}^K [ P( A_k=1 | \theta ) ]^{q_{ik}}
where P( A_k=1 | \theta ) are attribute response functions and q_{ik} are entries of the Q-matrix. Note that the multiplicative form can be rewritten by taking the logarithm:
\log P( X_i=1 | \theta ) \approx \sum_{k=1}^K q_{ik} \log [ P( A_k=1 | \theta ) ]
The item and attribute response functions are evaluated on a grid of \theta values. Using the definitions of matrices \bold{L}=\{ \log P( X_i=1 | \theta ) \}, \bold{Q}=\{ q_{ik} \} and \bold{X}=\{ \log P( A_k=1 | \theta ) \}, the estimation problem can be formulated as \bold{L} \approx \bold{Q} \bold{X}. Two different loss functions for minimizing the discrepancy between \bold{L} and \bold{Q} \bold{X} are implemented. First, the squared loss function computes the weighted difference || \bold{L} - \bold{Q} \bold{X} ||_2=\sum_i ( l_i - \sum_t q_{it} x_{it} )^2 (distance="L2") and has been originally proposed by Dimitrov (2007). Second, the absolute value loss function || \bold{L} - \bold{Q} \bold{X} ||_1=\sum_i | l_i - \sum_t q_{it} x_{it} | (distance="L1") is more robust to outliers (i.e., items which show misfit to the assumed multiplicative LSDM formulation).
After fitting the attribute response functions, empirical item-attribute discriminations w_{ik} are calculated as the approximation of the following equation:
\log P( X_i=1 | \theta )=\sum_{k=1}^K w_{ik} q_{ik} \log [ P( A_k=1 | \theta ) ]
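The log-linear formulation above can be sketched in a few lines of base R (an illustrative toy computation with Rasch-type curves and made-up parameters, not the lsdm internals); the resulting curves may require clipping to the probability scale:
theta <- seq(-3, 3, by=.5)
b <- c(-1, 0, 1)                               # Rasch item difficulties
Q <- matrix( c(1,0, 1,1, 0,1), nrow=3, byrow=TRUE )
P <- sapply( theta, FUN=function(tt){ stats::plogis( tt - b ) } )   # I x T
L <- log(P)                                    # log item response functions
# least squares solution of L ~ Q X (X: log attribute response functions)
X <- solve( crossprod(Q), crossprod(Q, L) )
attr_curves <- pmin( exp(X), 1 )               # attribute response functions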
Value
A list with the following entries
mean.mad.lsdm0 |
Mean of |
mean.mad.lltm |
Mean of |
attr.curves |
Estimated attribute response curves evaluated at |
attr.pars |
Estimated attribute parameters for LSDM and LLTM |
data.fitted |
LSDM-fitted item response functions evaluated at |
theta |
Grid of ability distributions at which functions are evaluated |
item |
Item statistics (p value, |
data |
Estimated or fixed item response functions evaluated at |
Qmatrix |
Used Q-matrix |
lltm |
Model output of LLTM ( |
W |
Matrix with empirical item-attribute discriminations |
References
Al-Shamrani, A., & Dimitrov, D. M. (2016). Cognitive diagnostic analysis of reading comprehension items: The case of English proficiency assessment in Saudi Arabia. International Journal of School and Cognitive Psychology, 4(3). 1000196. http://dx.doi.org/10.4172/2469-9837.1000196
DiBello, L. V., Roussos, L. A., & Stout, W. F. (2007). Review of cognitively diagnostic assessment and a summary of psychometric models. In C. R. Rao and S. Sinharay (Eds.), Handbook of Statistics, Vol. 26 (pp. 979-1030). Amsterdam: Elsevier.
Dimitrov, D. M. (2007). Least squares distance method of cognitive validation and analysis for binary items using their item response theory parameters. Applied Psychological Measurement, 31, 367-387. http://dx.doi.org/10.1177/0146621606295199
Dimitrov, D. M., & Atanasov, D. V. (2012). Conjunctive and disjunctive extensions of the least squares distance model of cognitive diagnosis. Educational and Psychological Measurement, 72, 120-138. http://dx.doi.org/10.1177/0013164411402324
Dimitrov, D. M., Gerganov, E. N., Greenberg, M., & Atanasov, D. V. (2008). Analysis of cognitive attributes for mathematics items in the framework of Rasch measurement. AERA 2008, New York.
Fischer, G. H. (1973). The linear logistic test model as an instrument in educational research. Acta Psychologica, 37, 359-374. http://dx.doi.org/10.1016/0001-6918(73)90003-6
Sonnleitner, P. (2008). Using the LLTM to evaluate an item-generating system for reading comprehension. Psychology Science, 50, 345-362.
See Also
Get a summary of the LSDM analysis with summary.lsdm
.
See the CDM package for the estimation of related cognitive diagnostic models (DiBello, Roussos & Stout, 2007).
Examples
#############################################################################
# EXAMPLE 1: Dataset Fischer (see Dimitrov, 2007)
#############################################################################
# item difficulties
b <- c( 0.171,-1.626,-0.729,0.137,0.037,-0.787,-1.322,-0.216,1.802,
0.476,1.19,-0.768,0.275,-0.846,0.213,0.306,0.796,0.089,
0.398,-0.887,0.888,0.953,-1.496,0.905,-0.332,-0.435,0.346,
-0.182,0.906)
# read Q-matrix
Qmatrix <- c( 1,1,0,1,0,0,0,0,1,0,1,0,0,0,0,0,1,0,0,1,0,0,0,0,
1,0,1,1,0,0,0,0,1,0,0,1,0,0,0,0,0,1,0,0,1,1,0,0,1,0,1,0,1,0,0,0,
1,0,1,0,1,1,0,0,1,0,1,1,0,1,0,0,1,0,0,1,0,1,0,0,1,0,1,1,1,0,0,0,
1,0,0,1,0,0,1,0,1,0,0,1,0,0,1,0,1,0,1,0,0,0,1,0,1,1,0,1,0,1,1,0,
1,0,1,1,0,0,1,0,1,0,0,1,0,0,0,1,1,0,1,1,0,0,0,1,1,0,0,1,0,0,0,1,
0,1,0,0,0,1,0,1,1,1,0,1,0,1,0,1,1,0,0,1,0,1,0,0,1,1,0,0,1,0,0,0,
1,0,0,1,1,0,0,0,1,1,0,1,0,0,0,0,1,0,1,1,0,0,0,0,1,0,1,1,0,1,0,0,
1,1,0,1,0,0,0,0,1,0,1,1,1,1,0,0 )
Qmatrix <- matrix( Qmatrix, nrow=29, byrow=TRUE )
colnames(Qmatrix) <- paste("A",1:8,sep="")
rownames(Qmatrix) <- paste("Item",1:29,sep="")
#* Model 1: perform a LSDM analysis with defaults
mod1 <- sirt::lsdm( b=b, Qmatrix=Qmatrix )
summary(mod1)
plot(mod1)
#* Model 2: different theta values and weights
theta <- seq(-4,4,len=31)
wgt_theta <- stats::dnorm(theta)
mod2 <- sirt::lsdm( b=b, Qmatrix=Qmatrix, theta=theta, wgt_theta=wgt_theta )
summary(mod2)
#* Model 3: absolute value distance function
mod3 <- sirt::lsdm( b=b, Qmatrix=Qmatrix, distance="L1" )
summary(mod3)
#############################################################################
# EXAMPLE 2: Dataset Henning (see Dimitrov, 2007)
#############################################################################
# item difficulties
b <- c(-2.03,-1.29,-1.03,-1.58,0.59,-1.65,2.22,-1.46,2.58,-0.66)
# item slopes
a <- c(0.6,0.81,0.75,0.81,0.62,0.75,0.54,0.65,0.75,0.54)
# define Q-matrix
Qmatrix <- c(1,0,0,0,0,0,1,0,0,0,0,1,0,1,0,0,1,0,0,0,0,1,1,0,0,
0,0,0,1,0,0,1,0,0,1,0,0,0,1,0,0,0,0,1,1,1,0,1,0,0 )
Qmatrix <- matrix( Qmatrix, nrow=10, byrow=TRUE )
colnames(Qmatrix) <- paste("A",1:5,sep="")
rownames(Qmatrix) <- paste("Item",1:10,sep="")
# LSDM analysis
mod <- sirt::lsdm( b=b, a=a, Qmatrix=Qmatrix )
summary(mod)
## Not run:
#############################################################################
# EXAMPLE 3: PISA reading (data.pisaRead)
# using nonparametrically estimated item response functions
#############################################################################
data(data.pisaRead)
# response data
dat <- data.pisaRead$data
dat <- dat[, substring( colnames(dat),1,1)=="R" ]
# define Q-matrix
pars <- data.pisaRead$item
Qmatrix <- data.frame( "A0"=1*(pars$ItemFormat=="MC" ),
"A1"=1*(pars$ItemFormat=="CR" ) )
# start with estimating the 1PL in order to get person parameters
mod <- sirt::rasch.mml2( dat )
theta <- sirt::wle.rasch( dat=dat,b=mod$item$b )$theta
# Nonparametric estimation of item response functions
mod2 <- sirt::np.dich( dat=dat, theta=theta, thetagrid=seq(-3,3,len=100) )
# LSDM analysis
lmod <- sirt::lsdm( data=mod2$estimate, Qmatrix=Qmatrix, theta=mod2$thetagrid)
summary(lmod)
plot(lmod)
#############################################################################
# EXAMPLE 4: Fraction subtraction dataset
#############################################################################
data( data.fraction1, package="CDM")
data <- data.fraction1$data
q.matrix <- data.fraction1$q.matrix
#****
# Model 1: 2PL estimation
mod1 <- sirt::rasch.mml2( data, est.a=1:nrow(q.matrix) )
# LSDM analysis
lmod1 <- sirt::lsdm( b=mod1$item$b, a=mod1$item$a, Qmatrix=q.matrix )
summary(lmod1)
#****
# Model 2: 1PL estimation
mod2 <- sirt::rasch.mml2(data)
# LSDM analysis
lmod2 <- sirt::lsdm( b=mod2$item$b, Qmatrix=q.matrix )
summary(lmod2)
#############################################################################
# EXAMPLE 5: Dataset LLTM Sonnleitner Reading Comprehension (Sonnleitner, 2008)
#############################################################################
# item difficulties Table 7, p. 355 (Sonnleitner, 2008)
b <- c(-1.0189,1.6754,-1.0842,-.4457,-1.9419,-1.1513,2.0871,2.4874,-1.659,-1.197,-1.2437,
2.1537,.3301,-.5181,-1.3024,-.8248,-.0278,1.3279,2.1454,-1.55,1.4277,.3301)
b <- b[-21] # remove Item 21
# Q-matrix Table 9, p. 357 (Sonnleitner, 2008)
Qmatrix <- scan()
1 0 0 0 0 0 0 7 4 0 0 0 0 1 0 0 0 0 0 5 1 0 0 0 1 1 0 1 0 0 0 9 1 0 1 0
1 1 1 0 0 0 0 5 2 0 1 0 1 1 0 0 1 0 0 7 5 1 1 0 1 1 0 0 0 0 0 7 3 0 0 0
0 1 0 0 0 0 2 6 1 0 0 0 0 0 0 0 0 0 2 6 1 0 0 0 1 0 0 0 0 0 1 7 4 1 0 0
0 1 0 0 0 0 0 6 2 1 1 0 0 1 0 0 0 1 0 7 3 1 0 0 0 1 0 0 0 0 0 5 1 0 0 0
0 0 0 0 0 1 0 4 1 0 0 1 0 0 0 0 0 0 0 6 1 0 1 1 0 0 1 0 0 0 0 6 3 0 1 1
0 0 0 1 0 0 1 7 5 0 0 1 0 1 0 0 0 0 1 2 2 0 0 1 0 1 1 0 0 0 1 4 1 0 0 1
0 1 0 0 1 0 0 5 1 0 0 1 0 1 0 0 0 0 1 7 2 0 0 1 0 0 0 0 0 1 0 5 1 0 0 1
Qmatrix <- matrix( as.numeric(Qmatrix), nrow=21, ncol=12, byrow=TRUE )
colnames(Qmatrix) <- scan( what="character", nlines=1)
pc ic ier inc iui igc ch nro ncro td a t
# divide Q-matrix entries by maximum in each column
Qmatrix <- round(Qmatrix / matrix(apply(Qmatrix,2,max),21,12,byrow=TRUE),3)
# LSDM analysis
mod <- sirt::lsdm( b=b, Qmatrix=Qmatrix )
summary(mod)
#############################################################################
# EXAMPLE 6: Dataset Dimitrov et al. (2008)
#############################################################################
Qmatrix <- scan()
1 0 0 0 1 1 0 1 1 0 0 0 1 0 0 0 1 1 0 0 1 1 0 1 0 0 1 1 1 0 1 0 0 0 1 0
Qmatrix <- matrix(Qmatrix, ncol=4, byrow=TRUE)
colnames(Qmatrix) <- paste0("A",1:4)
rownames(Qmatrix) <- paste0("I",1:9)
b <- scan()
0.068 1.095 -0.641 -1.129 -0.061 1.218 1.244 -0.648 -1.146
# estimate model
mod <- sirt::lsdm( b=b, Qmatrix=Qmatrix )
summary(mod)
plot(mod)
#############################################################################
# EXAMPLE 7: Dataset Al-Shamrani and Dimitrov (2016)
#############################################################################
I <- 39 # number of items
Qmatrix <- scan()
0 0 0 1 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0
0 0 0 0 0 1 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0
0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0
0 1 0 0 0 0 0 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 1 0
Qmatrix <- matrix(Qmatrix, nrow=I, byrow=TRUE)
colnames(Qmatrix) <- paste0("A",1:7)
rownames(Qmatrix) <- paste0("I",1:I)
pars <- scan()
1.952 0.9833 0.1816 1.1053 0.9631 0.1653 1.3904 1.3208 0.2545 0.7391 1.9367 0.2083 2.0833
1.8627 0.1873 1.4139 1.0107 0.2454 0.8274 0.9913 0.2137 1.0338 -0.0068 0.2368 2.4803
0.7939 0.1997 1.4867 1.1705 0.2541 1.4482 1.4176 0.2889 1.0789 0.8062 0.269 1.6258 1.1739
0.1723 1.5995 1.0936 0.2054 1.1814 1.0909 0.2623 2.0389 1.5023 0.2466 1.3636 1.1485 0.2059
1.8468 1.2755 0.192 1.9461 1.4947 0.2001 1.194 0.0889 0.2275 1.2114 0.8925 0.2367 2.0912
0.5961 0.2036 2.5769 1.3014 0.186 1.4554 1.2529 0.2423 1.4919 0.4763 0.2482 2.6787 1.7069
0.1796 1.5611 1.3991 0.2312 1.4353 0.678 0.1851 0.9127 1.3523 0.2525 0.6886 -0.3652 0.207
0.7039 -0.2494 0.2315 1.3683 0.8953 0.2326 1.4992 0.1025 0.2403 1.0727 0.2591 0.2152
1.3854 1.3802 0.2448 0.7748 0.4304 0.184 1.0218 1.8964 0.1949 1.5773 1.8934 0.2231 0.8631
1.4145 0.2132
pars <- matrix(pars, nrow=I, byrow=TRUE)
colnames(pars) <- c("a","b","c")
rownames(pars) <- paste0("I",1:I)
pars <- as.data.frame(pars)
#* Model 1: fit LSDM to 3PL curves (as in Al-Shamrani)
mod1 <- sirt::lsdm(b=pars$b, a=pars$a, c=pars$c, Qmatrix=Qmatrix)
summary(mod1)
plot(mod1)
#* Model 2: fit LSDM to 2PL curves
mod2 <- sirt::lsdm(b=pars$b, a=pars$a, Qmatrix=Qmatrix)
summary(mod2)
plot(mod2)
## End(Not run)
Local Structural Equation Models (LSEM)
Description
Local structural equation models (LSEM) are structural equation models (SEM) which are evaluated for each value of a pre-defined moderator variable (Hildebrandt et al., 2009, 2016). As in nonparametric regression models, observations near a focal point (at which the model is evaluated) receive higher weights, while more distant observations receive lower weights. The LSEM can be specified by making use of lavaan syntax. It is also possible to specify a discretized version of LSEM in which values of the moderator are grouped and a multiple group SEM is specified. The LSEM can be tested by employing a permutation test, see lsem.permutationTest.
The function lsem.MGM.stepfunctions outputs stepwise functions for a multiple group model evaluated at a grid of focal points of the moderator, specified in moderator.grid.
The argument pseudo_weights provides an ad hoc solution to estimate an LSEM for any model which can be fitted in lavaan.
It is also possible to constrain some of the parameters along the values of the moderator in a joint estimation approach (est_joint=TRUE). Parameter names can be specified which are assumed to be invariant (in par_invariant). In addition, linear or quadratic constraints can be imposed on parameters (par_linear or par_quadratic).
Statistical inference in case of joint estimation (but also for separate estimation) can be conducted via bootstrap using the function lsem.bootstrap. Bootstrap at the level of a cluster identifier is allowed (argument cluster).
Usage
lsem.estimate(data, moderator, moderator.grid, lavmodel, type="LSEM", h=1.1, bw=NULL,
residualize=TRUE, fit_measures=c("rmsea", "cfi", "tli", "gfi", "srmr"),
standardized=FALSE, standardized_type="std.all", lavaan_fct="sem",
sufficient_statistics=TRUE, pseudo_weights=0,
sampling_weights=NULL, loc_linear_smooth=TRUE, est_joint=FALSE, par_invariant=NULL,
par_linear=NULL, par_quadratic=NULL, partable_joint=NULL, pw_linear=1,
pw_quadratic=1, pd=TRUE, est_DIF=FALSE, se=NULL, kernel="gaussian",
eps=1e-08, verbose=TRUE, ...)
## S3 method for class 'lsem'
summary(object, file=NULL, digits=3, ...)
## S3 method for class 'lsem'
plot(x, parindex=NULL, ask=TRUE, ci=TRUE, lintrend=TRUE,
parsummary=TRUE, ylim=NULL, xlab=NULL, ylab=NULL, main=NULL,
digits=3, ...)
lsem.MGM.stepfunctions( object, moderator.grid )
# compute local weights
lsem_local_weights(data.mod, moderator.grid, h, sampling_weights=NULL, bw=NULL,
kernel="gaussian")
lsem.bootstrap(object, R=100, verbose=TRUE, cluster=NULL,
repl_design=NULL, repl_factor=NULL, use_starting_values=TRUE,
n.core=1, cl.type="PSOCK")
Arguments
data |
Data frame |
moderator |
Variable name of the moderator |
moderator.grid |
Focal points at which the LSEM should be evaluated. If |
lavmodel |
Specified SEM in lavaan. |
type |
Type of estimated model. The default is |
h |
Bandwidth factor |
bw |
Optional bandwidth parameter if |
residualize |
Logical indicating whether a residualization should be applied. |
fit_measures |
Vector with names of fit measures following the labels in lavaan |
standardized |
Optional logical indicating whether
standardized solution should be included as parameters in
the output using the
|
standardized_type |
Type of standardization if |
lavaan_fct |
String whether
|
sufficient_statistics |
Logical whether sufficient statistics of weighted
means and covariances should be used for model fitting. This option
can be set to |
pseudo_weights |
Integer defining a target sample size. Local weights
are multiplied by a factor which is rounded to integers.
This approach is referred to as a pseudo weighting approach.
For example, using |
sampling_weights |
Optional vector of sampling weights |
loc_linear_smooth |
Logical indicating whether local linear
smoothing should be used for computing sufficient statistics for
means and covariances. The default is |
est_joint |
Logical indicating whether the LSEM should be estimated in a joint estimation approach. This option only works with continuous data and sufficient statistics. |
par_invariant |
Vector of invariant parameters |
par_linear |
Vector of parameters with linear function |
par_quadratic |
Vector of parameters with quadratic function |
partable_joint |
User-defined parameter table if joint estimation is
used ( |
pw_linear |
Number of segments if piecewise linear estimation of parameters is used |
pw_quadratic |
Number of segments if piecewise quadratic estimation of parameters is used |
pd |
Logical indicating whether nearest positive definite covariance matrix should be computed if sufficient statistics are used |
est_DIF |
Logical indicating whether parameters under differential item functioning (DIF) should be additionally computed for invariant item parameters |
se |
Type of standard error used in |
kernel |
Type of kernel function. Can be |
eps |
Minimum value for weights |
verbose |
Optional logical printing information about computation progress. |
object |
Object of class |
file |
A file name in which the summary output will be written. |
digits |
Number of digits. |
x |
Object of class |
parindex |
Vector of indices for parameters in plot function. |
ask |
A logical which asks for changing the graphic for each parameter. |
ci |
Logical indicating whether confidence intervals should be plotted. |
lintrend |
Logical indicating whether a linear trend should be plotted. |
parsummary |
Logical indicating whether a parameter summary should be displayed. |
ylim |
Plot parameter |
xlab |
Plot parameter |
ylab |
Plot parameter |
main |
Plot parameter |
... |
Further arguments to be passed to |
data.mod |
Observed values of the moderator |
R |
Number of bootstrap samples |
cluster |
Optional variable name for bootstrap at the level of a cluster identifier |
repl_design |
Optional matrix containing replication weights for computation of
standard errors. Note that sampling weights have to be already included in
|
repl_factor |
Replication factor in variance formula for statistical inference, e.g., 0.05 in PISA. |
use_starting_values |
Logical indicating whether starting values should be used from the original sample |
n.core |
A scalar indicating the number of cores that should be used. |
cl.type |
The cluster type.
Default value is |
Value
List with the following entries
parameters |
Data frame with all parameters estimated at focal points of
moderator. Bias-corrected estimates under bootstrap can be found in
the column |
weights |
Data frame with weights at each focal point |
parameters_summary |
Summary table for estimated parameters |
parametersM |
Estimated parameters in matrix form. Parameters are in columns and values of the grid of the moderator are in rows. |
bw |
Used bandwidth |
h |
Used bandwidth factor |
N |
Sample size |
moderator.density |
Estimated frequencies and effective sample size for moderator at focal points |
moderator.stat |
Descriptive statistics for moderator |
moderator |
Variable name of moderator |
moderator.grid |
Used grid of focal points for moderator |
moderator.grouped |
Data frame with information about the grouping of the
moderator if |
residualized.intercepts |
Estimated intercept functions used for residualization. |
lavmodel |
Used lavaan model |
data |
Used data frame, possibly residualized if |
model_parameters |
Model parameters in LSEM |
parameters_boot |
Parameter values in each bootstrap sample
(for |
fitstats_joint_boot |
Fit statistics in each bootstrap sample
(for |
dif_effects |
Estimated item parameters under DIF |
Author(s)
Alexander Robitzsch, Oliver Luedtke, Andrea Hildebrandt
References
Hildebrandt, A., Luedtke, O., Robitzsch, A., Sommer, C., & Wilhelm, O. (2016). Exploring factor model parameters across continuous variables with local structural equation models. Multivariate Behavioral Research, 51(2-3), 257-278. doi:10.1080/00273171.2016.1142856
Hildebrandt, A., Wilhelm, O., & Robitzsch, A. (2009). Complementary and competing factor analytic approaches for the investigation of measurement invariance. Review of Psychology, 16, 87-102.
See Also
See lsem.permutationTest
for conducting a permutation test
and lsem.test
for applying a Wald test to a bootstrapped LSEM model.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: data.lsem01 | Age differentiation
#############################################################################
data(data.lsem01, package="sirt")
dat <- data.lsem01
# specify lavaan model
lavmodel <- "
F=~ v1+v2+v3+v4+v5
F ~~ 1*F"
# define grid of moderator variable age
moderator.grid <- seq(4,23,1)
#********************************
#*** Model 1: estimate LSEM with bandwidth 2
mod1 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
lavmodel=lavmodel, h=2, std.lv=TRUE)
summary(mod1)
plot(mod1, parindex=1:5)
# perform permutation test for Model 1
pmod1 <- sirt::lsem.permutationTest( mod1, B=10 )
# only for illustrative purposes the number of permutations B is set
# to a low number of 10
summary(pmod1)
plot(pmod1, type="global")
#* perform permutation test with parallel computation
pmod1a <- sirt::lsem.permutationTest( mod1, B=10, n.core=3 )
summary(pmod1a)
#** estimate Model 1 based on pseudo weights
mod1b <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
lavmodel=lavmodel, h=2, std.lv=TRUE, pseudo_weights=50 )
summary(mod1b)
#** estimation with sampling weights
# generate random sampling weights
set.seed(987)
weights <- stats::runif(nrow(dat), min=.4, max=3 )
mod1c <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
lavmodel=lavmodel, h=2, sampling_weights=weights)
summary(mod1c)
#********************************
#*** Model 2: estimate multiple group model with 4 age groups
# define breaks for age groups
moderator.grid <- seq( 3.5, 23.5, len=5) # 4 groups
# estimate model
mod2 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
lavmodel=lavmodel, type="MGM", std.lv=TRUE)
summary(mod2)
# output step functions
smod2 <- sirt::lsem.MGM.stepfunctions( object=mod2, moderator.grid=seq(4,23,1) )
str(smod2)
#********************************
#*** Model 3: define standardized loadings as derived variables
# specify lavaan model
lavmodel <- "
F=~ a1*v1+a2*v2+a3*v3+a4*v4
v1 ~~ s1*v1
v2 ~~ s2*v2
v3 ~~ s3*v3
v4 ~~ s4*v4
F ~~ 1*F
# standardized loadings
l1 :=a1 / sqrt(a1^2 + s1 )
l2 :=a2 / sqrt(a2^2 + s2 )
l3 :=a3 / sqrt(a3^2 + s3 )
l4 :=a4 / sqrt(a4^2 + s4 )
"
# estimate model
mod3 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
lavmodel=lavmodel, h=2, std.lv=TRUE)
summary(mod3)
plot(mod3)
#********************************
#*** Model 4: estimate LSEM and automatically include standardized solutions
lavmodel <- "
F=~ 1*v1+v2+v3+v4
F ~~ F"
mod4 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
lavmodel=lavmodel, h=2, standardized=TRUE)
summary(mod4)
# permutation test (use only few permutations for testing purposes)
pmod1 <- sirt::lsem.permutationTest( mod4, B=3 )
#**** compute LSEM local weights
wgt <- sirt::lsem_local_weights(data.mod=dat$age, moderator.grid=moderator.grid,
h=2)$weights
str(wgt)
#********************************
#*** Model 5: invariance parameter constraints and other constraints
lavmodel <- "
F=~ 1*v1+v2+v3+v4
F ~~ F"
moderator.grid <- seq(4,23,4)
#- estimate model without constraints
mod5a <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
lavmodel=lavmodel, h=2, standardized=TRUE)
summary(mod5a)
# extract parameter names
mod5a$model_parameters
#- invariance constraints on residual variances
par_invariant <- c("F=~v2","v2~~v2")
mod5b <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
lavmodel=lavmodel, h=2, standardized=TRUE, par_invariant=par_invariant)
summary(mod5b)
#- bootstrap for statistical inference
bmod5b <- sirt::lsem.bootstrap(mod5b, R=100)
# inspect parameter values and standard errors
bmod5b$parameters
#- bootstrap using parallel computing (i.e., multiple cores)
bmod5ba <- sirt::lsem.bootstrap(mod5b, R=100, n.core=3)
#- user-defined replication design
R <- 100 # bootstrap samples
N <- nrow(dat)
repl_design <- matrix(0, nrow=N, ncol=R)
for (rr in 1:R){
indices <- sort( sample(1:N, replace=TRUE) )
repl_design[,rr] <- sapply(1:N, FUN=function(ii){ sum(indices==ii) } )
}
head(repl_design)
bmod5b1 <- sirt::lsem.bootstrap(mod5a, repl_design=repl_design, repl_factor=1/R)
#- compare model mod5b with joint estimation without constraints
mod5c <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
lavmodel=lavmodel, h=2, standardized=TRUE, est_joint=TRUE)
summary(mod5c)
#- linear and quadratic functions
par_invariant <- c("F=~v1","v2~~v2")
par_linear <- c("v1~~v1")
par_quadratic <- c("v4~~v4")
mod5d <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
lavmodel=lavmodel, h=2, par_invariant=par_invariant, par_linear=par_linear,
par_quadratic=par_quadratic)
summary(mod5d)
#- user-defined constraints: step functions for parameters
# inspect parameter table (from lavaan) of fitted model
pj <- mod5d$partable_joint
#* modify parameter table for user-defined constraints
# define step function for F=~v1 which is constant on intervals 1:4 and 5:7
pj2 <- pj[ pj$con==1, ]
pj2[ c(5,6), "lhs" ] <- "p1g5"
pj2 <- pj2[ -4, ]
# assumed split: pj1 contains the remaining (unmodified) rows of the table
pj1 <- pj[ pj$con==0, ]
partable_joint <- rbind(pj1, pj2)
# estimate model with constraints
mod5e <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
lavmodel=lavmodel, h=2, std.lv=TRUE, estimator="ML",
partable_joint=partable_joint)
summary(mod5e)
#############################################################################
# EXAMPLE 2: data.lsem01 | FIML with missing data
#############################################################################
data(data.lsem01)
dat <- data.lsem01
# induce artificial missing values
set.seed(98)
dat[ stats::runif(nrow(dat)) < .5, c("v1")] <- NA
dat[ stats::runif(nrow(dat)) < .25, c("v2")] <- NA
# specify lavaan model
lavmodel1 <- "
F=~ v1+v2+v3+v4+v5
F ~~ 1*F"
# define grid of moderator variable age
moderator.grid <- seq(4,23,2)
#*** estimate LSEM with FIML
mod1 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
lavmodel=lavmodel1, h=2, std.lv=TRUE, estimator="ML", missing="fiml")
summary(mod1)
#############################################################################
# EXAMPLE 3: data.lsem01 | WLSMV estimation
#############################################################################
data(data.lsem01)
dat <- data.lsem01
# create artificial dichotomous data
for (vv in 2:6){
dat[,vv] <- 1*(dat[,vv] > mean(dat[,vv]))
}
# specify lavaan model
lavmodel1 <- "
F=~ v1+v2+v3+v4+v5
F ~~ 1*F
v1 | t1
v2 | t1
v3 | t1
v4 | t1
v5 | t1
"
# define grid of moderator variable age
moderator.grid <- seq(4,23,2)
#*** local WLSMV estimation
mod1 <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
lavmodel=lavmodel1, h=2, std.lv=TRUE, estimator="DWLS", ordered=paste0("v",1:5),
residualize=FALSE, pseudo_weights=10000, parameterization="THETA" )
summary(mod1)
## End(Not run)
Permutation Test for a Local Structural Equation Model
Description
Performs a permutation test for testing the hypothesis that model parameters are independent of a moderator variable (see Hildebrandt, Wilhelm, & Robitzsch, 2009; Hildebrandt, Luedtke, Robitzsch, Sommer, & Wilhelm, 2016).
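The permutation logic itself is generic and can be illustrated with a plain statistic (a toy sketch, unrelated to the LSEM internals): permute the moderator, recompute the statistic, and locate the observed value in the permutation distribution.
set.seed(5)
x <- stats::rnorm(200) ; m <- stats::rnorm(200)
stat_obs <- abs( stats::cor(x, m) )
stat_perm <- replicate( 1000, abs( stats::cor( x, sample(m) ) ) )
mean( stat_perm >= stat_obs )   # permutation p value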
Usage
lsem.permutationTest(lsem.object, B=1000, residualize=TRUE, verbose=TRUE,
n.core=1, cl.type="PSOCK")
## S3 method for class 'lsem.permutationTest'
summary(object, file=NULL, digits=3, ...)
## S3 method for class 'lsem.permutationTest'
plot(x, type="global", stattype="SD",
parindex=NULL, sig_add=TRUE, sig_level=0.05, sig_pch=17, nonsig_pch=2,
sig_cex=1, sig_lab="p value", stat_lab="Test statistic",
moderator_lab=NULL, digits=3, title=NULL, parlabels=NULL,
ask=TRUE, ...)
Arguments
lsem.object |
Fitted object of class |
B |
Number of permutation samples |
residualize |
Optional logical indicating whether residualization of the moderator should be performed for each permutation sample. |
verbose |
Optional logical printing information about computation progress. |
n.core |
A scalar indicating the number of cores that should be used. |
cl.type |
The cluster type.
Default value is |
object |
Object of class |
file |
A file name in which the summary output will be written. |
digits |
Number of digits. |
... |
Further arguments to be passed. |
x |
Object of class |
type |
Type of the statistic to be plotted. If |
stattype |
Type of test statistics. Can be |
parindex |
Vector of indices of selected parameters. |
sig_add |
Logical indicating whether significance values (p values) should be displayed. |
sig_level |
Significance level. |
sig_pch |
Point symbol for significant values. |
nonsig_pch |
Point symbol for non-significant values. |
sig_cex |
Point size for graphic displaying p values |
sig_lab |
Label for significance value (p value). |
stat_lab |
Label of y axis for graphic with pointwise test statistic |
moderator_lab |
Label of the moderator. |
title |
Title of the plot. Can be a vector. |
parlabels |
Labels of the parameters. Can be a vector. |
ask |
A logical which asks for changing the graphic for each parameter. |
Value
List with the following entries
teststat |
Data frame with global test statistics. The statistics
are |
parameters_pointwise_test |
Data frame with pointwise test statistics. |
parameters |
Original parameters. |
parameters |
Parameters in permutation samples. |
parameters_summary |
Original parameter summary. |
parameters_summary_M |
Mean of each parameter in permutation sample. |
parameters_summary_SD |
Standard deviation (SD) statistic in permutation sample. |
parameters_summary_MAD |
Mean absolute deviation (MAD) statistic in permutation sample. |
parameters_summary_MAD |
Linear slope parameter in permutation sample. |
nonconverged_rate |
Percentage of permuted datasets in which the LSEM did not converge |
Author(s)
Alexander Robitzsch, Oliver Luedtke, Andrea Hildebrandt
References
Hildebrandt, A., Luedtke, O., Robitzsch, A., Sommer, C., & Wilhelm, O. (2016). Exploring factor model parameters across continuous variables with local structural equation models. Multivariate Behavioral Research, 51(2-3), 257-278. doi:10.1080/00273171.2016.1142856
Hildebrandt, A., Wilhelm, O., & Robitzsch, A. (2009). Complementary and competing factor analytic approaches for the investigation of measurement invariance. Review of Psychology, 16, 87-102.
See Also
For Examples see lsem.estimate
.
Test a Local Structural Equation Model Based on Bootstrap
Description
Performs global and parameter tests for a fitted local structural equation model. The LSEM must have been fitted and bootstrap estimates of the LSEM model must be available for statistical inference. The hypothesis of a constant parameter is tested by means of a Wald test. Moreover, regression functions can be specified and tested if they are specified in the argument models.
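The Wald test of a constant parameter can be sketched as follows (a generic helper under the assumption of estimates est at G focal points with bootstrap covariance matrix V; hypothetical function, not the package internals):
wald_constant <- function(est, V){
    G <- length(est)
    # contrasts of adjacent focal points; H0: all contrasts are zero
    C <- cbind( diag(G-1), 0 ) - cbind( 0, diag(G-1) )
    d <- C %*% est
    W <- as.numeric( t(d) %*% solve( C %*% V %*% t(C) ) %*% d )
    c( W=W, df=G-1, p=1 - stats::pchisq( W, df=G-1 ) )
}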
Usage
lsem.test(mod, bmod, models=NULL)
Arguments
mod |
Fitted LSEM object |
bmod |
Fitted LSEM bootstrap object. The argument |
models |
List of model formulas for named LSEM model parameters |
Value
List with the following entries
wald_test_global |
Global Wald test for model parameters |
test_models |
Output for fitted regression models |
parameters |
Original model parameters after fitting (i.e., smoothing) a particular
parameter using a regression model specified in |
parameters_boot |
Bootstrapped model parameters after fitting (i.e., smoothing)
a particular parameter using a regression model specified in |
See Also
See also lsem.estimate
for estimating LSEM models and
lsem.bootstrap
for bootstrapping LSEM models.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: data.lsem01 | Age differentiation and tested models
#############################################################################
data(data.lsem01, package="sirt")
dat <- data.lsem01
# specify lavaan model
lavmodel <- "
F=~ v1+v2+v3+v4+v5
F ~~ 1*F
"
# define grid of moderator variable age
moderator.grid <- seq(4,23,1)
#-- estimate LSEM with bandwidth 2
mod <- sirt::lsem.estimate( dat, moderator="age", moderator.grid=moderator.grid,
lavmodel=lavmodel, h=2, std.lv=TRUE)
summary(mod)
#-- bootstrap model
bmod <- sirt::lsem.bootstrap(mod, R=200)
#-- test models
models <- list( "F=~v1"=y ~ m + I(m^2),
"F=~v2"=y ~ I( splines::bs(m, df=4) ) )
tmod <- sirt::lsem.test(mod=mod, bmod=bmod, models=models)
str(tmod)
sirt::print_digits(tmod$wald_test_global, 3)
sirt::print_digits(tmod$test_models, 3)
## End(Not run)
True-Score Reliability for Dichotomous Data
Description
This function computes the marginal true-score reliability for dichotomous data (Dimitrov, 2003; May & Nicewander, 1994) for the four-parameter logistic item response model (see rasch.mml2 for details regarding this IRT model).
Usage
marginal.truescore.reliability(b, a=1+0*b,c=0*b,d=1+0*b,
mean.trait=0, sd.trait=1, theta.k=seq(-6,6,len=200) )
Arguments
b |
Vector of item difficulties |
a |
Vector of item discriminations |
c |
Vector of guessing parameters |
d |
Vector of upper asymptotes |
mean.trait |
Mean of trait distribution |
sd.trait |
Standard deviation of trait distribution |
theta.k |
Grid at which the trait distribution should be evaluated |
Value
A list with the following entries:
rel.test |
Reliability of the test |
item |
Data frame with item statistics (proportion correct, true score variance, error variance, item reliability) |
pi |
Average proportion correct for all items and persons |
sig2.tau |
True score variance |
sig2.error |
Error variance |
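The quantities above can be reproduced conceptually by numerical integration over the trait distribution (a sketch for probit-type curves with illustrative parameters, not the package code):
theta.k <- seq( -6, 6, len=200 )
wgt <- stats::dnorm( theta.k ) ; wgt <- wgt / sum(wgt)
b <- c( -.5, 0, .5 ) ; a <- c( 1, 1.2, .8 )
P <- stats::pnorm( outer( theta.k, a ) - matrix( b, length(theta.k), 3, byrow=TRUE ) )
tau <- rowSums(P)                                   # true score at each theta
sig2.tau <- sum( wgt * ( tau - sum( wgt*tau ) )^2 ) # true score variance
sig2.error <- sum( wgt * rowSums( P*(1-P) ) )       # error variance
sig2.tau / ( sig2.tau + sig2.error )                # marginal reliability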
References
Dimitrov, D. (2003). Marginal true-score measures and reliability for binary items as a function of their IRT parameters. Applied Psychological Measurement, 27, 440-458.
May, K., & Nicewander, W. A. (1994). Reliability and information functions for percentile ranks. Journal of Educational Measurement, 31, 313-325.
See Also
See greenyang.reliability
for calculating the reliability
for multidimensional measures.
Examples
#############################################################################
# EXAMPLE 1: Dimitrov (2003) Table 1 - 2PL model
#############################################################################
# item discriminations
a <- 1.7*c(0.449,0.402,0.232,0.240,0.610,0.551,0.371,0.321,0.403,0.434,0.459,
0.410,0.302,0.343,0.225,0.215,0.487,0.608,0.341,0.465)
# item difficulties
b <- c( -2.554,-2.161,-1.551,-1.226,-0.127,-0.855,-0.568,-0.277,-0.017,
0.294,0.532,0.773,1.004,1.250,1.562,1.385,2.312,2.650,2.712,3.000 )
marginal.truescore.reliability( b=b, a=a )
## Reliability=0.606
#############################################################################
# EXAMPLE 2: Dimitrov (2003) Table 2
# 3PL model: Poetry items (4 items)
#############################################################################
# slopes, difficulties and guessing parameters
a <- 1.7*c(1.169,0.724,0.554,0.706 )
b <- c(0.468,-1.541,-0.042,0.698 )
c <- c(0.159,0.211,0.197,0.177 )
res <- sirt::marginal.truescore.reliability( b=b, a=a, c=c)
## Reliability=0.403
## > round( res$item, 3 )
## item pi sig2.tau sig2.error rel.item
## 1 1 0.463 0.063 0.186 0.252
## 2 2 0.855 0.017 0.107 0.135
## 3 3 0.605 0.026 0.213 0.107
## 4 4 0.459 0.032 0.216 0.130
#############################################################################
# EXAMPLE 3: Reading Data
#############################################################################
data( data.read)
#***
# Model 1: 1PL
mod <- sirt::rasch.mml2( data.read )
sirt::marginal.truescore.reliability( b=mod$item$b )
## Reliability=0.653
#***
# Model 2: 2PL
mod <- sirt::rasch.mml2( data.read, est.a=1:12 )
sirt::marginal.truescore.reliability( b=mod$item$b, a=mod$item$a)
## Reliability=0.696
## Not run:
# compare results with Cronbach's alpha and McDonald's omega
# posing a 'wrong model' for normally distributed data
library(psych)
psych::omega(data.read, nfactors=1) # 1 factor
## Omega_h for 1 factor is not meaningful, just omega_t
## Omega
## Call: omega(m=dat, nfactors=1)
## Alpha: 0.69
## G.6: 0.7
## Omega Hierarchical: 0.66
## Omega H asymptotic: 0.95
## Omega Total 0.69
##! Note that alpha in psych is the standardized one.
## End(Not run)
Some Matrix Functions
Description
Some matrix functions which are written in Rcpp for speed reasons.
Usage
rowMaxs.sirt(matr) # rowwise maximum
rowMins.sirt(matr) # rowwise minimum
rowCumsums.sirt(matr) # rowwise cumulative sum
colCumsums.sirt(matr) # columnwise cumulative sum
rowIntervalIndex.sirt(matr,rn) # first index in row nn when matr(nn,zz) > rn(nn)
rowKSmallest.sirt(matr, K, break.ties=TRUE) # k smallest elements in a row
rowKSmallest2.sirt(matr, K )
Arguments
matr |
A numeric matrix |
rn |
A vector, usually a random number in applications |
K |
An integer indicating the number of smallest elements to be extracted |
break.ties |
A logical which indicates if ties are randomly
broken. The default is |
Details
The function rowIntervalIndex.sirt searches, for all rows n, the first index i for which matr(n,i) > rn(n) holds.
The functions rowKSmallest.sirt and rowKSmallest2.sirt extract the K smallest entries in a matrix row. For small numbers of K, the function rowKSmallest2.sirt is the faster one.
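For intuition, a pure-R equivalent of rowIntervalIndex.sirt might look like this (a sketch; the Rcpp version is much faster):
row_interval_index <- function(matr, rn){
    # rn is recycled rowwise, so matr > rn compares matr[n,i] with rn[n]
    apply( matr > rn, 1, FUN=function(z){ which(z)[1] } )
}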
Value
The output of rowMaxs.sirt is a list with the elements maxval (rowwise maximum values) and maxind (rowwise maximum indices). The output of rowMins.sirt contains the corresponding minimum values in the entries minval and minind.
The output of rowKSmallest.sirt are two matrices: smallval contains the K smallest values, whereas smallind contains the K smallest indices.
Author(s)
Alexander Robitzsch
The Rcpp code for rowCumsums.sirt
is copied from code of
Romain Francois
(https://lists.r-forge.r-project.org/pipermail/rcpp-devel/2010-October/001198.html).
See Also
For other matrix functions see the matrixStats package.
Examples
#############################################################################
# EXAMPLE 1: a small toy example (I)
#############################################################################
set.seed(789)
N1 <- 10 ; N2 <- 4
M1 <- round( matrix( runif(N1*N2), nrow=N1, ncol=N2), 1 )
rowMaxs.sirt(M1) # rowwise maximum
rowMins.sirt(M1) # rowwise minimum
rowCumsums.sirt(M1) # rowwise cumulative sum
# row index for exceeding a certain threshold value
matr <- M1
matr <- matr / rowSums( matr )
matr <- sirt::rowCumsums.sirt( matr )
rn <- runif(N1) # generate random numbers
rowIntervalIndex.sirt(matr,rn)
# select the two smallest values
rowKSmallest.sirt(matr=M1, K=2)
rowKSmallest2.sirt(matr=M1, K=2)
Some Methods for Objects of Class mcmc.list
Description
Some methods for objects of class mcmc.list created by the coda package.
Usage
## coefficients
mcmc_coef(mcmcobj, exclude="deviance")
## covariance matrix
mcmc_vcov(mcmcobj, exclude="deviance")
## confidence interval
mcmc_confint( mcmcobj, parm, level=.95, exclude="deviance" )
## summary function
mcmc_summary( mcmcobj, quantiles=c(.025,.05,.50,.95,.975) )
## plot function
mcmc_plot(mcmcobj, ...)
## inclusion of derived parameters in mcmc object
mcmc_derivedPars( mcmcobj, derivedPars )
## Wald test for parameters
mcmc_WaldTest( mcmcobj, hypotheses )
## S3 method for class 'mcmc_WaldTest'
summary(object, digits=3, ...)
Arguments
mcmcobj |
Objects of class |
exclude |
Vector of parameters which should be excluded in calculations |
parm |
Optional vector of parameters |
level |
Confidence level |
quantiles |
Vector of quantiles to be computed. |
... |
Parameters to be passed to |
derivedPars |
List with derived parameters (see examples). |
hypotheses |
List with hypotheses of the form
|
object |
Object of class |
digits |
Number of digits used for rounding. |
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Logistic regression in rcppbugs package
#############################################################################
#***************************************
# (1) simulate data
set.seed(8765)
N <- 500
x1 <- stats::rnorm(N)
x2 <- stats::rnorm(N)
y <- 1*( stats::plogis( -.6 + .7*x1 + 1.1 *x2 ) > stats::runif(N) )
#***************************************
# (2) estimate logistic regression with glm
mod <- stats::glm( y ~ x1 + x2, family="binomial" )
summary(mod)
#***************************************
# (3) estimate model with rcppbugs package
library(rcppbugs)
b <- rcppbugs::mcmc.normal( stats::rnorm(3),mu=0,tau=0.0001)
y.hat <- rcppbugs::deterministic( function(x1,x2,b){
stats::plogis( b[1] + b[2]*x1 + b[3]*x2 ) },
x1, x2, b)
y.lik <- rcppbugs::mcmc.bernoulli( y, p=y.hat, observed=TRUE)
model <- rcppbugs::create.model(b, y.hat, y.lik)
#*** estimate model in rcppbugs; 2000 iterations, 500 burnin iterations
n.burnin <- 500 ; n.iter <- 2000 ; thin <- 2
ans <- rcppbugs::run.model(model, iterations=n.iter, burn=n.burnin, adapt=200, thin=thin)
print(rcppbugs::get.ar(ans)) # get acceptance rate
print(apply(ans[["b"]],2,mean)) # get means of posterior
#*** convert rcppbugs into mcmclist object
mcmcobj <- data.frame( ans$b )
colnames(mcmcobj) <- paste0("b",1:3)
mcmcobj <- as.matrix(mcmcobj)
class(mcmcobj) <- "mcmc"
attr(mcmcobj, "mcpar") <- c( n.burnin+1, n.iter, thin )
mcmcobj <- coda::mcmc( mcmcobj )
# coefficients, variance covariance matrix and confidence interval
mcmc_coef(mcmcobj)
mcmc_vcov(mcmcobj)
mcmc_confint( mcmcobj, level=.90 )
# summary and plot
mcmc_summary(mcmcobj)
mcmc_plot(mcmcobj, ask=TRUE)
# include derived parameters in mcmc object
derivedPars <- list( "diff12"=~ I(b2-b1), "diff13"=~ I(b3-b1) )
mcmcobj2 <- sirt::mcmc_derivedPars(mcmcobj, derivedPars=derivedPars )
mcmc_summary(mcmcobj2)
#*** Wald test for parameters
# hyp1: b2 - 0.5=0
# hyp2: b2 * b3=0
hypotheses <- list( "hyp1"=~ I( b2 - .5 ), "hyp2"=~ I( b2*b3 ) )
test1 <- sirt::mcmc_WaldTest( mcmcobj, hypotheses=hypotheses )
summary(test1)
## End(Not run)
Computation of the Rhat Statistic from a Single MCMC Chain
Description
Computes the Rhat statistic from a single MCMC chain.
Usage
mcmc_Rhat(mcmc_object, n_splits=3)
Arguments
mcmc_object |
Object of class |
n_splits |
Number of splits for MCMC chain |
Value
Numeric vector
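The split-chain logic can be sketched as follows (a minimal version of the classical Gelman-Rubin formula under the assumption of equal-length splits; sirt::mcmc_Rhat may differ in details):
split_rhat <- function(x, n_splits=3){
    n <- floor( length(x) / n_splits )
    chains <- matrix( x[ 1:(n*n_splits) ], nrow=n )   # one split per column
    W <- mean( apply( chains, 2, stats::var ) )       # within-split variance
    B <- n * stats::var( colMeans(chains) )           # between-split variance
    sqrt( ( (n-1)/n * W + B/n ) / W )
}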
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Computation Rhat statistic for 2PNO model fitting by MCMC
#############################################################################
data(data.read)
# estimate 2PNO with MCMC with 1000 iterations and 100 burn-in iterations
mod <- sirt::mcmc.2pno( dat=data.read, iter=1000, burnin=100 )
# plot MCMC chains
plot( mod$mcmcobj, ask=TRUE )
# compute Rhat statistics
round( sirt::mcmc_Rhat( mod$mcmcobj[[1]] ), 3 )
## End(Not run)
MCMC Estimation of the Two-Parameter Normal Ogive Item Response Model
Description
This function estimates the Two-Parameter normal ogive item response model by MCMC sampling (Johnson & Albert, 1999, p. 195ff.).
Usage
mcmc.2pno(dat, weights=NULL, burnin=500, iter=1000, N.sampvalues=1000,
progress.iter=50, save.theta=FALSE)
Arguments
dat |
Data frame with dichotomous item responses |
weights |
An optional vector with student sample weights |
burnin |
Number of burnin iterations |
iter |
Total number of iterations |
N.sampvalues |
Maximum number of sampled values to save |
progress.iter |
Display progress every |
save.theta |
Should theta values be saved? |
Details
The two-parameter normal ogive item response model with a probit link function is defined by
P(X_{pi}=1 | \theta_p )=\Phi ( a_i \theta_p - b_i ) \quad, \quad \theta_p \sim N(0,1)
Note that in this implementation non-informative priors for the item parameters are chosen (Johnson & Albert, 1999, p. 195ff.).
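A small simulation sketch of this response model (illustrative parameter values):
set.seed(1)
N <- 500 ; I <- 5
a <- stats::runif( I, .8, 1.5 )
b <- seq( -1, 1, len=I )
theta <- stats::rnorm(N)
P <- stats::pnorm( outer( theta, a ) - matrix( b, N, I, byrow=TRUE ) )
dat <- 1 * ( P > matrix( stats::runif(N*I), N, I ) )
# mod <- sirt::mcmc.2pno( dat=dat, iter=1000, burnin=200 )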
Value
A list of class mcmc.sirt
with the following entries:
mcmcobj |
Object of class |
summary.mcmcobj |
Summary of the |
burnin |
Number of burnin iterations |
iter |
Total number of iterations |
a.chain |
Sampled values of |
b.chain |
Sampled values of |
theta.chain |
Sampled values of |
deviance.chain |
Sampled deviance values |
EAP.rel |
EAP reliability |
person |
Data frame with EAP person parameter estimates for
|
dat |
Used data frame |
weights |
Used student weights |
... |
Further values |
References
Johnson, V. E., & Albert, J. H. (1999). Ordinal Data Modeling. New York: Springer.
See Also
S3 methods: summary.mcmc.sirt
, plot.mcmc.sirt
For estimating the 2PL model with marginal maximum likelihood see
rasch.mml2
or smirt
.
A hierarchical version of this model can be estimated with
mcmc.2pnoh
.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Dataset Reading
#############################################################################
data(data.read)
# estimate 2PNO with MCMC with 3000 iterations and 500 burn-in iterations
mod <- sirt::mcmc.2pno( dat=data.read, iter=3000, burnin=500 )
# plot MCMC chains
plot( mod$mcmcobj, ask=TRUE )
# write sampled chains into codafile
mcmclist2coda( mod$mcmcobj, name="dataread_2pno" )
# summary
summary(mod)
#############################################################################
# EXAMPLE 2
#############################################################################
# simulate data
N <- 1000
I <- 10
b <- seq( -1.5, 1.5, len=I )
a <- rep( c(1,2), I/2 )
theta1 <- stats::rnorm(N)
dat <- sirt::sim.raschtype( theta=theta1, fixed.a=a, b=b )
#***
# Model 1: estimate model without weights
mod1 <- sirt::mcmc.2pno( dat, iter=1500, burnin=500)
mod1$summary.mcmcobj
plot( mod1$mcmcobj, ask=TRUE )
#***
# Model 2: estimate model with weights
# define weights
weights <- c( rep( 5, N/4 ), rep( .2, 3/4*N ) )
mod2 <- sirt::mcmc.2pno( dat, weights=weights, iter=1500, burnin=500)
mod2$summary.mcmcobj
## End(Not run)
Random Item Response Model / Multilevel IRT Model
Description
This function enables the estimation of random item models and multilevel (or hierarchical) IRT models (Chaimongkol, Huffer & Kamata, 2007; Fox & Verhagen, 2010; van den Noortgate, de Boeck & Meulders, 2003; Asparouhov & Muthen, 2012; Muthen & Asparouhov, 2013, 2014). Dichotomous response data is supported using a probit link. Normally distributed responses can also be analyzed. See Details for a description of the implemented item response models.
Usage
mcmc.2pno.ml(dat, group, link="logit", est.b.M="h", est.b.Var="n",
est.a.M="f", est.a.Var="n", burnin=500, iter=1000,
N.sampvalues=1000, progress.iter=50, prior.sigma2=c(1, 0.4),
prior.sigma.b=c(1, 1), prior.sigma.a=c(1, 1), prior.omega.b=c(1, 1),
prior.omega.a=c(1, 0.4), sigma.b.init=.3 )
Arguments
dat |
Data frame with item responses. |
group |
Vector of group identifiers (e.g. classes, schools or countries) |
link |
Link function. Choices are |
est.b.M |
Estimation type of |
est.b.Var |
Estimation type of standard deviations of item difficulties |
est.a.M |
Estimation type of |
est.a.Var |
Estimation type of standard deviations of item slopes |
burnin |
Number of burnin iterations |
iter |
Total number of iterations |
N.sampvalues |
Maximum number of sampled values to save |
progress.iter |
Display progress every |
prior.sigma2 |
Prior for Level 2 standard deviation |
prior.sigma.b |
Priors for item difficulty standard deviations |
prior.sigma.a |
Priors for item slope standard deviations |
prior.omega.b |
Prior for |
prior.omega.a |
Prior for |
sigma.b.init |
Initial standard deviation for |
Details
For dichotomous item responses (link="logit") of persons p in group j on item i, the probability of a correct response is defined as
P( X_{pji}=1 | \theta_{pj} )=\Phi ( a_{ij} \theta_{pj} - b_{ij} )
The ability \theta_{pj} is decomposed into a Level 1 and a Level 2 effect:
\theta_{pj}=u_j + e_{pj} \quad, \quad u_j \sim N ( 0, \sigma_{L2}^2 ) \quad, \quad e_{pj} \sim N ( 0, \sigma_{L1}^2 )
In a multilevel IRT model (or a random item model), item parameters are allowed to vary across groups:
b_{ij} \sim N( b_i, \sigma^2_{b,i} ) \quad, \quad a_{ij} \sim N( a_i, \sigma^2_{a,i} )
In a hierarchical IRT model, a hierarchical distribution of the (main) item parameters is assumed:
b_{i} \sim N( \mu_b, \omega^2_{b} ) \quad, \quad a_{i} \sim N( 1, \omega^2_{a} )
Note that for identification purposes, the mean of all item slopes a_i is set to one. The arguments est.b.M, est.b.Var, est.a.M and est.a.Var define which variance components should be estimated.
For normally distributed item responses (link="normal"), the model equations remain the same except for the item response model, which is now written as
X_{pji}=a_{ij} \theta_{pj} - b_{ij} + \varepsilon_{pji} \quad, \quad \varepsilon_{pji} \sim N( 0, \sigma^2_{res,i} )
Value
A list of class mcmc.sirt
with the following entries:
mcmcobj |
Object of class |
summary.mcmcobj |
Summary of the |
ic |
Information criteria (DIC) |
burnin |
Number of burnin iterations |
iter |
Total number of iterations |
theta.chain |
Sampled values of |
deviance.chain |
Sampled deviance values |
EAP.rel |
EAP reliability |
person |
Data frame with EAP person parameter estimates for
|
dat |
Used data frame |
... |
Further values |
References
Asparouhov, T. & Muthen, B. (2012). General random effect latent variable modeling: Random subjects, items, contexts, and parameters. http://www.statmodel.com/papers_date.shtml.
Chaimongkol, S., Huffer, F. W., & Kamata, A. (2007). An explanatory differential item functioning (DIF) model by the WinBUGS 1.4. Songklanakarin Journal of Science and Technology, 29, 449-458.
Fox, J.-P., & Verhagen, A.-J. (2010). Random item effects modeling for cross-national survey data. In E. Davidov, P. Schmidt, & J. Billiet (Eds.), Cross-cultural Analysis: Methods and Applications (pp. 467-488), London: Routledge Academic.
Muthen, B. & Asparouhov, T. (2013). New methods for the study of measurement invariance with many groups. http://www.statmodel.com/papers_date.shtml
Muthen, B. & Asparouhov, T. (2014). Item response modeling in Mplus: A multi-dimensional, multi-level, and multi-timepoint example. In W. Linden & R. Hambleton (2014). Handbook of item response theory: Models, statistical tools, and applications. http://www.statmodel.com/papers_date.shtml
van den Noortgate, W., De Boeck, P., & Meulders, M. (2003). Cross-classification multilevel logistic models in psychometrics. Journal of Educational and Behavioral Statistics, 28, 369-386.
See Also
S3 methods: summary.mcmc.sirt
, plot.mcmc.sirt
For MCMC estimation of three-parameter (testlet) models see
mcmc.3pno.testlet
.
See also the MLIRT package (http://www.jean-paulfox.com).
For more flexible estimation of multilevel IRT models see the MCMCglmm and lme4 packages.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Dataset Multilevel data.ml1 - dichotomous items
#############################################################################
data(data.ml1)
dat <- data.ml1[,-1]
group <- data.ml1$group
# for a first try, use a very small number of iterations
burnin <- 50 ; iter <- 100
#***
# Model 1: 1PNO with no cluster item effects
mod1 <- sirt::mcmc.2pno.ml( dat, group, est.b.Var="n", burnin=burnin, iter=iter )
summary(mod1) # summary
plot(mod1,layout=2,ask=TRUE) # plot results
# write results to coda file
mcmclist2coda( mod1$mcmcobj, name="data.ml1_mod1" )
#***
# Model 2: 1PNO with cluster item effects of item difficulties
mod2 <- sirt::mcmc.2pno.ml( dat, group, est.b.Var="i", burnin=burnin, iter=iter )
summary(mod2)
plot(mod2, ask=TRUE, layout=2 )
#***
# Model 3: 2PNO with cluster item effects of item difficulties but
# joint item slopes
mod3 <- sirt::mcmc.2pno.ml( dat, group, est.b.Var="i", est.a.M="h",
burnin=burnin, iter=iter )
summary(mod3)
#***
# Model 4: 2PNO with cluster item effects of item difficulties and
# cluster item effects with a jointly estimated SD
mod4 <- sirt::mcmc.2pno.ml( dat, group, est.b.Var="i", est.a.M="h",
est.a.Var="j", burnin=burnin, iter=iter )
summary(mod4)
#############################################################################
# EXAMPLE 2: Dataset Multilevel data.ml2 - polytomous items
# assuming a normal distribution for polytomous items
#############################################################################
data(data.ml2)
dat <- data.ml2[,-1]
group <- data.ml2$group
# set iterations for all examples (too few!!)
burnin <- 100 ; iter <- 500
#***
# Model 1: no intercept variance, no slopes
mod1 <- sirt::mcmc.2pno.ml( dat=dat, group=group, est.b.Var="n",
burnin=burnin, iter=iter, link="normal", progress.iter=20 )
summary(mod1)
#***
# Model 2a: itemwise intercept variance, no slopes
mod2a <- sirt::mcmc.2pno.ml( dat=dat, group=group, est.b.Var="i",
burnin=burnin, iter=iter,link="normal", progress.iter=20 )
summary(mod2a)
#***
# Model 2b: homogeneous intercept variance, no slopes
mod2b <- sirt::mcmc.2pno.ml( dat=dat, group=group, est.b.Var="j",
burnin=burnin, iter=iter,link="normal", progress.iter=20 )
summary(mod2b)
#***
# Model 3: intercept variance and slope variances
# hierarchical item and slope parameters
mod3 <- sirt::mcmc.2pno.ml( dat=dat, group=group,
est.b.M="h", est.b.Var="i", est.a.M="h", est.a.Var="i",
burnin=burnin, iter=iter,link="normal", progress.iter=20 )
summary(mod3)
#############################################################################
# EXAMPLE 3: Simulated random effects model | dichotomous items
#############################################################################
set.seed(7698)
#*** model parameters
sig2.lev2 <- .3^2 # theta level 2 variance
sig2.lev1 <- .8^2 # theta level 1 variance
G <- 100 # number of groups
n <- 20 # number of persons within a group
I <- 12 # number of items
#*** simulate theta
theta2 <- stats::rnorm( G, sd=sqrt(sig2.lev2) )
theta1 <- stats::rnorm( n*G, sd=sqrt(sig2.lev1) )
theta <- theta1 + rep( theta2, each=n )
#*** item difficulties
b <- seq( -2, 2, len=I )
#*** define group identifier
group <- 1000 + rep(1:G, each=n )
#*** SD of group specific difficulties for items 3 and 5
sigma.item <- rep(0,I)
sigma.item[c(3,5)] <- 1
#*** simulate group specific item difficulties
b.class <- sapply( sigma.item, FUN=function(sii){ stats::rnorm( G, sd=sii ) } )
b.class <- b.class[ rep( 1:G,each=n ), ]
b <- matrix( b, n*G, I, byrow=TRUE ) + b.class
#*** simulate item responses
m1 <- stats::pnorm( theta - b )
dat <- 1 * ( m1 > matrix( stats::runif( n*G*I ), n*G, I ) )
#*** estimate model
mod <- sirt::mcmc.2pno.ml( dat, group=group, burnin=burnin, iter=iter,
est.b.M="n", est.b.Var="i", progress.iter=20)
summary(mod)
plot(mod, layout=2, ask=TRUE )
## End(Not run)
MCMC Estimation of the Hierarchical IRT Model for Criterion-Referenced Measurement
Description
This function estimates the hierarchical IRT model for criterion-referenced measurement which is based on a two-parameter normal ogive response function (Janssen, Tuerlinckx, Meulders & de Boeck, 2000).
Usage
mcmc.2pnoh(dat, itemgroups, prob.mastery=c(.5,.8), weights=NULL,
burnin=500, iter=1000, N.sampvalues=1000,
progress.iter=50, prior.variance=c(1,1), save.theta=FALSE)
Arguments
dat |
Data frame with dichotomous item responses |
itemgroups |
Vector with characters or integers which define the criterion to which an item is associated. |
prob.mastery |
Probability levels which define nonmastery, transition and mastery stage (see Details) |
weights |
An optional vector with student sample weights |
burnin |
Number of burnin iterations |
iter |
Total number of iterations |
N.sampvalues |
Maximum number of sampled values to save |
progress.iter |
Display progress every progress.iter-th iteration |
prior.variance |
Scale parameters of the inverse gamma prior distributions for the variance parameters \sigma^2 and \nu^2 |
save.theta |
Should theta values be saved? |
Details
The hierarchical IRT model for criterion-referenced measurement (Janssen et al., 2000) assumes that every item i intends to measure a criterion k. The item response function is defined as

P(X_{pik}=1 | \theta_p )=\Phi [ \alpha_{ik} ( \theta_p - \beta_{ik} ) ] \quad, \quad \theta_p \sim N(0,1)

Item parameters (\alpha_{ik},\beta_{ik}) are modeled hierarchically, i.e.

\beta_{ik} \sim N( \xi_k, \sigma^2 ) \quad \mbox{and} \quad \alpha_{ik} \sim N( \omega_k, \nu^2 )

In the mcmc.list output object, the derived parameters d_{ik}=\alpha_{ik} \beta_{ik} and \tau_k=\xi_k \omega_k are also calculated.
Mastery and nonmastery probabilities are based on a reference item Y_k of criterion k with response function

P(Y_{pk}=1 | \theta_p )=\Phi [ \omega_k ( \theta_p - \xi_k ) ] \quad, \quad \theta_p \sim N(0,1)

With known item and person parameters, response probabilities for criterion k are calculated. If the response probability for criterion k is larger than prob.mastery[2], a student is classified as a master. If this probability is smaller than prob.mastery[1], the student is classified as a nonmaster. In all other cases, students are in a transition stage.
In the mcmcobj output object, the parameters d[i] are defined by d_{ik}=\alpha_{ik} \cdot \beta_{ik}, while the parameters tau[k] are defined by \tau_k=\xi_k \cdot \omega_k.
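As an illustration of this classification rule, a minimal sketch (with hypothetical parameter values; not the internal sirt code) that derives mastery status from the criterion response function and the prob.mastery thresholds:
# hedged sketch: classify persons for one criterion k
classify_mastery <- function(theta, omega.k, xi.k, prob.mastery=c(.5,.8)){
    # response probability of the reference item of criterion k
    p <- stats::pnorm( omega.k * ( theta - xi.k ) )
    status <- ifelse( p > prob.mastery[2], "master",
                ifelse( p < prob.mastery[1], "nonmaster", "transition" ) )
    data.frame( theta=theta, prob=p, status=status )
}
classify_mastery( theta=c(-1, 0, 1.5), omega.k=.9, xi.k=-.5 )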
Value
A list of class mcmc.sirt
with following entries:
mcmcobj |
Object of class mcmc.list |
summary.mcmcobj |
Summary of the mcmcobj object |
burnin |
Number of burnin iterations |
iter |
Total number of iterations |
alpha.chain |
Sampled values of \alpha_{ik} parameters |
beta.chain |
Sampled values of \beta_{ik} parameters |
xi.chain |
Sampled values of \xi_k parameters |
omega.chain |
Sampled values of \omega_k parameters |
sigma.chain |
Sampled values of \sigma parameters |
nu.chain |
Sampled values of \nu parameters |
theta.chain |
Sampled values of \theta_p parameters |
deviance.chain |
Sampled values of the deviance |
EAP.rel |
EAP reliability |
person |
Data frame with EAP person parameter estimates for \theta_p |
dat |
Used data frame |
weights |
Used student weights |
... |
Further values |
References
Janssen, R., Tuerlinckx, F., Meulders, M., & De Boeck, P. (2000). A hierarchical IRT model for criterion-referenced measurement. Journal of Educational and Behavioral Statistics, 25, 285-306.
See Also
S3 methods: summary.mcmc.sirt, plot.mcmc.sirt
The two-parameter normal ogive model can be estimated with mcmc.2pno.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Simulated data according to Janssen et al. (2000, Table 2)
#############################################################################
N <- 1000
Ik <- c(4,6,8,5,9,6,8,6,5)
xi.k <- c( -.89, -1.13, -1.23, .06, -1.41, -.66, -1.09, .57, -2.44)
omega.k <- c(.98, .91, .76, .74, .71, .80, .79, .82, .54)
# select 4 attributes
K <- 4
Ik <- Ik[1:K] ; xi.k <- xi.k[1:K] ; omega.k <- omega.k[1:K]
sig2 <- 3.02
nu2 <- .09
I <- sum(Ik)
b <- rep( xi.k, Ik ) + stats::rnorm(I, sd=sqrt(sig2) )
a <- rep( omega.k, Ik ) + stats::rnorm(I, sd=sqrt(nu2) )
theta1 <- stats::rnorm(N)
t1 <- rep(1,N)
p1 <- stats::pnorm( outer(t1,a) * ( theta1 - outer(t1,b) ) )
dat <- 1 * ( p1 > stats::runif(N*I) )
itemgroups <- rep( paste0("A", 1:K ), Ik )
# estimate model
mod <- sirt::mcmc.2pnoh(dat, itemgroups, burnin=200, iter=1000 )
# summary
summary(mod)
# plot
plot(mod$mcmcobj, ask=TRUE)
# write coda files
mcmclist2coda( mod$mcmcobj, name="simul_2pnoh" )
## End(Not run)
3PNO Testlet Model
Description
This function estimates the 3PNO testlet model (Wang, Bradlow & Wainer, 2002, 2007) by Markov Chain Monte Carlo methods (Glas, 2012).
Usage
mcmc.3pno.testlet(dat, testlets=rep(NA, ncol(dat)),
weights=NULL, est.slope=TRUE, est.guess=TRUE, guess.prior=NULL,
testlet.variance.prior=c(1, 0.2), burnin=500, iter=1000,
N.sampvalues=1000, progress.iter=50, save.theta=FALSE, save.gamma.testlet=FALSE )
Arguments
dat |
Data frame with dichotomous item responses for N persons and I items |
testlets |
An integer or character vector which indicates the allocation of items to testlets. Identical entries correspond to the same testlet. If an entry is NA, the item is not allocated to a testlet. |
weights |
An optional vector with student sample weights |
est.slope |
Should item slopes be estimated? The default is TRUE. |
est.guess |
Should guessing parameters be estimated? The default is TRUE. |
guess.prior |
A vector of length two or a matrix with I rows and two columns which defines the Beta prior distribution of the guessing parameters |
testlet.variance.prior |
A vector of length two which defines the (joint) prior for testlet variances assuming an inverse chi-squared distribution. The first entry is the effective sample size of the prior while the second entry defines the prior variance of the testlet. The default of c(1, 0.2) corresponds to a prior sample size of 1 and a prior testlet variance of 0.2. |
burnin |
Number of burnin iterations |
iter |
Number of iterations |
N.sampvalues |
Maximum number of sampled values to save |
progress.iter |
Display progress every progress.iter-th iteration |
save.theta |
Logical indicating whether theta values should be saved |
save.gamma.testlet |
Logical indicating whether gamma values should be saved |
Details
The testlet response model for person p at item i is defined as

P(X_{pi}=1 )=c_i + ( 1 - c_i ) \Phi ( a_i \theta_p + \gamma_{p,t(i)} + b_i ) \quad, \quad \theta_p \sim N(0,1), \; \gamma_{p,t(i)} \sim N( 0, \sigma^2_t )

In case of est.slope=FALSE, all item slopes a_i are set to 1, and a variance \sigma^2 of the \theta_p distribution is estimated; this model is known in the literature as the Rasch testlet model (Wang & Wilson, 2005). In case of est.guess=FALSE, all guessing parameters c_i are set to 0.
After fitting the testlet model, marginal item parameters are calculated (integrating out the testlet effects \gamma_{p,t(i)}) according to the defining response equation

P(X_{pi}=1 )=c_i + ( 1 - c_i ) \Phi ( a_i^\ast \theta_p + b_i^\ast )
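The marginal parameters follow from the usual probit marginalization over \gamma_{p,t(i)} \sim N(0, \sigma_t^2), i.e., dividing a_i and b_i by \sqrt{1+\sigma_t^2}. A minimal sketch of this computation (assuming this standard scaling result; not taken from the sirt internals):
# hedged sketch: marginal item parameters after integrating out the
# testlet effect gamma ~ N(0, sigma2t)
marginal_pars <- function(a, b, sigma2t){
    k <- sqrt( 1 + sigma2t )
    data.frame( a.marg=a/k, b.marg=b/k )
}
marginal_pars( a=c(1, 1.2), b=c(-0.4, 0.3), sigma2t=0.5 )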
Value
A list of class mcmc.sirt
with following entries:
mcmcobj |
Object of class mcmc.list |
summary.mcmcobj |
Summary of the mcmcobj object |
ic |
Information criteria (DIC) |
burnin |
Number of burnin iterations |
iter |
Total number of iterations |
theta.chain |
Sampled values of \theta_p parameters |
deviance.chain |
Sampled values of the deviance |
EAP.rel |
EAP reliability |
person |
Data frame with EAP person parameter estimates for \theta_p |
dat |
Used data frame |
weights |
Used student weights |
... |
Further values |
References
Glas, C. A. W. (2012). Estimating and testing the extended testlet model. LSAC Research Report Series, RR 12-03.
Wainer, H., Bradlow, E. T., & Wang, X. (2007). Testlet response theory and its applications. Cambridge: Cambridge University Press.
Wang, W.-C., & Wilson, M. (2005). The Rasch testlet model. Applied Psychological Measurement, 29, 126-149.
Wang, X., Bradlow, E. T., & Wainer, H. (2002). A general Bayesian model for testlets: Theory and applications. Applied Psychological Measurement, 26, 109-128.
See Also
S3 methods: summary.mcmc.sirt, plot.mcmc.sirt
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Dataset Reading
#############################################################################
data(data.read)
dat <- data.read
I <- ncol(dat)
# set burnin and total number of iterations here (CHANGE THIS!)
burnin <- 200
iter <- 500
#***
# Model 1: 1PNO model
mod1 <- sirt::mcmc.3pno.testlet( dat, est.slope=FALSE, est.guess=FALSE,
burnin=burnin, iter=iter )
summary(mod1)
plot(mod1,ask=TRUE) # plot MCMC chains in coda style
plot(mod1,ask=TRUE, layout=2) # plot MCMC output in different layout
#***
# Model 2: 3PNO model with Beta(5,17) prior for guessing parameters
mod2 <- sirt::mcmc.3pno.testlet( dat, guess.prior=c(5,17),
burnin=burnin, iter=iter )
summary(mod2)
#***
# Model 3: Rasch (1PNO) testlet model
testlets <- substring( colnames(dat), 1, 1 )
mod3 <- sirt::mcmc.3pno.testlet( dat, testlets=testlets, est.slope=FALSE,
est.guess=FALSE, burnin=burnin, iter=iter )
summary(mod3)
#***
# Model 4: 3PNO testlet model with (almost) fixed guessing parameters .25
mod4 <- sirt::mcmc.3pno.testlet( dat, guess.prior=1000*c(25,75), testlets=testlets,
burnin=burnin, iter=iter )
summary(mod4)
plot(mod4, ask=TRUE, layout=2)
#############################################################################
# EXAMPLE 2: Simulated data according to the Rasch testlet model
#############################################################################
set.seed(678)
N <- 3000 # number of persons
I <- 4 # number of items per testlet
TT <- 3 # number of testlets
ITT <- I*TT
b <- round( stats::rnorm( ITT, mean=0, sd=1 ), 2 )
sd0 <- 1 # sd trait
sdt <- seq( 0, 2, len=TT ) # sd testlets
# simulate theta
theta <- stats::rnorm( N, sd=sd0 )
# simulate testlets
ut <- matrix(0,nrow=N, ncol=TT )
for (tt in 1:TT){
ut[,tt] <- stats::rnorm( N, sd=sdt[tt] )
}
ut <- ut[, rep(1:TT,each=I) ]
# calculate response probability
prob <- matrix( stats::pnorm( theta + ut + matrix( b, nrow=N, ncol=ITT,
byrow=TRUE ) ), N, ITT)
Y <- (matrix( stats::runif(N*ITT), N, ITT) < prob )*1
colMeans(Y)
# define testlets
testlets <- rep(1:TT, each=I )
burnin <- 300
iter <- 1000
#***
# Model 1: 1PNO testlet model (Rasch testlet model)
mod1 <- sirt::mcmc.3pno.testlet( dat=Y, est.slope=FALSE, est.guess=FALSE,
burnin=burnin, iter=iter, testlets=testlets )
summary(mod1)
summ1 <- mod1$summary.mcmcobj
# compare item parameters
cbind( b, summ1[ grep("b", summ1$parameter ), "Mean" ] )
# Testlet standard deviations
cbind( sdt, summ1[ grep("sigma\.testlet", summ1$parameter ), "Mean" ] )
#***
# Model 2: 2PNO testlet model (estimated item slopes)
mod2 <- sirt::mcmc.3pno.testlet( dat=Y, est.slope=TRUE, est.guess=FALSE,
burnin=burnin, iter=iter, testlets=testlets )
summary(mod2)
summ2 <- mod2$summary.mcmcobj
# compare item parameters
cbind( b, summ2[ grep("b\[", summ2$parameter ), "Mean" ] )
# item discriminations
cbind( sd0, summ2[ grep("a\[", summ2$parameter ), "Mean" ] )
# Testlet standard deviations
cbind( sdt, summ2[ grep("sigma\.testlet", summ2$parameter ), "Mean" ] )
#############################################################################
# EXAMPLE 3: Simulated data according to the 2PNO testlet model
#############################################################################
set.seed(678)
N <- 3000 # number of persons
I <- 3 # number of items per testlet
TT <- 5 # number of testlets
ITT <- I*TT
b <- round( stats::rnorm( ITT, mean=0, sd=1 ), 2 )
a <- round( stats::runif( ITT, 0.5, 2 ),2)
sdt <- seq( 0, 2, len=TT ) # sd testlets
sd0 <- 1
# simulate theta
theta <- stats::rnorm( N, sd=sd0 )
# simulate testlets
ut <- matrix(0,nrow=N, ncol=TT )
for (tt in 1:TT){
ut[,tt] <- stats::rnorm( N, sd=sdt[tt] )
}
ut <- ut[, rep(1:TT,each=I) ]
# calculate response probability
bM <- matrix( b, nrow=N, ncol=ITT, byrow=TRUE )
aM <- matrix( a, nrow=N, ncol=ITT, byrow=TRUE )
prob <- matrix( stats::pnorm( aM*theta + ut + bM ), N, ITT)
Y <- (matrix( stats::runif(N*ITT), N, ITT) < prob )*1
colMeans(Y)
# define testlets
testlets <- rep(1:TT, each=I )
burnin <- 500
iter <- 1500
#***
# Model 1: 2PNO model
mod1 <- sirt::mcmc.3pno.testlet( dat=Y, est.slope=TRUE, est.guess=FALSE,
burnin=burnin, iter=iter, testlets=testlets )
summary(mod1)
summ1 <- mod1$summary.mcmcobj
# compare item parameters
cbind( b, summ1[ grep("b\[", summ1$parameter ), "Mean" ] )
# item discriminations
cbind( a, summ1[ grep("a\[", summ1$parameter ), "Mean" ] )
# Testlet standard deviations
cbind( sdt, summ1[ grep("sigma\.testlet", summ1$parameter ), "Mean" ] )
## End(Not run)
Computation of Descriptive Statistics for an mcmc.list Object
Description
Computation of descriptive statistics, Rhat convergence statistic
and MAP for a mcmc.list
object. The Rhat statistic
is computed by splitting one Monte Carlo chain into three segments of equal
length. The MAP is the mode estimate of the posterior distribution which is
approximated by the mode of the kernel density estimate.
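As an illustration of both quantities, a minimal sketch (not the sirt internals) that approximates the MAP by the mode of a kernel density estimate and computes the split-chain Rhat with the coda package:
# hedged sketch: MAP and split-chain Rhat for one hypothetical chain
x <- stats::rnorm(3000)
# MAP: mode of the kernel density estimate
d <- stats::density(x)
map <- d$x[ which.max(d$y) ]
# Rhat: cut the single chain into three segments of equal length
segs <- split( x, rep(1:3, each=length(x)/3) )
mlist <- coda::as.mcmc.list( lapply( segs, coda::mcmc ) )
coda::gelman.diag(mlist)$psrf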
Usage
mcmc.list.descriptives( mcmcobj, quantiles=c(.025,.05,.1,.5,.9,.95,.975) )
Arguments
mcmcobj |
Object of class mcmc.list |
quantiles |
Quantiles to be calculated for all parameters |
Value
A data frame with descriptive statistics for all parameters in
the mcmc.list
object.
See Also
See mcmclist2coda for writing an object of class mcmc.list into a coda file (see also the coda package).
Examples
## Not run:
miceadds::library_install("coda")
miceadds::library_install("R2WinBUGS")
#############################################################################
# EXAMPLE 1: Logistic regression
#############################################################################
#***************************************
# (1) simulate data
set.seed(8765)
N <- 500
x1 <- stats::rnorm(N)
x2 <- stats::rnorm(N)
y <- 1*( stats::plogis( -.6 + .7*x1 + 1.1 *x2 ) > stats::runif(N) )
#***************************************
# (2) estimate logistic regression with glm
mod <- stats::glm( y ~ x1 + x2, family="binomial" )
summary(mod)
#***************************************
# (3) estimate model with rcppbugs package
b <- rcppbugs::mcmc.normal( stats::rnorm(3),mu=0,tau=0.0001)
y.hat <- rcppbugs::deterministic(function(x1,x2,b) {
stats::plogis( b[1] + b[2]*x1 + b[3]*x2 ) }, x1, x2, b)
y.lik <- rcppbugs::mcmc.bernoulli( y, p=y.hat, observed=TRUE)
m <- rcppbugs::create.model(b, y.hat, y.lik)
#*** estimate model in rcppbugs; 5000 iterations, 1000 burnin iterations
ans <- rcppbugs::run.model(m, iterations=5000, burn=1000, adapt=1000, thin=5)
print(rcppbugs::get.ar(ans)) # get acceptance rate
print(apply(ans[["b"]],2,mean)) # get means of posterior
#*** convert rcppbugs into mcmclist object
mcmcobj <- data.frame( ans$b )
colnames(mcmcobj) <- paste0("b",1:3)
mcmcobj <- as.matrix(mcmcobj)
class(mcmcobj) <- "mcmc"
attr(mcmcobj, "mcpar") <- c( 1, nrow(mcmcobj), 1 )
mcmcobj <- coda::as.mcmc.list( mcmcobj )
# plot results
plot(mcmcobj)
# summary
summ1 <- sirt::mcmc.list.descriptives( mcmcobj )
summ1
## End(Not run)
Write Coda File from an Object of Class mcmc.list
Description
This function writes a coda file from an object of class mcmc.list. Note that only the first entry (i.e., one chain) will be processed.
Usage
mcmclist2coda(mcmclist, name, coda.digits=5)
Arguments
mcmclist |
An object of class mcmc.list |
name |
Name of the coda file to be written |
coda.digits |
Number of digits after decimal in the coda file |
Value
The coda file and a corresponding index file are written into the working directory.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: MCMC estimation 2PNO dataset Reading
#############################################################################
data(data.read)
# estimate 2PNO with MCMC with 3000 iterations and 500 burn-in iterations
mod <- sirt::mcmc.2pno( dat=data.read, iter=3000, burnin=500 )
# plot MCMC chains
plot( mod$mcmcobj, ask=TRUE )
# write sampled chains into codafile
mcmclist2coda( mod$mcmcobj, name="dataread_2pl" )
## End(Not run)
Response Pattern in a Binary Matrix
Description
Computes different statistics of the response pattern in a binary matrix.
Usage
md.pattern.sirt(dat)
Arguments
dat |
A binary data matrix |
Value
A list with following entries:
dat |
Original dataset |
dat.resp1 |
Indices for responses of 1's |
dat.resp0 |
Indices for responses of 0's |
resp_patt |
Vector of response patterns |
unique_resp_patt |
Unique response patterns |
unique_resp_patt_freq |
Frequencies of unique response patterns |
unique_resp_patt_firstobs |
First observation in the original dataset for each unique response pattern |
freq1 |
Frequencies of 1's |
freq0 |
Frequencies of 0's |
dat.ordered |
Dataset according to response patterns |
See Also
See also the md.pattern function in the mice package.
Examples
#############################################################################
# EXAMPLE 1: Response patterns
#############################################################################
set.seed(7654)
N <- 21 # number of rows
I <- 4 # number of columns
dat <- matrix( 1*( stats::runif(N*I) > .3 ), N, I )
res <- sirt::md.pattern.sirt(dat)
# plot of response patterns
res$dat.ordered
image( z=t(res$dat.ordered), y=1:N, x=1:I, xlab="Items", ylab="Persons")
# 0's are yellow and 1's are red
#############################################################################
# EXAMPLE 2: Item response patterns for dataset data.read
#############################################################################
data(data.read)
dat <- data.read ; N <- nrow(dat) ; I <- ncol(dat)
# order items according to p values
dat <- dat[, order(colMeans(dat, na.rm=TRUE )) ]
# analyzing response pattern
res <- sirt::md.pattern.sirt(dat)
res$dat.ordered
image( z=t(res$dat.ordered), y=1:N, x=1:I, xlab="Items", ylab="Persons")
Estimation of Multiple-Group Structural Equation Models
Description
Estimates a multiple-group structural equation model. The function allows arbitrary prior distributions on model parameters and regularized estimation with the SCAD and the LASSO penalty. Moreover, it can also conduct robust moment estimation using the L_p loss function \rho(x)=|x|^p for p \ge 0. See Robitzsch (2023) for more details.
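To give intuition for the robust L_p fitting function, a minimal sketch of a smoothed version of \rho(x)=|x|^p, in which eps_approx plays the role of the smoothing constant; the form (x^2 + \varepsilon)^{p/2} is an assumed illustration of the general idea, not necessarily the exact internal approximation:
# hedged sketch: smoothed L_p loss (eps corresponds to eps_approx)
rho_eps <- function(x, p=0.5, eps=1e-3){ ( x^2 + eps )^( p/2 ) }
curve( rho_eps(x), from=-2, to=2, ylab="rho(x)",
    main="Smoothed L_p loss with p=0.5" )
Small values of p downweight large residuals of the sufficient statistics, which is what makes the moment estimation robust to a few strongly deviating (e.g., noninvariant) parameters.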
Usage
mgsem(suffstat, model, data=NULL, group=NULL, weights=NULL, estimator="ML",
p_me=2, p_pen=1, pen_type="scad", diffpar_pen=NULL, pen_sample_size=TRUE,
a_scad=3.7, eps_approx=0.001, comp_se=TRUE, se_delta_formula=FALSE,
prior_list=NULL, hessian=TRUE, fixed_parms=FALSE, cd=FALSE,
cd_control=list(maxiter=20, tol=5*1e-04, interval_length=0.05, method="exact"),
partable_start=NULL, num_approx=FALSE, technical=NULL, control=list())
Arguments
suffstat |
List containing sufficient statistics |
model |
Model specification; can have entries such as est, index, lower and pen_lp (see Examples) |
data |
Optional data frame |
group |
Optional vector of group identifiers |
weights |
Optional vector of sampling weights |
estimator |
Character. Can be either "ML" (maximum likelihood estimation) or "ME" (robust moment estimation). |
p_me |
Power in the L_p loss function for robust moment estimation |
p_pen |
Power for the penalty in regularized estimation. For the regular LASSO and SCAD penalty functions, p=1. |
pen_type |
Penalty type. Can be either "scad" or "lasso". |
diffpar_pen |
List containing values of regularization parameters in fused lasso estimation |
pen_sample_size |
List containing values for sample sizes for regularized estimation |
a_scad |
Parameter a used in the SCAD penalty |
eps_approx |
Approximation value for nondifferentiable robust moment fitting function or penalty function |
comp_se |
Logical indicating whether standard errors should be computed |
se_delta_formula |
Logical indicating whether standard errors should be computed according to the delta formula |
prior_list |
List containing specifications of the prior distributions |
hessian |
Logical indicating whether the Hessian matrix should be computed |
fixed_parms |
Logical indicating whether all model parameters should be fixed |
cd |
Logical indicating whether coordinate descent should be used for estimation |
cd_control |
Control parameters for coordinate descent estimation |
partable_start |
Starting values for parameter estimation |
num_approx |
Logical indicating whether derivatives should be computed based on numerical differentiation |
technical |
Parameters used for optimization (e.g., maxiter and optimizer; see Examples) |
control |
Control parameters for optimization |
Details
[MORE INFORMATION TO BE ADDED]
Value
A list with following entries
coef |
Coefficients |
vcov |
Variance matrix |
se |
Vector of standard errors |
partable |
Parameter table |
model |
Specified model |
opt_res |
Result from optimization |
opt_value |
Value of fitting function |
residuals |
Residuals of sufficient statistics |
ic |
Information criteria |
technical |
Specifications of optimizer |
suffstat_vcov |
Variance matrix of sufficient statistics |
me_delta_method |
Input and output matrices for delta method if
|
data_proc |
Processed data |
case_ll |
Casewise log-likelihood function |
... |
Further values |
References
Robitzsch, A. (2023). Model-robust estimation of multiple-group structural equation models. Algorithms, 16(4), 210. doi:10.3390/a16040210
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Noninvariant item intercepts in a multiple-group SEM
#############################################################################
#---- simulate data
set.seed(65)
G <- 3 # number of groups
I <- 5 # number of items
# define lambda and nu parameters
lambda <- matrix(1, nrow=G, ncol=I)
nu <- matrix(0, nrow=G, ncol=I)
err_var <- matrix(1, nrow=G, ncol=I)
# define extent of noninvariance
dif_int <- .5
#- 1st group: N(0,1)
nu[1,4] <- dif_int
#- 2nd group: N(0.3,1.5)
gg <- 2 ;
nu[gg,1] <- -dif_int
#- 3rd group: N(.8,1.2)
gg <- 3
nu[gg,2] <- -dif_int
#- define distributions of groups
mu <- c(0,.3,.8)
sigma <- sqrt(c(1,1.5,1.2))
N <- rep(1000,3) # sample sizes per group
exact <- FALSE
suffstat <- sirt::invariance_alignment_simulate(nu, lambda, err_var, mu, sigma, N,
output="suffstat", groupwise=TRUE, exact=exact)
#---- model specification
# model specifications joint group
est <- list(
ALPHA=matrix( c(0), ncol=1),
NU=matrix( 0, nrow=I, ncol=1),
LAM=matrix(1, nrow=I, ncol=1),
PHI=matrix(0,nrow=1,ncol=1),
PSI=diag(rep(1,I))
)
# parameter index
index <- list(
ALPHA=0*est$ALPHA,
NU=1+0*est$NU,
LAM=1+0*est$LAM,
PHI=0*est$PHI,
PSI=diag(1,I)
)
# lower bounds
lower <- list(
PSI=diag(rep(0.01,I)), PHI=matrix(0.01,1,1)
)
#*** joint parameters
group0 <- list(est=est, index=index, lower=lower)
#*** group1
est <- list(
ALPHA=matrix( c(0), ncol=1),
NU=matrix( 0, nrow=I, ncol=1),
LAM=matrix(0, nrow=I, ncol=1),
PHI=matrix(1,nrow=1,ncol=1)
)
# parameter index
index <- list(
ALPHA=0*est$ALPHA,
NU=0*est$NU,
LAM=1*est$LAM,
PHI=0*est$PHI
)
group1 <- list(est=est, index=index, lower=lower)
#*** group 2 and group 3
# modify parameter index
index$ALPHA <- 1+0*est$ALPHA
index$PHI <- 1+0*est$PHI
group3 <- group2 <- list(est=est, index=index, lower=lower)
#*** define model
model <- list(group0=group0, group1=group1, group2=group2, group3=group3)
#-- estimate model with ML
res1 <- sirt::mgsem( suffstat=suffstat, model=model2, eps_approx=1e-4, estimator="ML",
technical=list(maxiter=500, optimizer="optim"),
hessian=FALSE, comp_se=FALSE, control=list(trace=1) )
str(res1)
#-- robust moment estimation with p=0.5
optimizer <- "optim"
technical <- list(maxiter=500, optimizer=optimizer)
eps_approx <- 1e-3
res2 <- sirt::mgsem( suffstat=suffstat, model=res1$model, p_me=0.5,
eps_approx=eps_approx, estimator="ME", technical=technical,
hessian=FALSE, comp_se=FALSE, control=list(trace=1) )
#---- regularized estimation
nu_lam <- 0.1 # regularization parameter
# redefine model
define_model <- function(model, nu_lam)
{
pen_lp <- list( NU=nu_lam+0*model$group1$est$NU)
ee <- "group1"
for (ee in c("group1","group2","group3"))
{
model[[ee]]$index$NU <- 1+0*index$NU
model[[ee]]$pen_lp <- pen_lp
}
return(model)
}
model3 <- define_model(model=model, nu_lam=nu_lam)
pen_type <- "scad"
res3 <- sirt::mgsem( suffstat=suffstat, model=model3, p_pen=1, pen_type=pen_type,
eps_approx=eps_approx, estimator="ML",
technical=list(maxiter=500, optimizer="optim"),
hessian=FALSE, comp_se=FALSE, control=list(trace=1) )
str(res3)
## End(Not run)
Specify or Modify a Parameter Table in mirt
Description
Specify or modify a parameter table in mirt.
Usage
mirt.specify.partable(mirt.partable, parlist, verbose=TRUE)
Arguments
mirt.partable |
Parameter table in mirt package |
parlist |
List of parameters which are used for specification in the parameter table. See Examples. |
verbose |
An optional logical indicating whether some warnings should be printed. |
Value
A modified parameter table
Author(s)
Alexander Robitzsch, Phil Chalmers
Examples
#############################################################################
# EXAMPLE 1: Modifying a parameter table for single group
#############################################################################
library(mirt)
data(LSAT7,package="mirt")
data <- mirt::expand.table(LSAT7)
mirt.partable <- mirt::mirt(data, 1, pars="values")
colnames(mirt.partable)
## > colnames(mirt.partable) [1] 'group' 'item' 'class' 'name' 'parnum' 'value'
## 'lbound' 'ubound' 'est' 'prior.type' 'prior_1' 'prior_2'
# specify some values of item parameters
value <- data.frame(d=c(0.7, -1, NA), a1=c(1, 1.2, 1.3), g=c(NA, 0.25, 0.25))
rownames(value) <- c("Item.1", "Item.4", "Item.3")
# fix some item parameters
est1 <- data.frame(d=c(TRUE, NA), a1=c(FALSE, TRUE))
rownames(est1) <- c("Item.4", "Item.3")
# estimate all guessing parameters
est2 <- data.frame(g=rep(TRUE, 5))
rownames(est2) <- colnames(data)
# prior distributions
prior.type <- data.frame(g=rep("norm", 4))
rownames(prior.type) <- c("Item.1", "Item.2", "Item.4", "Item.5")
prior_1 <- data.frame(g=rep(-1.38, 4))
rownames(prior_1) <- c("Item.1", "Item.2", "Item.4", "Item.5")
prior_2 <- data.frame(g=rep(0.5, 4))
rownames(prior_2) <- c("Item.1", "Item.2", "Item.4", "Item.5")
# misspecify some entries
rownames(prior_2)[c(3,2)] <- c("A", "B")
rownames(est1)[2] <- c("B")
# define complete list with parameter specification
parlist <- list(value=value, est=est1, est=est2, prior.type=prior.type,
prior_1=prior_1, prior_2=prior_2)
# modify parameter table
mirt.specify.partable(mirt.partable, parlist)
Some Functions for Wrapping with the mirt Package
Description
Some functions for wrapping with the mirt package.
Usage
# extract coefficients
mirt.wrapper.coef(mirt.obj)
# summary output
mirt_summary(object, digits=4, file=NULL, ...)
# extract posterior, likelihood, ...
mirt.wrapper.posterior(mirt.obj, weights=NULL, group=NULL)
## S3 method for class 'SingleGroupClass'
IRT.likelihood(object, ...)
## S3 method for class 'MultipleGroupClass'
IRT.likelihood(object, ...)
## S3 method for class 'SingleGroupClass'
IRT.posterior(object, ...)
## S3 method for class 'MultipleGroupClass'
IRT.posterior(object, ...)
## S3 method for class 'SingleGroupClass'
IRT.expectedCounts(object, ...)
## S3 method for class 'MultipleGroupClass'
IRT.expectedCounts(object, ...)
# S3 method for extracting item response functions
## S3 method for class 'SingleGroupClass'
IRT.irfprob(object, ...)
## S3 method for class 'MultipleGroupClass'
IRT.irfprob(object, group=1, ...)
# compute factor scores
mirt.wrapper.fscores(mirt.obj, weights=NULL)
# convenience function for itemplot
mirt.wrapper.itemplot( mirt.obj, ask=TRUE, ...)
Arguments
mirt.obj |
A fitted model in mirt package |
object |
A fitted object in the mirt package of class SingleGroupClass or MultipleGroupClass |
group |
Group index for the IRT.irfprob method for class MultipleGroupClass |
digits |
Number of digits after decimal used for rounding |
file |
File name for sinking summary output |
weights |
Optional vector of student weights |
ask |
Optional logical indicating whether each new plot should be confirmed. |
... |
Further arguments to be passed. |
Details
The function mirt.wrapper.coef collects all item parameters in a data frame.
The function mirt.wrapper.posterior extracts the individual likelihood, the individual posterior and expected counts. This function does not yet cover the case of multiple groups.
The function mirt.wrapper.fscores computes EAP, MAP and MLE factor scores. The factor scores are computed on the discrete grid of latent traits specified in mirt.obj@Theta (contrary to the computation in mirt). This function also does not yet work for multiple groups.
The function mirt.wrapper.itemplot displays all item plots one after another.
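To make the grid-based computation concrete, a minimal sketch (using only the documented entries of mirt.wrapper.posterior; not the sirt internals) of EAP scores computed from the discrete posterior:
# hedged sketch: EAP factor scores from the discrete posterior grid;
# mirt.obj denotes some fitted mirt model (e.g., mod1 from Example 1)
post <- sirt::mirt.wrapper.posterior(mirt.obj)
# f.qk.yi: individual posterior (N x TP); theta.k: theta grid (TP x D)
EAP <- post$f.qk.yi %*% as.matrix(post$theta.k)
head(EAP)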
Value
Function mirt.wrapper.coef
– List with entries
coef |
Data frame with item parameters |
GroupPars |
Data frame or list with distribution parameters |
Function mirt.wrapper.posterior
– List with entries
theta.k |
Grid of theta points |
pi.k |
Trait distribution on the grid theta.k |
f.yi.qk |
Individual likelihood |
f.qk.yi |
Individual posterior |
n.ik |
Expected counts |
data |
Used dataset |
Function mirt.wrapper.fscores
– List with entries
person |
Data frame with person parameter estimates (factor scores) EAP, MAP and MLE for all dimensions. |
EAP.rel |
EAP reliabilities |
Examples for the mirt Package
Latent class analysis (data.read, Model 7)
Mixed Rasch model (data.read, Model 8)
Located unidimensional and multidimensional latent class models / multidimensional latent class IRT models (data.read, Model 12; rasch.mirtlc, Example 4)
Multidimensional IRT model with discrete latent traits (data.read, Model 13)
Unidimensional IRT model with non-normal distribution (data.read, Model 15)
Grade of membership model (gom.em, Example 2)
Rasch copula model (rasch.copula2, Example 5)
Additive GDINA model (data.dcm, CDM, Model 6m)
Longitudinal Rasch model (data.long, Model 3)
Normally distributed residuals (data.big5, Example 1, Model 5)
Nedelsky model (nedelsky.irf, Examples 1, 2)
Beta item response model (brm.irf, Example 1)
See Also
See the mirt package manual for more information.
The main estimation functions in mirt are mirt::mirt, mirt::multipleGroup and mirt::bfactor.
See mirt::coef-method for extracting coefficients.
See mirt::mod2values for collecting parameter values in a mirt parameter table.
See lavaan2mirt for converting lavaan syntax to mirt syntax.
See tam2mirt for converting fitted tam models into mirt objects.
See also CDM::IRT.likelihood, CDM::IRT.posterior and CDM::IRT.irfprob for general extractor functions.
Examples
## Not run:
# A development version can be installed from GitHub
if (FALSE){ # default is set to FALSE, use the installed version
library(devtools)
devtools::install_github("philchalmers/mirt")
}
# now, load mirt
library(mirt)
#############################################################################
# EXAMPLE 1: Extracting item parameters and posterior LSAT data
#############################################################################
data(LSAT7, package="mirt")
data <- mirt::expand.table(LSAT7)
#*** Model 1: 3PL model for item 5 only, other items 2PL
mod1 <- mirt::mirt(data, 1, itemtype=c("2PL","2PL","2PL","2PL","3PL"), verbose=TRUE)
print(mod1)
summary(mod1)
# extracting coefficients
coef(mod1)
mirt.wrapper.coef(mod1)$coef
# summary output
mirt_summary(mod1)
# extract parameter values in mirt
mirt::mod2values(mod1)
# extract posterior
post1 <- sirt::mirt.wrapper.posterior(mod1)
# extract item response functions
probs1 <- IRT.irfprob(mod1)
str(probs1)
# extract individual likelihood
likemod1 <- IRT.likelihood(mod1)
str(likemod1)
# extract individual posterior
postmod1 <- IRT.posterior(mod1)
str(postmod1)
#*** Model 2: Confirmatory model with two factors
cmodel <- mirt::mirt.model("
F1=1,4,5
F2=2,3
")
mod2 <- mirt::mirt(data, cmodel, verbose=TRUE)
print(mod2)
summary(mod2)
# extract coefficients
coef(mod2)
mirt.wrapper.coef(mod2)$coef
# extract posterior
post2 <- sirt::mirt.wrapper.posterior(mod2)
#############################################################################
# EXAMPLE 2: Extracting item parameters and posterior for differing
# numbers of response categories | Dataset Science
#############################################################################
data(Science,package="mirt")
library(psych)
psych::describe(Science)
# modify dataset
dat <- Science
dat[ dat[,1] > 3,1] <- 3
psych::describe(dat)
# estimate generalized partial credit model
mod1 <- mirt::mirt(dat, 1, itemtype="gpcm")
print(mod1)
# extract coefficients
coef(mod1)
mirt.wrapper.coef(mod1)$coef
# extract posterior
post1 <- sirt::mirt.wrapper.posterior(mod1)
#############################################################################
# EXAMPLE 3: Multiple group model; simulated dataset from mirt package
#############################################################################
#*** simulate data (copy from the multipleGroup manual site in mirt package)
set.seed(1234)
a <- matrix(c(abs( stats::rnorm(5,1,.3)), rep(0,15),abs( stats::rnorm(5,1,.3)),
rep(0,15),abs( stats::rnorm(5,1,.3))), 15, 3)
d <- matrix( stats::rnorm(15,0,.7),ncol=1)
mu <- c(-.4, -.7, .1)
sigma <- matrix(c(1.21,.297,1.232,.297,.81,.252,1.232,.252,1.96),3,3)
itemtype <- rep("dich", nrow(a))
N <- 1000
dataset1 <- mirt::simdata(a, d, N, itemtype)
dataset2 <- mirt::simdata(a, d, N, itemtype, mu=mu, sigma=sigma)
dat <- rbind(dataset1, dataset2)
group <- c(rep("D1", N), rep("D2", N))
#group models
model <- mirt::mirt.model("
F1=1-5
F2=6-10
F3=11-15
")
# separate analysis
mod_configural <- mirt::multipleGroup(dat, model, group=group, verbose=TRUE)
mirt.wrapper.coef(mod_configural)
# equal slopes (metric invariance)
mod_metric <- mirt::multipleGroup(dat, model, group=group, invariance=c("slopes"),
verbose=TRUE)
mirt.wrapper.coef(mod_metric)
# equal slopes and intercepts (scalar invariance)
mod_scalar <- mirt::multipleGroup(dat, model, group=group,
invariance=c("slopes","intercepts","free_means","free_varcov"), verbose=TRUE)
mirt.wrapper.coef(mod_scalar)
# full constraint
mod_fullconstrain <- mirt::multipleGroup(dat, model, group=group,
invariance=c("slopes", "intercepts", "free_means", "free_var"), verbose=TRUE )
mirt.wrapper.coef(mod_fullconstrain)
#############################################################################
# EXAMPLE 4: Nonlinear item response model
#############################################################################
data(data.read)
dat <- data.read
# specify mirt model with some interactions
mirtmodel <- mirt.model("
A=1-4
B=5-8
C=9-12
(A*B)=4,8
(C*C)=9
(A*B*C)=12
" )
# estimate model
res <- mirt::mirt( dat, mirtmodel, verbose=TRUE, technical=list(NCYCLES=3) )
# look at estimated parameters
mirt.wrapper.coef(res)
coef(res)
mirt::mod2values(res)
# model specification
res@model
#############################################################################
# EXAMPLE 5: Extracting factor scores
#############################################################################
data(data.read)
dat <- data.read
# define lavaan model and convert syntax to mirt
lavmodel <- "
A=~ a*A1+a*A2+1.3*A3+A4 # set loading of A3 to 1.3
B=~ B1+1*B2+b3*B3+B4
C=~ c*C1+C2+c*C3+C4
A1 | da*t1
A3 | da*t1
C4 | dg*t1
B1 | 0*t1
B3 | -1.4*t1 # fix item threshold of B3 to -1.4
A ~~ B # estimate covariance between A and B
A ~~ .6 * C # fix covariance to .6
B ~~ B # estimate variance of B
A ~ .5*1 # set mean of A to .5
B ~ 1 # estimate mean of B
"
res <- sirt::lavaan2mirt( dat, lavmodel, verbose=TRUE, technical=list(NCYCLES=3) )
# estimated coefficients
mirt.wrapper.coef(res$mirt)
# extract factor scores
fres <- sirt::mirt.wrapper.fscores(res$mirt)
# look at factor scores
head( round(fres$person,2))
## case M EAP.Var1 SE.EAP.Var1 EAP.Var2 SE.EAP.Var2 EAP.Var3 SE.EAP.Var3 MLE.Var1
## 1 1 0.92 1.26 0.67 1.61 0.60 0.05 0.69 2.65
## 2 2 0.58 0.06 0.59 1.14 0.55 -0.80 0.56 0.00
## 3 3 0.83 0.86 0.66 1.15 0.55 0.48 0.74 0.53
## 4 4 1.00 1.52 0.67 1.57 0.60 0.73 0.76 2.65
## 5 5 0.50 -0.13 0.58 0.85 0.48 -0.82 0.55 -0.53
## 6 6 0.75 0.41 0.63 1.09 0.54 0.27 0.71 0.00
## MLE.Var2 MLE.Var3 MAP.Var1 MAP.Var2 MAP.Var3
## 1 2.65 -0.53 1.06 1.59 0.00
## 2 1.06 -1.06 0.00 1.06 -1.06
## 3 1.06 2.65 1.06 1.06 0.53
## 4 2.65 2.65 1.59 1.59 0.53
## 5 0.53 -1.06 -0.53 0.53 -1.06
## 6 1.06 2.65 0.53 1.06 0.00
# EAP reliabilities
round(fres$EAP.rel,3)
## Var1 Var2 Var3
## 0.574 0.452 0.541
## End(Not run)
Maximum Likelihood Estimation of Person or Group Parameters in the Generalized Partial Credit Model
Description
This function estimates person or group parameters in the generalized partial credit model (see Details).
Usage
mle.pcm.group(dat, b, a=rep(1, ncol(dat)), group=NULL,
pid=NULL, adj_eps=0.3, conv=1e-04, maxiter=30)
Arguments
dat |
A numeric matrix of polytomous item responses |
b |
Matrix with item thresholds |
a |
Vector of item slopes |
group |
Vector of group identifiers |
pid |
Vector of person identifiers |
adj_eps |
Numeric value \varepsilon which is used in the \varepsilon-adjustment (see Details) |
conv |
Convergence criterion |
maxiter |
Maximum number of iterations |
Details
It is assumed that the generalized partial credit model holds. In case one estimates a person parameter \theta_p, the log-likelihood is maximized and the following estimating equation results (see Penfield & Bergeron, 2005):

0=( \log L )'=\sum_i a_i \cdot [ \tilde{x}_{pi} - E(X_{pi} | \theta_p ) ]

where E(X_{pi} | \theta_p ) denotes the expected item response conditional on \theta_p.
With the method of \varepsilon-adjustment (Bertoli-Barsotti & Punzo, 2012; Bertoli-Barsotti, Lando & Punzo, 2014), the observed item responses x_{pi} are transformed such that no perfect scores arise and bias is reduced. If S_p is the sum score of person p and M_p the maximum score of this person, then the transformed sum scores \tilde{S}_p are

\tilde{S}_p=\varepsilon + \frac{M_p - 2 \varepsilon}{M_p} S_p

However, the adjustment is conducted directly on the item responses in order to also handle the generalized partial credit model with item slope parameters different from 1.
In case one estimates a group parameter \theta_g, the following estimating equation is used:

0=(\log L )'=\sum_p \sum_i a_i \cdot [ \tilde{x}_{pgi} - E(X_{pgi} | \theta_g ) ]

where persons p are nested within groups g. The \varepsilon-adjustment is then performed at the group level, not at the individual level.
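As a numerical illustration of the displayed transformation (a sketch of the formula only; the function itself applies the adjustment at the item level):
# hedged sketch: epsilon-adjusted sum scores; perfect scores are pulled
# away from the boundaries 0 and M_p
adj_sumscore <- function(S, M, eps=0.3){ eps + ( M - 2*eps ) / M * S }
adj_sumscore( S=c(0, 6, 12), M=12 )
## [1]  0.3  6.0 11.7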
Value
A list with following entries:
person |
Data frame with person or group parameters |
data_adjeps |
Modified dataset according to the \varepsilon-adjustment |
References
Bertoli-Barsotti, L., & Punzo, A. (2012). Comparison of two bias reduction techniques for the Rasch model. Electronic Journal of Applied Statistical Analysis, 5, 360-366.
Bertoli-Barsotti, L., Lando, T., & Punzo, A. (2014). Estimating a Rasch Model via fuzzy empirical probability functions. In D. Vicari, A. Okada, G. Ragozini & C. Weihs (Eds.). Analysis and Modeling of Complex Data in Behavioral and Social Sciences, Springer.
Penfield, R. D., & Bergeron, J. M. (2005). Applying a weighted maximum likelihood latent trait estimator to the generalized partial credit model. Applied Psychological Measurement, 29, 218-233.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Estimation of a group parameter for only one item per group
#############################################################################
data(data.si01)
dat <- data.si01
# item parameter estimation (partial credit model) in TAM
library(TAM)
mod <- TAM::tam.mml( dat[,2:3], irtmodel="PCM")
# extract item difficulties
b <- matrix( mod$xsi$xsi, nrow=2, byrow=TRUE )
# groupwise estimation
res1 <- sirt::mle.pcm.group( dat[,2:3], b=b, group=dat$idgroup )
# individual estimation
res2 <- sirt::mle.pcm.group( dat[,2:3], b=b )
#############################################################################
# EXAMPLE 2: Data Reading data.read
#############################################################################
data(data.read)
# estimate Rasch model
mod <- sirt::rasch.mml2( data.read )
score <- rowSums( data.read )
data.read <- data.read[ order(score), ]
score <- score[ order(score) ]
# compare different epsilon-adjustments
res30 <- sirt::mle.pcm.group( data.read, b=matrix( mod$item$b, 12, 1 ),
adj_eps=.3 )$person
res10 <- sirt::mle.pcm.group( data.read, b=matrix( mod$item$b, 12, 1 ),
adj_eps=.1 )$person
res05 <- sirt::mle.pcm.group( data.read, b=matrix( mod$item$b, 12, 1 ),
adj_eps=.05 )$person
# plot different scorings
plot( score, res05$theta, type="l", xlab="Raw score", ylab=expression(theta[epsilon]),
main="Scoring with different epsilon-adjustments")
lines( score, res10$theta, col=2, lty=2 )
lines( score, res30$theta, col=4, lty=3 )
## End(Not run)
Assessing Model Fit and Local Dependence by Comparing Observed and Expected Item Pair Correlations
Description
This function computes several measures of absolute model fit and local dependence indices for dichotomous item responses which are based on comparing observed and expected frequencies of item pairs (Chen, de la Torre & Zhang, 2013; see modelfit.cor for more details).
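To sketch the underlying comparison, a minimal illustration (not the sirt internals) of an observed versus model-implied joint proportion for one item pair, where the expected value assumes local independence given \theta:
# hedged sketch: observed vs. expected P(X_i=1, X_j=1) for an item pair;
# all inputs below are hypothetical
theta <- seq(-4, 4, len=21)                          # theta grid
pi.k <- stats::dnorm(theta); pi.k <- pi.k / sum(pi.k)  # trait distribution
p_i <- stats::plogis( theta - 0.2 )                  # P(X_i=1 | theta)
p_j <- stats::plogis( theta + 0.5 )                  # P(X_j=1 | theta)
exp_ij <- sum( p_i * p_j * pi.k )                    # expected joint proportion
obs_ij <- 0.42                                       # hypothetical observed value
c( observed=obs_ij, expected=exp_ij, residual=obs_ij - exp_ij )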
Usage
modelfit.sirt(object)
modelfit.cor.poly( data, probs, theta.k, f.qk.yi)
## S3 method for class 'sirt'
IRT.modelfit(object, mod, ...)
Arguments
object |
An object generated by one of the model fitting functions listed in See Also |
data |
Dataset with polytomous item responses |
probs |
Item response probabilities at the grid theta.k |
theta.k |
Grid of theta vector |
f.qk.yi |
Individual posterior |
mod |
Model name |
... |
Further arguments to be passed |
Value
A list with following entries:
modelfit |
Model fit statistics:
|
itempairs |
Fit of every item pair |
Note
The function modelfit.cor.poly is just a wrapper to TAM::tam.modelfit in the TAM package.
References
Chen, W., & Thissen, D. (1997). Local dependence indexes for item pairs using item response theory. Journal of Educational and Behavioral Statistics, 22, 265-289.
DiBello, L. V., Roussos, L. A., & Stout, W. F. (2007) Review of cognitively diagnostic assessment and a summary of psychometric models. In C. R. Rao and S. Sinharay (Eds.), Handbook of Statistics, Vol. 26 (pp. 979–1030). Amsterdam: Elsevier.
Maydeu-Olivares, A. (2013). Goodness-of-fit assessment of item response theory models (with discussion). Measurement: Interdisciplinary Research and Perspectives, 11, 71-137.
Maydeu-Olivares, A., & Joe, H. (2014). Assessing approximate fit in categorical data analysis. Multivariate Behavioral Research, 49, 305-328.
McDonald, R. P., & Mok, M. M.-C. (1995). Goodness of fit in item response models. Multivariate Behavioral Research, 30, 23-40.
Yen, W. M. (1984). Effects of local item dependence on the fit and equating performance of the three-parameter logistic model. Applied Psychological Measurement, 8, 125-145.
See Also
Supported classes: rasch.mml2, rasch.mirtlc, rasch.pml3 (rasch.pml2), smirt, R2noharm, noharm.sirt, gom.em, TAM::tam.mml, TAM::tam.mml.2pl, TAM::tam.fa, mirt::mirt.
For more details on the fit statistics of this function see CDM::modelfit.cor.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Reading data
#############################################################################
data(data.read)
dat <- data.read
I <- ncol(dat)
#*** Model 1: Rasch model
mod1 <- sirt::rasch.mml2(dat)
fmod1 <- sirt::modelfit.sirt( mod1 )
summary(fmod1)
#*** Model 1b: Rasch model in TAM package
library(TAM)
mod1b <- TAM::tam.mml(dat)
fmod1b <- sirt::modelfit.sirt( mod1b )
summary(fmod1b)
#*** Model 2: Rasch model with smoothed distribution
mod2 <- sirt::rasch.mml2( dat, distribution.trait="smooth3" )
fmod2 <- sirt::modelfit.sirt( mod2 )
summary(fmod2 )
#*** Model 3: 2PL model
mod3 <- sirt::rasch.mml2( dat, distribution.trait="normal", est.a=1:I )
fmod3 <- sirt::modelfit.sirt( mod3 )
summary(fmod3 )
#*** Model 3b: 2PL model in TAM package
mod3b <- TAM::tam.mml.2pl( dat )
fmod3b <- sirt::modelfit.sirt(mod3b)
summary(fmod3b)
# model fit in TAM package
tmod3b <- TAM::tam.modelfit(mod3b)
summary(tmod3b)
# model fit in mirt package
library(mirt)
mmod3b <- sirt::tam2mirt(mod3b) # convert to mirt object
mirt::M2(mmod3b$mirt) # global fit statistic
mirt::residuals( mmod3b$mirt, type="LD") # local dependence statistics
#*** Model 4: 3PL model with equal guessing parameter
mod4 <- TAM::rasch.mml2( dat, distribution.trait="smooth3", est.a=1:I, est.c=rep(1,I) )
fmod4 <- sirt::modelfit.sirt( mod4 )
summary(fmod4 )
#*** Model 5: Latent class model with 2 classes
mod5 <- sirt::rasch.mirtlc( dat, Nclasses=2 )
fmod5 <- sirt::modelfit.sirt( mod5 )
summary(fmod5 )
#*** Model 6: Rasch latent class model with 3 classes
mod6 <- sirt::rasch.mirtlc( dat, Nclasses=3, modeltype="MLC1", mmliter=100)
fmod6 <- sirt::modelfit.sirt( mod6 )
summary(fmod6 )
#*** Model 7: PML estimation
mod7 <- sirt::rasch.pml3( dat )
fmod7 <- sirt::modelfit.sirt( mod7 )
summary(fmod7 )
#*** Model 8: PML estimation
# Modelling error correlations:
# joint residual correlations for each item cluster
error.corr <- diag(1,ncol(dat))
itemcluster <- rep( 1:4,each=3 )
for ( ii in 1:3){
ind.ii <- which( itemcluster==ii )
error.corr[ ind.ii, ind.ii ] <- ii
}
mod8 <- sirt::rasch.pml3( dat, error.corr=error.corr )
fmod8 <- sirt::modelfit.sirt( mod8 )
summary(fmod8 )
#*** Model 9: 1PL in smirt
Qmatrix <- matrix( 1, nrow=I, ncol=1 )
mod9 <- sirt::smirt( dat, Qmatrix=Qmatrix )
fmod9 <- sirt::modelfit.sirt( mod9 )
summary(fmod9 )
#*** Model 10: 3-dimensional Rasch model in NOHARM
noharm.path <- "c:/NOHARM"
Q <- matrix( 0, nrow=12, ncol=3 )
Q[ cbind(1:12, rep(1:3,each=4) ) ] <- 1
rownames(Q) <- colnames(dat)
colnames(Q) <- c("A","B","C")
# covariance matrix
P.pattern <- matrix( 1, ncol=3, nrow=3 )
P.init <- 0.8+0*P.pattern
diag(P.init) <- 1
# loading matrix
F.pattern <- 0*Q
F.init <- Q
# estimate model
mod10 <- sirt::R2noharm( dat=dat, model.type="CFA", F.pattern=F.pattern,
F.init=F.init, P.pattern=P.pattern, P.init=P.init,
writename="ex4e", noharm.path=noharm.path, dec="," )
fmod10 <- sirt::modelfit.sirt( mod10 )
summary(fmod10)
#*** Model 11: Rasch model in mirt package
library(mirt)
mod11 <- mirt::mirt(dat, 1, itemtype="Rasch",verbose=TRUE)
fmod11 <- sirt::modelfit.sirt( mod11 )
summary(fmod11)
# model fit in mirt package
mirt::M2(mod11)
mirt::residuals(mod11)
## End(Not run)
Monotone Regression for Rows or Columns in a Matrix
Description
Monotone (isotone) regression for rows (monoreg.rowwise) or columns (monoreg.colwise) of a matrix.
Usage
monoreg.rowwise(yM, wM)
monoreg.colwise(yM, wM)
Arguments
yM |
Matrix with dependent variable for the regression. Values are assumed to be sorted. |
wM |
Matrix with weights for every entry in the |
Value
Matrix with fitted values
Note
This function is used for fitting the ISOP model (see isop.dich). The monoreg function from the fdrtool package is simply extended to handle matrix input.
Author(s)
Alexander Robitzsch
See Also
See also the monoreg
function from the fdrtool
package.
Examples
y <- c(22.5, 23.33, 20.83, 24.25 )
w <- c( 3,3,3,2)
# define matrix input
yM <- matrix( 0, nrow=2, ncol=4 )
wM <- yM
yM[1,] <- yM[2,] <- y
wM[1,] <- w
wM[2,] <- c(1,3,4, 3 )
# fit rowwise monotone regression
monoreg.rowwise( yM, wM )
# compare results with monoreg function from fdrtool package
## Not run:
miceadds::library_install("fdrtool")
fdrtool::monoreg(x=yM[1,], w=wM[1,])$yf
fdrtool::monoreg(x=yM[2,], w=wM[2,])$yf
## End(Not run)
Functions for the Nedelsky Model
Description
Functions for simulating and estimating the Nedelsky model (Bechger et al., 2003, 2005). nedelsky.sim can be used for simulating the model; nedelsky.irf computes the item response function and can be used, for example, when estimating the Nedelsky model in the mirt package or with the xxirt function in the sirt package.
Usage
# simulating the Nedelsky model
nedelsky.sim(theta, b, a=NULL, tau=NULL)
# creating latent responses of the Nedelsky model
nedelsky.latresp(K)
# computing the item response function of the Nedelsky model
nedelsky.irf(Theta, K, b, a, tau, combis, thdim=1)
Arguments
theta |
Unidimensional ability (theta) |
b |
Matrix of category difficulties |
a |
Vector of item discriminations |
tau |
Category attractivity parameters |
K |
(Maximum) Number of distractors of the used multiple choice items |
Theta |
Theta vector. Note that the Nedelsky model can only be specified as a model with between-item dimensionality (defined in thdim). |
combis |
Latent response classes as produced by nedelsky.latresp |
thdim |
Theta dimension at which the item loads |
Details
Assume that for item i there exist K+1 categories 0,1,\ldots,K. Category 0 denotes the correct alternative. The Nedelsky model assumes that a respondent eliminates all distractors which are thought to be incorrect and guesses the solution from the remaining alternatives. This means that for item i, K latent variables S_{ik} are defined which indicate whether alternative k has been correctly identified as a distractor. By definition, the correct alternative is never judged as wrong by the respondent.
Formally, the Nedelsky model assumes a 2PL model for eliminating each of the distractors:

P(S_{ik}=1 | \theta )=invlogit[ a_i ( \theta - b_{ik} ) ]

where \theta is the person ability and b_{ik} are distractor difficulties.
The guessing process of the Nedelsky model is defined as

P(X_i=j | \theta, S_{i1}, \ldots, S_{iK} )=\frac{ ( 1- S_{ij} ) \tau_{ij} }{ \sum_{k=0}^K ( 1- S_{ik} ) \tau_{ik} }

where \tau_{ij} are attractivity parameters of alternative j. By definition, \tau_{i0} is set to 1 (and S_{i0}=0, since the correct alternative is never eliminated). By default, all attractivity parameters are set to 1.
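As a numerical check of these equations, a minimal sketch (not the sirt implementation) that computes the Nedelsky category probabilities for a single item by enumerating all 2^K latent elimination patterns:
# hedged sketch: Nedelsky response probabilities for one item
nedelsky_prob <- function(theta, b, a=1, tau=rep(1, length(b)+1)){
    K <- length(b)
    S <- as.matrix( expand.grid( rep( list(0:1), K ) ) )  # all patterns
    pS <- stats::plogis( a * ( theta - b ) )   # P(S_k=1 | theta)
    probs <- rep(0, K+1)                       # categories 0,...,K
    for (r in 1:nrow(S)){
        w <- prod( ifelse( S[r,]==1, pS, 1-pS ) )   # pattern probability
        elim <- c(0, S[r,])   # S_0=0: the correct alternative is never eliminated
        probs <- probs + w * ( (1-elim)*tau ) / sum( (1-elim)*tau )
    }
    probs
}
nedelsky_prob( theta=0.5, b=c(-1, -2) )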
References
Bechger, T. M., Maris, G., Verstralen, H. H. F. M., & Verhelst, N. D. (2003). The Nedelsky model for multiple-choice items. CITO Research Report, 2003-5.
Bechger, T. M., Maris, G., Verstralen, H. H. F. M., & Verhelst, N. D. (2005). The Nedelsky model for multiple-choice items. In L. van der Ark, M. Croon, & Sijtsma, K. (Eds.). New developments in categorical data analysis for the social and behavioral sciences, pp. 187-206. Mahwah, Lawrence Erlbaum.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Simulated data according to the Nedelsky model
#############################################################################
#*** simulate data
set.seed(123)
I <- 20 # number of items
b <- matrix(NA,I,ncol=3)
b[,1] <- -0.5 + stats::runif( I, -.75, .75 )
b[,2] <- -1.5 + stats::runif( I, -.75, .75 )
b[,3] <- -2.5 + stats::runif( I, -.75, .75 )
K <- 3 # number of distractors
N <- 2000 # number of persons
# apply simulation function
dat <- sirt::nedelsky.sim( theta=stats::rnorm(N,sd=1.2), b=b )
#*** latent response patterns
K <- 3
combis <- sirt::nedelsky.latresp(K=3)
#*** defining the Nedelsky item response function for estimation in mirt
par <- c( 3, rep(-1,K), 1, rep(1,K+1),1)
names(par) <- c("K", paste0("b",1:K), "a", paste0("tau", 0:K),"thdim")
est <- c( FALSE, rep(TRUE,K), rep(FALSE, K+1 + 2 ) )
names(est) <- names(par)
nedelsky.icc <- function( par, Theta, ncat ){
K <- par[1]
b <- par[ 1:K + 1]
a <- par[ K+2]
tau <- par[1:(K+1) + (K+2) ]
thdim <- par[ K+2+K+1 +1 ]
probs <- sirt::nedelsky.irf( Theta, K=K, b=b, a=a, tau=tau, combis,
thdim=thdim )$probs
return(probs)
}
name <- "nedelsky"
# create item response function
nedelsky.itemfct <- mirt::createItem(name, par=par, est=est, P=nedelsky.icc)
#*** define model in mirt
mirtmodel <- mirt::mirt.model("
F1=1-20
COV=F1*F1
# define some prior distributions
PRIOR=(1-20,b1,norm,-1,2),(1-20,b2,norm,-1,2),
(1-20,b3,norm,-1,2)
" )
itemtype <- rep("nedelsky", I )
customItems <- list("nedelsky"=nedelsky.itemfct)
# define parameters to be estimated
mod1.pars <- mirt::mirt(dat, mirtmodel, itemtype=itemtype,
customItems=customItems, pars="values")
# estimate model
mod1 <- mirt::mirt(dat,mirtmodel, itemtype=itemtype, customItems=customItems,
pars=mod1.pars, verbose=TRUE )
# model summaries
print(mod1)
summary(mod1)
mirt.wrapper.coef( mod1 )$coef
mirt.wrapper.itemplot(mod1,ask=TRUE)
#******************************************************
# fit Nedelsky model with xxirt function in sirt
# define item class for xxirt
item_nedelsky <- sirt::xxirt_createDiscItem( name="nedelsky", par=par,
est=est, P=nedelsky.icc,
prior=c( b1="dnorm", b2="dnorm", b3="dnorm" ),
prior_par1=c( b1=-1, b2=-1, b3=-1),
prior_par2=c(b1=2, b2=2, b3=2) )
customItems <- list( item_nedelsky )
#---- definition theta distribution
#** theta grid
Theta <- matrix( seq(-6,6,length=21), ncol=1 )
#** theta distribution
P_Theta1 <- function( par, Theta, G){
mu <- par[1]
sigma <- max( par[2], .01 )
TP <- nrow(Theta)
pi_Theta <- matrix( 0, nrow=TP, ncol=G)
pi1 <- dnorm( Theta[,1], mean=mu, sd=sigma )
pi1 <- pi1 / sum(pi1)
pi_Theta[,1] <- pi1
return(pi_Theta)
}
#** create distribution class
par_Theta <- c( "mu"=0, "sigma"=1 )
customTheta <- sirt::xxirt_createThetaDistribution( par=par_Theta, est=c(FALSE,TRUE),
P=P_Theta1 )
#-- create parameter table
itemtype <- rep( "nedelsky", I )
partable <- sirt::xxirt_createParTable( dat, itemtype=itemtype, customItems=customItems)
# estimate model
mod2 <- sirt::xxirt( dat=dat, Theta=Theta, partable=partable, customItems=customItems,
customTheta=customTheta)
summary(mod2)
# compare sirt::xxirt and mirt::mirt
logLik(mod2)
mod1@Fit$logLik
#############################################################################
# EXAMPLE 2: Multiple choice dataset data.si06
#############################################################################
data(data.si06)
dat <- data.si06
#*** create latent responses
K <- 3
combis <- sirt::nedelsky.latresp(K=K)
I <- ncol(dat)
#*** define item response function
par <- c( 3, rep(-1,K), 1, rep(1,K+1),1)
names(par) <- c("K", paste0("b",1:K), "a", paste0("tau", 0:K),"thdim")
est <- c( FALSE, rep(TRUE,K), rep(FALSE, K+1 + 2 ) )
names(est) <- names(par)
nedelsky.icc <- function( par, Theta, ncat ){
K <- par[1]
b <- par[ 1:K + 1]
a <- par[ K+2]
tau <- par[1:(K+1) + (K+2) ]
thdim <- par[ K+2+K+1 +1 ]
probs <- sirt::nedelsky.irf( Theta, K=K, b=b, a=a, tau=tau, combis,
thdim=thdim )$probs
return(probs)
}
name <- "nedelsky"
# create item response function
nedelsky.itemfct <- mirt::createItem(name, par=par, est=est, P=nedelsky.icc)
#*** define model in mirt
mirtmodel <- mirt::mirt.model("
F1=1-14
COV=F1*F1
PRIOR=(1-14,b1,norm,-1,2),(1-14,b2,norm,-1,2),
(1-14,b3,norm,-1,2)
" )
itemtype <- rep("nedelsky", I )
customItems <- list("nedelsky"=nedelsky.itemfct)
# define parameters to be estimated
mod1.pars <- mirt::mirt(dat, mirtmodel, itemtype=itemtype,
customItems=customItems, pars="values")
#*** estimate model
mod1 <- mirt::mirt(dat,mirtmodel, itemtype=itemtype, customItems=customItems,
pars=mod1.pars, verbose=TRUE )
#*** summaries
print(mod1)
summary(mod1)
mirt.wrapper.coef( mod1 )$coef
mirt.wrapper.itemplot(mod1,ask=TRUE)
## End(Not run)
NOHARM Model in R
Description
The function is an R implementation of the normal ogive harmonic analysis robust method (the NOHARM model; McDonald, 1997). Exploratory and confirmatory multidimensional item response models for dichotomous data using the probit link function can be estimated. Lower asymptotes (guessing parameters) and upper asymptotes (one minus slipping parameters) can be provided as fixed values.
Usage
noharm.sirt(dat, pm=NULL, N=NULL, weights=NULL, Fval=NULL, Fpatt=NULL, Pval=NULL,
Ppatt=NULL, Psival=NULL, Psipatt=NULL, dimensions=NULL, lower=0, upper=1, wgtm=NULL,
pos.loading=FALSE, pos.variance=FALSE, pos.residcorr=FALSE, maxiter=1000, conv=1e-6,
optimizer="nlminb", par_lower=NULL, reliability=FALSE, ...)
## S3 method for class 'noharm.sirt'
summary(object, file=NULL, ...)
Arguments
dat |
Matrix of dichotomous item responses. This matrix may contain missing data (indicated by NA). |
pm |
Optional product-moment matrix |
N |
Sample size if pm is provided |
weights |
Optional vector of student weights. |
Fval |
Initial or fixed values of the loading matrix \bold{F} |
Fpatt |
Pattern matrix of the loading matrix \bold{F} |
Pval |
Initial or fixed values for the covariance matrix \bold{P} |
Ppatt |
Pattern matrix for the covariance matrix \bold{P} |
Psival |
Initial or fixed values for the matrix of residual correlations \bold{\Psi} |
Psipatt |
Pattern matrix for the matrix of residual correlations \bold{\Psi} |
dimensions |
Number of dimensions if an exploratory factor analysis should be estimated. |
lower |
Fixed vector (or numeric) of lower asymptotes |
upper |
Fixed vector (or numeric) of upper asymptotes |
wgtm |
Matrix with positive entries which indicates by a positive entry which item pairs should be used for estimation. |
pos.loading |
An optional logical indicating whether all entries in the loading matrix \bold{F} should be positive |
pos.variance |
An optional logical indicating whether all variances (i.e. diagonal entries in \bold{P}) should be positive |
pos.residcorr |
An optional logical indicating whether all entries in the matrix of residual correlations \bold{\Psi} should be positive |
par_lower |
Optional vector of lower parameter bounds |
maxiter |
Maximum number of iterations |
conv |
Convergence criterion for parameters |
optimizer |
Optimization function to be used; defaults to "nlminb" |
reliability |
Logical indicating whether reliability should be computed. |
... |
Further arguments to be passed. |
object |
Object of class noharm.sirt |
file |
String indicating a file name for summary. |
Details
The NOHARM item response model follows the response equation
P( X_{pi}=1 | \bold{\theta}_p )=c_i + ( d_i - c_i ) \Phi( f_{i0} + f_{i1} \theta_{p1} + ... + f_{iD} \theta_{pD} )
for item responses X_{pi} of person p on item i, where \bold{F}=(f_{id}) is a loading matrix and \bold{P} the covariance matrix of \bold{\theta}_p. The lower asymptotes c_i and upper asymptotes d_i must be provided as fixed values.
The response equation can be equivalently written by introducing a latent continuous item response X_{pi}^\ast:
X_{pi}^\ast=f_{i0} + f_{i1} \theta_{p1} + ... + f_{iD} \theta_{pD} + e_{pi}
with a standard normally distributed residual e_{pi}. These residuals have a correlation matrix \bold{\Psi} with ones in the diagonal. In this R implementation of the NOHARM model, correlations between residuals are allowed.
The estimation relies on a Hermite series approximation of the normal ogive item response functions. In more detail, a series expansion
\Phi(x)=b_0 + b_1 H_1(x) + b_2 H_2(x) + b_3 H_3(x)
is used (McDonald, 1982a). This makes it possible to express the cross products p_{ij}=P(X_i=1, X_j=1) as a function of the unknown model parameters:
\hat{p}_{ij}=b_{0i} b_{0j} + \sum_{m=1}^3 b_{mi} b_{mj} \left( \frac{\bold{f}_i \bold{P} \bold{f}_j }{\sqrt{ (1+\bold{f}_i \bold{P} \bold{f}_i) (1+\bold{f}_j \bold{P} \bold{f}_j)}} \right) ^m
where b_{0i}=p_{i}=P(X_i=1)=c_i + (d_i - c_i) \Phi(\tau_i), b_{1i}=(d_i-c_i)\phi(\tau_i), b_{2i}=(d_i-c_i)\tau_i \phi(\tau_i) / \sqrt{2}, and b_{3i}=(d_i-c_i)(\tau_i^2 - 1)\phi(\tau_i) / \sqrt{6}.
The least squares criterion \sum_{i<j} ( p_{ij} - \hat{p}_{ij})^2 is minimized for estimating the unknown model parameters (McDonald, 1982a, 1982b, 1997).
For derivations of standard errors and fit statistics see Maydeu-Olivares (2001) and Swaminathan and Rogers (2016).
For the statistical properties of the NOHARM approach see Knol and Berger (1991), Finch (2011) or Svetina and Levy (2016).
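A minimal R sketch of this approximation (an illustrative helper, not the package's internal code) evaluates \hat{p}_{ij} for one item pair from thresholds tau, an I x D loading matrix, a D x D covariance matrix and fixed asymptotes:
approx_pij <- function(tau, Fm, Pm, cc, dd, i, j){
  # series coefficients b_{0i}, ..., b_{3i} for all items
  b0 <- cc + (dd - cc) * stats::pnorm(tau)
  b1 <- (dd - cc) * stats::dnorm(tau)
  b2 <- (dd - cc) * tau * stats::dnorm(tau) / sqrt(2)
  b3 <- (dd - cc) * (tau^2 - 1) * stats::dnorm(tau) / sqrt(6)
  # scaled bilinear form of the loadings entering the m-th power
  r <- as.numeric( ( Fm[i,] %*% Pm %*% Fm[j,] ) /
          sqrt( (1 + Fm[i,] %*% Pm %*% Fm[i,]) * (1 + Fm[j,] %*% Pm %*% Fm[j,]) ) )
  b0[i]*b0[j] + b1[i]*b1[j]*r + b2[i]*b2[j]*r^2 + b3[i]*b3[j]*r^3
}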
Value
A list. The most important entries are
tanaka |
Tanaka fit statistic |
rmsr |
RMSR fit statistic |
N.itempair |
Sample size per item pair |
pm |
Product moment matrix |
wgtm |
Matrix of weights for each item pair |
sumwgtm |
Sum of lower triangle matrix |
lower |
Lower asymptotes |
upper |
Upper asymptotes |
residuals |
Residual matrix from the approximation of the product-moment matrix |
final.constants |
Final constants |
factor.cor |
Covariance matrix |
thresholds |
Threshold parameters |
uniquenesses |
Uniquenesses |
loadings |
Matrix of standardized factor loadings (delta parametrization) |
loadings.theta |
Matrix of factor loadings |
residcorr |
Matrix of residual correlations |
Nobs |
Number of observations |
Nitems |
Number of items |
Fpatt |
Pattern matrix for the loading matrix F |
Ppatt |
Pattern matrix for the covariance matrix P |
Psipatt |
Pattern matrix for the residual correlation matrix \Psi |
dat |
Used dataset |
dimensions |
Number of dimensions |
iter |
Number of iterations |
Nestpars |
Number of estimated parameters |
chisquare |
Chi square statistic |
df |
Degrees of freedom |
chisquare_df |
Ratio of the chi square statistic and degrees of freedom |
rmsea |
RMSEA statistic |
p.chisquare |
Significance value for the chi square statistic |
omega.rel |
Reliability of the sum score according to Green and Yang (2009) |
References
Finch, H. (2011). Multidimensional item response theory parameter estimation with nonsimple structure items. Applied Psychological Measurement, 35(1), 67-82. doi:10.1177/0146621610367787
Fraser, C., & McDonald, R. P. (1988). NOHARM: Least squares item factor analysis. Multivariate Behavioral Research, 23, 267-269. doi:10.1207/s15327906mbr2302_9
Fraser, C., & McDonald, R. P. (2012). NOHARM 4 Manual.
http://noharm.niagararesearch.ca/nh4man/nhman.html.
Knol, D. L., & Berger, M. P. (1991). Empirical comparison between factor analysis and multidimensional item response models. Multivariate Behavioral Research, 26(3), 457-477. doi:10.1207/s15327906mbr2603_5
Maydeu-Olivares, A. (2001). Multidimensional item response theory modeling of binary data: Large sample properties of NOHARM estimates. Journal of Educational and Behavioral Statistics, 26(1), 51-71. doi:10.3102/10769986026001051
McDonald, R. P. (1982a). Linear versus nonlinear models in item response theory. Applied Psychological Measurement, 6(4), 379-396. doi:10.1177/014662168200600402
McDonald, R. P. (1982b). Unidimensional and multidimensional models for item response theory. I.R.T., C.A.T. conference, Minneapolis, 1982, Proceedings.
McDonald, R. P. (1997). Normal-ogive multidimensional model. In W. van der Linden & R. K. Hambleton (1997): Handbook of modern item response theory (pp. 257-269). New York: Springer. doi:10.1007/978-1-4757-2691-6
Svetina, D., & Levy, R. (2016). Dimensionality in compensatory MIRT when complex structure exists: Evaluation of DETECT and NOHARM. The Journal of Experimental Education, 84(2), 398-420. doi:10.1080/00220973.2015.1048845
Swaminathan, H., & Rogers, H. J. (2016). Normal-ogive multidimensional models. In W. J. van der Linden (Ed.). Handbook of item response theory. Volume One: Models (pp. 167-187). Boca Raton: CRC Press. doi:10.1201/9781315374512
See Also
EAP person parameter estimates can be obtained by R2noharm.EAP
.
Model fit can be assessed by modelfit.sirt
.
See R2noharm
for running the NOHARM software from within R.
See Fraser and McDonald (1988, 2012) for an implementation of the NOHARM model which is available as freeware (http://noharm.niagararesearch.ca/; the link appears to be broken in the meantime).
Examples
#############################################################################
# EXAMPLE 1: Two-dimensional IRT model with 10 items
#############################################################################
#**** data simulation
set.seed(9776)
N <- 3400 # sample size
# define difficulties
f0 <- c( .5, .25, -.25, -.5, 0, -.5, -.25, .25, .5, 0 )
I <- length(f0)
# define loadings
f1 <- matrix( 0, I, 2 )
f1[ 1:5,1] <- c(.8,.7,.6,.5, .5)
f1[ 6:10,2] <- c(.8,.7,.6,.5, .5 )
# covariance matrix
Pval <- matrix( c(1,.5,.5,1), 2, 2 )
# simulate theta
library(mvtnorm)
theta <- mvtnorm::rmvnorm(N, mean=c(0,0), sigma=Pval )
# simulate item responses
dat <- matrix( NA, N, I )
for (ii in 1:I){ # ii <- 1
dat[,ii] <- 1*( stats::pnorm(f0[ii]+theta[,1]*f1[ii,1]+theta[,2]*f1[ii,2])>
stats::runif(N) )
}
colnames(dat) <- paste0("I", 1:I)
#**** Model 1: Two-dimensional CFA with estimated item loadings
# define pattern matrices
Pval <- .3+0*Pval
Ppatt <- 1*(Pval>0)
diag(Ppatt) <- 0
diag(Pval) <- 1
Fval <- .7 * ( f1>0)
Fpatt <- 1 * ( Fval > 0 )
# estimate model
mod1 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt, Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod1)
# EAP ability estimates
pmod1 <- sirt::R2noharm.EAP(mod1, theta.k=seq(-4,4,len=10) )
# model fit
summary( sirt::modelfit.sirt(mod1) )
## Not run:
#*** compare results with NOHARM software
noharm.path <- "c:/NOHARM" # specify path for noharm software
mod1a <- sirt::R2noharm( dat=dat, model.type="CFA", F.pattern=Fpatt, F.init=Fval,
P.pattern=Ppatt, P.init=Pval, writename="r2noharm_example",
noharm.path=noharm.path, dec="," )
summary(mod1a)
#**** Model 1c: put some equality constraints
Fpatt[ c(1,4),1] <- 3
Fpatt[ cbind( c(3,7), c(1,2)) ] <- 4
mod1c <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt, Fpatt=Fpatt, Fval=Fval, Pval=Pval)
summary(mod1c)
#**** Model 2: Two-dimensional CFA with correlated residuals
# define pattern matrix for residual correlation
Psipatt <- 0*diag(I)
Psipatt[1,2] <- 1
Psival <- 0*Psipatt
# estimate model
mod2 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval,
Psival=Psival, Psipatt=Psipatt )
summary(mod2)
#**** Model 3: Two-dimensional Rasch model
# pattern matrices
Fval <- matrix(0,10,2)
Fval[1:5,1] <- Fval[6:10,2] <- 1
Fpatt <- 0*Fval
Ppatt <- Pval <- matrix(1,2,2)
Pval[1,2] <- Pval[2,1] <- 0
# estimate model
mod3 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod3)
# model fit
summary( sirt::modelfit.sirt( mod3 ))
#** compare fit with NOHARM
noharm.path <- "c:/NOHARM"
P.pattern <- Ppatt ; P.init <- Pval
F.pattern <- Fpatt ; F.init <- Fval
mod3b <- sirt::R2noharm( dat=dat, model.type="CFA",
F.pattern=F.pattern, F.init=F.init, P.pattern=P.pattern,
P.init=P.init, writename="example_sim_2dim_rasch",
noharm.path=noharm.path, dec="," )
summary(mod3b)
#############################################################################
# EXAMPLE 2: data.read
#############################################################################
data(data.read)
dat <- data.read
I <- ncol(dat)
#**** Model 1: Unidimensional Rasch model
Fpatt <- matrix( 0, I, 1 )
Fval <- 1 + 0*Fpatt
Ppatt <- Pval <- matrix(1,1,1)
# estimate model
mod1 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod1)
plot(mod1) # semPaths plot
#**** Model 2: Rasch model in which item pairs within a testlet are excluded
wgtm <- matrix( 1, I, I )
wgtm[1:4,1:4] <- wgtm[5:8,5:8] <- wgtm[ 9:12, 9:12] <- 0
# estimation
mod2 <- sirt::noharm.sirt(dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval, wgtm=wgtm)
summary(mod2)
#**** Model 3: Rasch model with correlated residuals
Psipatt <- Psival <- 0*diag(I)
Psipatt[1:4,1:4] <- Psipatt[5:8,5:8] <- Psipatt[ 9:12, 9:12] <- 1
diag(Psipatt) <- 0
Psival <- .6*(Psipatt>0)
# estimation
mod3 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval,
Psival=Psival, Psipatt=Psipatt )
summary(mod3)
# allow only positive residual correlations
mod3b <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt, Fpatt=Fpatt, Fval=Fval, Pval=Pval,
Psival=Psival, Psipatt=Psipatt, pos.residcorr=TRUE)
summary(mod3b)
#* constrain residual correlations
Psipatt[1:4,1:4] <- 2
Psipatt[5:8,5:8] <- 3
Psipatt[ 9:12, 9:12] <- 4
mod3c <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt, Fpatt=Fpatt, Fval=Fval, Pval=Pval,
Psival=Psival, Psipatt=Psipatt, pos.residcorr=TRUE)
summary(mod3c)
#**** Model 4: Rasch testlet model
Fval <- Fpatt <- matrix( 0, I, 4 )
Fval[,1] <- Fval[1:4,2] <- Fval[5:8,3] <- Fval[9:12,4 ] <- 1
Ppatt <- Pval <- diag(4)
colnames(Ppatt) <- c("g", "A", "B","C")
Pval <- .5*Pval
# estimation
mod4 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod4)
# allow only positive variance entries
mod4b <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval,
pos.variance=TRUE )
summary(mod4b)
#**** Model 5: Bifactor model
Fval <- matrix( 0, I, 4 )
Fval[,1] <- Fval[1:4,2] <- Fval[5:8,3] <- Fval[9:12,4 ] <- .6
Fpatt <- 1 * ( Fval > 0 )
Pval <- diag(4)
Ppatt <- 0*Pval
colnames(Ppatt) <- c("g", "A", "B","C")
# estimation
mod5 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod5)
# allow only positive loadings
mod5b <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval,
pos.loading=TRUE )
summary(mod5b)
summary( sirt::modelfit.sirt(mod5b))
#**** Model 6: 3-dimensional Rasch model
Fval <- matrix( 0, I, 3 )
Fval[1:4,1] <- Fval[5:8,2] <- Fval[9:12,3 ] <- 1
Fpatt <- 0*Fval
Pval <- .6*diag(3)
diag(Pval) <- 1
Ppatt <- 1+0*Pval
colnames(Ppatt) <- c("A", "B","C")
# estimation
mod6 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod6)
summary( sirt::modelfit.sirt(mod6) ) # model fit
#**** Model 7: 3-dimensional 2PL model
Fval <- matrix( 0, I, 3 )
Fval[1:4,1] <- Fval[5:8,2] <- Fval[9:12,3 ] <- 1
Fpatt <- Fval
Pval <- .6*diag(3)
diag(Pval) <- 1
Ppatt <- 1+0*Pval
diag(Ppatt) <- 0
colnames(Ppatt) <- c("A", "B","C")
# estimation
mod7 <- sirt::noharm.sirt( dat=dat, Ppatt=Ppatt,Fpatt=Fpatt, Fval=Fval, Pval=Pval )
summary(mod7)
summary( sirt::modelfit.sirt(mod7) )
#**** Model 8: Exploratory factor analysis with 3 dimensions
# estimation
mod8 <- sirt::noharm.sirt( dat=dat, dimensions=3 )
summary(mod8)
#############################################################################
# EXAMPLE 3: Product-moment matrix input, McDonald (1997)
#############################################################################
# data from Table 1 of McDonald (1997, p. 266)
pm0 <- "
0.828
0.567 0.658
0.664 0.560 0.772
0.532 0.428 0.501 0.606
0.718 0.567 0.672 0.526 0.843
"
pm <- miceadds::string_to_matrix(x=pm0, as_numeric=TRUE, extend=TRUE)
I <- nrow(pm)
rownames(pm) <- colnames(pm) <- paste0("I", 1:I)
#- Model 1: Unidimensional model
Fval <- matrix(.7, nrow=I, ncol=1)
Fpatt <- 1+0*Fval
Pval <- matrix(1, nrow=1,ncol=1)
Ppatt <- 0*Pval
mod1 <- sirt::noharm.sirt(pm=pm, N=1000, Fval=Fval, Fpatt=Fpatt, Pval=Pval, Ppatt=Ppatt)
summary(mod1)
#- Model 2: Two-dimensional exploratory model
mod2 <- sirt::noharm.sirt(pm=pm, N=1000, dimensions=2)
summary(mod2)
#- Model 3: Unidimensional model with correlated residuals
Psival <- matrix(0, nrow=I, ncol=I)
Psipatt <- 0*Psival
Psipatt[5,1] <- 1
mod3 <- sirt::noharm.sirt(pm=pm, N=1000, Fval=Fval, Fpatt=Fpatt, Pval=Pval, Ppatt=Ppatt,
Psival=Psival, Psipatt=Psipatt)
summary(mod3)
## End(Not run)
Nonparametric Estimation of Item Response Functions
Description
This function performs nonparametric estimation of item response functions (Ramsay, 1991).
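The underlying idea can be sketched in a few lines: the item response function is obtained by Nadaraya-Watson kernel smoothing of the item responses against the theta estimates. A simplified sketch with a normal kernel (the bandwidth h is a placeholder, not the internal choice driven by bwscale):
np_irf_sketch <- function(x, theta, thetagrid, h=.5){
  # x: vector of 0/1 responses for one item; returns P(X=1 | theta) on the grid
  sapply( thetagrid, function(t0){
    w <- stats::dnorm( (theta - t0) / h )   # kernel weights
    sum( w * x ) / sum(w)                   # locally weighted mean
  } )
}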
Usage
np.dich(dat, theta, thetagrid, progress=FALSE, bwscale=1.1,
method="normal")
Arguments
dat |
An N x I data frame of dichotomous item responses |
theta |
Estimated theta values, for example weighted likelihood estimates from wle.rasch |
thetagrid |
A vector of theta values where the nonparametric item response functions shall be evaluated. |
progress |
Display progress? |
bwscale |
The bandwidth parameter |
method |
The default is "normal" |
Value
A list with following entries
dat |
Original data frame |
thetagrid |
Vector of theta values at which the item response functions are evaluated |
theta |
Used theta values as person parameter estimates |
estimate |
Estimated item response functions |
... |
References
Ramsay, J. O. (1991). Kernel smoothing approaches to nonparametric item characteristic curve estimation. Psychometrika, 56, 611-630.
Examples
#############################################################################
# EXAMPLE 1: Reading dataset
#############################################################################
data( data.read )
dat <- data.read
# estimate Rasch model
mod <- sirt::rasch.mml2( dat )
# WLE estimation
wle1 <- sirt::wle.rasch( dat=dat, b=mod$item$b )$theta
# nonparametric function estimation
np1 <- sirt::np.dich( dat=dat, theta=wle1, thetagrid=seq(-2.5, 2.5, len=100 ) )
print( str(np1))
# plot nonparametric item response curves
plot( np1, b=mod$item$b )
Includes Confidence Interval in Parameter Summary Table
Description
Includes confidence interval in parameter summary table.
Usage
parmsummary_extend(dfr, level=.95, est_label="est", se_label="se",
df_label="df")
Arguments
dfr |
Data frame containing parameter summary |
level |
Confidence level |
est_label |
Label for parameter estimate |
se_label |
Label for standard error |
df_label |
Label for degrees of freedom |
Value
Extended parameter summary table
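The added columns follow from the usual normal approximation. A minimal sketch that reproduces the quantities in the example below (hypothetical helper, normal approximation only):
extend_sketch <- function(est, se, level=.95){
  z <- stats::qnorm( 1 - (1 - level)/2 )
  t <- est / se
  data.frame( t=t, p=2*stats::pnorm( -abs(t) ),
        lower=est - z*se, upper=est + z*se )
}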
Examples
#############################################################################
## EXAMPLE 1: Toy example parameter summary table
#############################################################################
dfr <- data.frame( "parm"=c("b0", "b1" ), "est"=c(0.1, 1.3 ),
"se"=c(.21, .32) )
print( sirt::parmsummary_extend(dfr), digits=4 )
## parm est se t p lower95 upper95
## 1 b0 0.1 0.21 0.4762 6.339e-01 -0.3116 0.5116
## 2 b1 1.3 0.32 4.0625 4.855e-05 0.6728 1.9272
Cumulative Function for the Bivariate Normal Distribution
Description
This function evaluates the bivariate normal distribution
\Phi_2 ( x, y ; \rho )
assuming zero means and unit variances. It uses a simple approximation
by Cox and Wermuth (1991) with corrected formulas in Hong (1999).
Usage
pbivnorm2(x, y, rho)
Arguments
x |
Vector of x values |
y |
Vector of y values |
rho |
Vector of correlations between random normal variates |
Value
Vector of probabilities
Note
The function is less precise for correlations near 1 or -1.
References
Cox, D. R., & Wermuth, N. (1991). A simple approximation for bivariate and trivariate normal integrals. International Statistical Review, 59(2), 263-269.
Hong, H. P. (1999). An approximation to bivariate and trivariate normal integrals. Engineering and Environmental Systems, 16(2), 115-127. doi:10.1080/02630259908970256
See Also
See also the
pbivnorm::pbivnorm
function in the pbivnorm package.
Examples
library(pbivnorm)
# define input
x <- c(0, 0, .5, 1, 1 )
y <- c( 0, -.5, 1, 3, .5 )
rho <- c( .2, .8, -.4, .6, .5 )
# compare pbivnorm2 and pbivnorm functions
pbiv2 <- sirt::pbivnorm2( x=x, y=y, rho=rho )
pbiv <- pbivnorm::pbivnorm( x, y, rho=rho )
max( abs(pbiv-pbiv2))
## [1] 0.0030626
round( cbind( x, y, rho,pbiv, pbiv2 ), 4 )
## x y rho pbiv pbiv2
## [1,] 0.0 0.0 0.2 0.2820 0.2821
## [2,] 0.0 -0.5 0.8 0.2778 0.2747
## [3,] 0.5 1.0 -0.4 0.5514 0.5514
## [4,] 1.0 3.0 0.6 0.8412 0.8412
## [5,] 1.0 0.5 0.5 0.6303 0.6304
Conversion of the Parameterization of the Partial Credit Model
Description
Converts a parameterization of the partial credit model (see Details).
Usage
pcm.conversion(b)
Arguments
b |
Matrix of item-category-wise intercepts |
Details
Assume that the input matrix b containing parameters b_{ik} is defined according to the following parameterization of the partial credit model
P( X_{pi}=k | \theta_p ) \propto \exp ( k \theta_p - b_{ik} )
if item i possesses K_i categories. The transformed parameterization is defined as
b_{ik}=k \delta_i + \sum_{v=1}^{k} \tau_{iv} \quad \mbox{with} \quad \sum_{k=1}^{K_i} \tau_{ik}=0
The function pcm.conversion has the \delta and \tau parameters as values. The \delta parameter is simply \delta_i=b_{iK_i} / K_i.
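A minimal sketch of this conversion (assuming a complete I x K matrix b, i.e., all items have the same number of categories; an illustrative helper, not the package code):
pcm_convert_sketch <- function(b){
  b <- as.matrix(b)
  K <- ncol(b)
  delta <- b[, K] / K
  # category increments b_{ik} - b_{i,k-1} (with b_{i0}=0), shifted by delta
  binc <- cbind( b[, 1], b[, -1, drop=FALSE] - b[, -K, drop=FALSE] )
  tau <- binc - delta       # delta[i] is subtracted in every column of row i
  list( delta=delta, tau=tau )   # rowSums(tau) is zero by construction
}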
Value
List with the following entries
delta |
Vector of \delta parameters |
tau |
Matrix of \tau parameters |
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Transformation PCM for data.mg
#############################################################################
library(CDM)
data(data.mg,package="CDM")
dat <- data.mg[ 1:1000, paste0("I",1:11) ]
#*** Model 1: estimate partial credit model in parameterization "PCM"
mod1a <- TAM::tam.mml( dat, irtmodel="PCM")
# use parameterization "PCM2"
mod1b <- TAM::tam.mml( dat, irtmodel="PCM2")
summary(mod1a)
summary(mod1b)
# convert parameterization of Model 1a into parameterization of Model 1b
b <- mod1a$item[, c("AXsi_.Cat1","AXsi_.Cat2","AXsi_.Cat3") ]
# compare results
sirt::pcm.conversion(b)
mod1b$xsi
## End(Not run)
Item and Person Fit Statistics for the Partial Credit Model
Description
Computes item and person fit statistics in the partial credit model (Wright & Masters, 1990). The rating scale model is accommodated as a particular partial credit model (see Example 3).
Usage
pcm.fit(b, theta, dat)
Arguments
b |
Matrix with item category parameters (see Examples) |
theta |
Vector with estimated person parameters |
dat |
Dataset with item responses |
Value
A list with entries
itemfit |
Item fit statistics |
personfit |
Person fit statistics |
References
Wright, B. D., & Masters, G. N. (1990). Computation of outfit and infit statistics. Rasch Measurement Transactions, 3:4, 84-85.
See Also
See also personfit.stat
for person fit statistics for dichotomous
item responses. See also the PerFit package for further person
fit statistics.
Item fit in other R packages: eRm::itemfit, TAM::tam.fit, mirt::itemfit, ltm::item.fit.
Person fit in other R packages: eRm::personfit, mirt::personfit, ltm::person.fit.
See pcm.conversion
for conversions of different
parametrizations of the partial credit model.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Partial credit model
#############################################################################
data(data.Students,package="CDM")
dat <- data.Students
# select items
items <- c(paste0("sc", 1:4 ), paste0("mj", 1:4 ) )
dat <- dat[,items]
dat <- dat[ rowSums( 1 - is.na(dat) ) > 0, ]
#*** Model 1a: Partial credit model in TAM
# estimate model
mod1a <- TAM::tam.mml( resp=dat )
summary(mod1a)
# estimate person parameters
wle1a <- TAM::tam.wle(mod1a)
# extract item parameters
b1 <- - mod1a$AXsi[, -1 ]
# parametrization in xsi parameters
b2 <- matrix( mod1a$xsi$xsi, ncol=3, byrow=TRUE )
# convert b2 to b1
b1b <- 0*b1
b1b[,1] <- b2[,1]
b1b[,2] <- rowSums( b2[,1:2] )
b1b[,3] <- rowSums( b2[,1:3] )
# assess fit
fit1a <- sirt::pcm.fit(b=b1, theta=wle1a$theta, dat)
fit1a$item
#############################################################################
# EXAMPLE 2: Rasch model
#############################################################################
data(data.read)
dat <- data.read
#*** Rasch model in TAM
# estimate model
mod <- TAM::tam.mml( resp=dat )
summary(mod)
# estimate person parameters
wle <- TAM::tam.wle(mod)
# extract item parameters
b1 <- - mod$AXsi[, -1 ]
# assess fit
fit1a <- sirt::pcm.fit(b=b1, theta=wle$theta, dat)
fit1a$item
#############################################################################
# EXAMPLE 3: Rating scale model
#############################################################################
data(data.Students,package="CDM")
dat <- data.Students
items <- paste0("sc", 1:4 )
dat <- dat[,items]
dat <- dat[ rowSums( 1 - is.na(dat) ) > 0, ]
#*** Model 1: Rating scale model in TAM
# estimate model
mod1 <- TAM::tam.mml( resp=dat, irtmodel="RSM")
summary(mod1)
# estimate person parameters
wle1 <- TAM::tam.wle(mod1)
# extract item parameters
b1 <- - mod1$AXsi[, -1 ]
# fit statistic
sirt::pcm.fit(b=b1, theta=wle1$theta, dat)
## End(Not run)
Person Parameter Estimation of the Rasch Copula Model (Braeken, 2011)
Description
Maximum likelihood estimates (MLE) of ability are provided for the Rasch copula model.
Usage
person.parameter.rasch.copula(raschcopula.object, numdiff.parm=0.001,
conv.parm=0.001, maxiter=20, stepwidth=1,
print.summary=TRUE, ...)
Arguments
raschcopula.object |
Object generated by the rasch.copula2 function |
numdiff.parm |
Step width parameter for numerical differentiation |
conv.parm |
Convergence criterion |
maxiter |
Maximum number of iterations |
stepwidth |
Maximal increment in iterations |
print.summary |
Print summary? |
... |
Further arguments to be passed |
Value
A list with following entries
person |
Estimated person parameters |
se.inflat |
Inflation of individual standard errors due to local dependence |
theta.table |
Ability estimates for each unique response pattern |
pattern.in.data |
Item response pattern |
summary.theta.table |
Summary statistics of person parameter estimates |
See Also
See rasch.copula2
for estimating Rasch copula models.
Examples
#############################################################################
# EXAMPLE 1: Reading Data
#############################################################################
data(data.read)
dat <- data.read
# define item cluster
itemcluster <- rep( 1:3, each=4 )
mod1 <- sirt::rasch.copula2( dat, itemcluster=itemcluster )
summary(mod1)
# person parameter estimation under the Rasch copula model
pmod1 <- sirt::person.parameter.rasch.copula(raschcopula.object=mod1 )
## Mean percentage standard error inflation
## missing.pattern Mperc.seinflat
## 1 1 6.35
## Not run:
#############################################################################
# EXAMPLE 2: 12 items nested within 3 item clusters (testlets)
# Cluster 1 -> Items 1-4; Cluster 2 -> Items 6-9; Cluster 3 -> Items 10-12
#############################################################################
set.seed(967)
I <- 12 # number of items
n <- 450 # number of persons
b <- seq(-2,2, len=I) # item difficulties
b <- sample(b) # sample item difficulties
theta <- stats::rnorm( n, sd=1 ) # person abilities
# itemcluster
itemcluster <- rep(0,I)
itemcluster[ 1:4 ] <- 1
itemcluster[ 6:9 ] <- 2
itemcluster[ 10:12 ] <- 3
# residual correlations
rho <- c( .35, .25, .30 )
# simulate data
dat <- sirt::sim.rasch.dep( theta, b, itemcluster, rho )
colnames(dat) <- paste("I", seq(1,ncol(dat)), sep="")
# estimate Rasch copula model
mod1 <- sirt::rasch.copula2( dat, itemcluster=itemcluster )
summary(mod1)
# person parameter estimation under the Rasch copula model
pmod1 <- sirt::person.parameter.rasch.copula(raschcopula.object=mod1 )
## Mean percentage standard error inflation
## missing.pattern Mperc.seinflat
## 1 1 10.48
## End(Not run)
Person Fit Statistics for the Rasch Model
Description
This function collects some person fit statistics for the Rasch model (Karabatsos, 2003; Meijer & Sijtsma, 2001).
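For instance, the person outfit and infit statistics follow the Wright and Masters (1990) formulas; a minimal sketch for dichotomous responses and complete data (illustrative only, not the package's internal code):
personfit_sketch <- function(dat, abil, b){
  dat <- as.matrix(dat)
  P <- stats::plogis( outer(abil, b, "-") )   # expected response probabilities
  V <- P * (1 - P)                            # model variances
  Z2 <- ( dat - P )^2 / V                     # squared standardized residuals
  data.frame( outfit=rowMeans(Z2),
        infit=rowSums( (dat - P)^2 ) / rowSums(V) )
}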
Usage
personfit.stat(dat, abil, b)
Arguments
dat |
An N x I data frame of dichotomous item responses |
abil |
An ability estimate, e.g. the WLE |
b |
Estimated item difficulty |
Value
A data frame with following columns (see Meijer & Sijtsma 2001 for a review of different person fit statistics):
case |
Case index |
abil |
Ability estimate |
mean |
Person mean of correctly solved items |
caution |
Caution index |
depend |
Dependability index |
ECI1 |
Extended caution index ECI1 |
ECI2 |
Extended caution index ECI2 |
ECI3 |
Extended caution index ECI3 |
ECI4 |
Extended caution index ECI4 |
ECI5 |
Extended caution index ECI5 |
ECI6 |
Extended caution index ECI6 |
l0 |
Fit statistic l0 (likelihood-based) |
lz |
Fit statistic lz (standardized log-likelihood) |
outfit |
Person outfit statistic |
infit |
Person infit statistic |
rpbis |
Point biserial correlation of item responses and item p values |
rpbis.itemdiff |
Point biserial correlation of item responses and item difficulties |
U3 |
Fit statistic |
References
Karabatsos, G. (2003). Comparing the aberrant response detection performance of thirty-six person-fit statistics. Applied Measurement in Education, 16, 277-298.
Meijer, R. R., & Sijtsma, K. (2001). Methodology review: Evaluating person fit. Applied Psychological Measurement, 25, 107-135.
See Also
See pcm.fit
for person fit in the partial credit model.
See the irtProb and PerFit packages for person fit statistics
and person response curves and functions included in other packages:
mirt::personfit
,
eRm::personfit
and
ltm::person.fit
.
Examples
#############################################################################
# EXAMPLE 1: Person fit Reading Data
#############################################################################
data(data.read)
dat <- data.read
# estimate Rasch model
mod <- sirt::rasch.mml2( dat )
# WLE
wle1 <- sirt::wle.rasch( dat,b=mod$item$b )$theta
b <- mod$item$b # item difficulty
# evaluate person fit
pf1 <- sirt::personfit.stat( dat=dat, abil=wle1, b=b)
## Not run:
# dimensional analysis of person fit statistics
x0 <- stats::na.omit(pf1[, -c(1:3) ] )
stats::factanal( x=x0, factors=2, rotation="promax" )
## Loadings:
## Factor1 Factor2
## caution 0.914
## depend 0.293 0.750
## ECI1 0.869 0.160
## ECI2 0.869 0.162
## ECI3 1.011
## ECI4 1.159 -0.269
## ECI5 1.012
## ECI6 0.879 0.130
## l0 0.409 -1.255
## lz -0.504 -0.529
## outfit 0.297 0.702
## infit 0.362 0.695
## rpbis -1.014
## rpbis.itemdiff 1.032
## U3 0.735 0.309
##
## Factor Correlations:
## Factor1 Factor2
## Factor1 1.000 -0.727
## Factor2 -0.727 1.000
##
## End(Not run)
Calculation of Probabilities and Moments for the Generalized Logistic Item Response Model
Description
Calculation of probabilities and moments for the generalized logistic item response model (Stukel, 1988).
Usage
pgenlogis(x, alpha1=0, alpha2=0)
genlogis.moments(alpha1, alpha2)
Arguments
x |
Vector |
alpha1 |
Upper tail parameter |
alpha2 |
Lower tail parameter |
Details
The class of generalized logistic link functions contains the most important link functions using the following specifications (Stukel, 1988):
logistic link function L: L(x) \approx G_{(\alpha_1=0, \alpha_2=0)}[ x ]
probit link function \Phi: \Phi(x) \approx G_{(\alpha_1=0.165, \alpha_2=0.165)}[ 1.47 x ]
loglog link function H: H(x) \approx G_{(\alpha_1=-0.037, \alpha_2=0.62)}[ -0.39+1.20x-0.007x^2 ]
cloglog link function H: H(x) \approx G_{(\alpha_1=0.062, \alpha_2=-0.037)}[ 0.54+1.64x+0.28x^2+0.046x^3 ]
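The generalized logistic function G is obtained by applying the standard logistic function to Stukel's piecewise transformation h of the linear predictor. A sketch of this transformation as described in Stukel (1988) (the internal implementation may differ in details):
h_stukel <- function(eta, alpha1, alpha2){
  h <- eta
  pos <- eta >= 0
  # upper tail (eta >= 0) is governed by alpha1
  if (alpha1 > 0) h[pos] <- ( exp( alpha1 * eta[pos] ) - 1 ) / alpha1
  if (alpha1 < 0) h[pos] <- -log( 1 - alpha1 * eta[pos] ) / alpha1
  # lower tail (eta < 0) is governed by alpha2
  if (alpha2 > 0) h[!pos] <- -( exp( alpha2 * abs(eta[!pos]) ) - 1 ) / alpha2
  if (alpha2 < 0) h[!pos] <- log( 1 - alpha2 * abs(eta[!pos]) ) / alpha2
  h
}
# G(x)=plogis(h(x)); this reproduces the first example below:
# stats::plogis( h_stukel( c(-.3, 0, .25, 1), alpha1=0, alpha2=.6 ) )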
Value
Vector of probabilities or moments
References
Stukel, T. A. (1988). Generalized logistic models. Journal of the American Statistical Association, 83(402), 426-431. doi:10.1080/01621459.1988.10478613
Examples
sirt::pgenlogis( x=c(-.3, 0, .25, 1 ), alpha1=0, alpha2=.6 )
## [1] 0.4185580 0.5000000 0.5621765 0.7310586
####################################################################
# compare link functions
x <- seq( -3,3, .1 )
#***
# logistic link
y <- sirt::pgenlogis( x, alpha1=0, alpha2=0 )
plot( x, stats::plogis(x), type="l", main="Logistic Link", lwd=2)
points( x, y, pch=1, col=2 )
#***
# probit link
round( sirt::genlogis.moments( alpha1=.165, alpha2=.165 ), 3 )
## M SD Var
## 0.000 1.472 2.167
# SD of generalized logistic link function is 1.472
y <- sirt::pgenlogis( x * 1.47, alpha1=.165, alpha2=.165 )
plot( x, stats::pnorm(x), type="l", main="Probit Link", lwd=2)
points( x, y, pch=1, col=2 )
#***
# loglog link
y <- sirt::pgenlogis( -.39 + 1.20*x -.007*x^2, alpha1=-.037, alpha2=.62 )
plot( x, exp( - exp( -x ) ), type="l", main="Loglog Link", lwd=2,
ylab="loglog(x)=exp(-exp(-x))" )
points( x, y, pch=17, col=2 )
#***
# cloglog link
y <- sirt::pgenlogis( .54+1.64*x +.28*x^2 + .046*x^3, alpha1=.062, alpha2=-.037 )
plot( x, 1-exp( - exp(x) ), type="l", main="Cloglog Link", lwd=2,
ylab="cloglog(x)=1-exp(-exp(x))" )
points( x, y, pch=17, col=2 )
Plausible Value Imputation in Generalized Logistic Item Response Model
Description
This function performs unidimensional plausible value imputation (Adams & Wu, 2007; Mislevy, 1991).
Usage
plausible.value.imputation.raschtype(data=NULL, f.yi.qk=NULL, X,
Z=NULL, beta0=rep(0, ncol(X)), sig0=1, b=rep(1, ncol(X)),
a=rep(1, length(b)), c=rep(0, length(b)), d=1+0*b,
alpha1=0, alpha2=0, theta.list=seq(-5, 5, len=50),
cluster=NULL, iter, burnin, nplausible=1, printprogress=TRUE)
Arguments
data |
An N x I data frame of dichotomous item responses |
f.yi.qk |
An optional matrix which contains the individual likelihood. This matrix is produced by, for example, rasch.mml2 |
X |
A matrix of individual covariates for the latent regression of \theta on X |
Z |
A matrix of individual covariates for the regression of individual residual variances on Z |
beta0 |
Initial vector of regression coefficients |
sig0 |
Initial vector of coefficients for the variance heterogeneity model |
b |
Vector of item difficulties. It must not be provided if the individual likelihood f.yi.qk is specified. |
a |
Optional vector of item slopes |
c |
Optional vector of lower item asymptotes |
d |
Optional vector of upper item asymptotes |
alpha1 |
Parameter \alpha_1 of the generalized logistic link function |
alpha2 |
Parameter \alpha_2 of the generalized logistic link function |
theta.list |
Vector of theta values at which the ability distribution should be evaluated |
cluster |
Cluster identifier (e.g. schools or classes) for including theta means in the plausible imputation. |
iter |
Number of iterations |
burnin |
Number of burn-in iterations for plausible value imputation |
nplausible |
Number of plausible values |
printprogress |
A logical indicating whether iteration progress should be displayed at the console. |
Details
Plausible values are drawn from the latent regression model with heterogeneous variances:
\theta_p=X_p \beta + \epsilon_p \quad, \quad \epsilon_p \sim N( 0, \sigma_p^2 ) \quad, \quad \log( \sigma_p )=Z_p \gamma + \nu_p
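Conceptually, each plausible value is a draw from the individual posterior, i.e., the product of the individual likelihood and the normal latent regression density evaluated on the theta grid. A minimal sketch of a single draw (illustrative helper with fixed regression parameters mu and sigma per person):
draw_pv_sketch <- function(f.yi.qk, theta.list, mu, sigma){
  TP <- length(theta.list)
  N <- length(mu)
  tg <- matrix( theta.list, nrow=N, ncol=TP, byrow=TRUE )
  prior <- stats::dnorm( ( tg - mu ) / sigma )  # person-specific normal prior
  post <- f.yi.qk * prior
  post <- post / rowSums(post)                  # normalize person-wise
  # sample one theta grid point per person from the posterior
  idx <- apply( post, 1, function(p){ sample.int(TP, 1, prob=p) } )
  theta.list[ idx ]
}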
Value
A list with following entries:
coefs.X |
Sampled regression coefficients for covariates X |
coefs.Z |
Sampled coefficients for modeling variance heterogeneity for covariates Z |
pvdraws |
Matrix with drawn plausible values |
posterior |
Posterior distribution from last iteration |
EAP |
Individual EAP estimate |
SE.EAP |
Standard error of the EAP estimate |
pv.indexes |
Index of iterations for which plausible values were drawn |
References
Adams, R., & Wu. M. (2007). The mixed-coefficients multinomial logit model: A generalized form of the Rasch model. In M. von Davier & C. H. Carstensen: Multivariate and Mixture Distribution Rasch Models: Extensions and Applications (pp. 57-76). New York: Springer.
Mislevy, R. J. (1991). Randomization-based inference about latent variables from complex samples. Psychometrika, 56, 177-196.
See Also
For estimating the latent regression model see
latent.regression.em.raschtype
.
Examples
#############################################################################
# EXAMPLE 1: Rasch model with covariates
#############################################################################
set.seed(899)
I <- 21 # number of items
b <- seq(-2,2, len=I) # item difficulties
n <- 2000 # number of students
# simulate theta and covariates
theta <- stats::rnorm( n )
x <- .7 * theta + stats::rnorm( n, .5 )
y <- .2 * x+ .3*theta + stats::rnorm( n, .4 )
dfr <- data.frame( theta, 1, x, y )
# simulate Rasch model
dat1 <- sirt::sim.raschtype( theta=theta, b=b )
# Plausible value draws
pv1 <- sirt::plausible.value.imputation.raschtype(data=dat1, X=dfr[,-1], b=b,
nplausible=3, iter=10, burnin=5)
# estimate linear regression based on first plausible value
mod1 <- stats::lm( pv1$pvdraws[,1] ~ x+y )
summary(mod1)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.27755 0.02121 -13.09 <2e-16 ***
## x 0.40483 0.01640 24.69 <2e-16 ***
## y 0.20307 0.01822 11.15 <2e-16 ***
# true regression estimate
summary( stats::lm( theta ~ x + y ) )
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.27821 0.01984 -14.02 <2e-16 ***
## x 0.40747 0.01534 26.56 <2e-16 ***
## y 0.18189 0.01704 10.67 <2e-16 ***
## Not run:
#############################################################################
# EXAMPLE 2: Classical test theory, homogeneous regression variance
#############################################################################
set.seed(899)
n <- 3000 # number of students
x <- round( stats::runif( n, 0,1 ) )
y <- stats::rnorm(n)
# simulate true score theta
theta <- .4*x + .5 * y + stats::rnorm(n)
# simulate observed score by adding measurement error
sig.e <- rep( sqrt(.40), n )
theta_obs <- theta + stats::rnorm( n, sd=sig.e)
# define theta grid for evaluation of density
theta.list <- mean(theta_obs) + stats::sd(theta_obs) * seq( - 5, 5, length=21)
# compute individual likelihood
f.yi.qk <- stats::dnorm( outer( theta_obs, theta.list, "-" ) / sig.e )
f.yi.qk <- f.yi.qk / rowSums(f.yi.qk)
# define covariates
X <- cbind( 1, x, y )
# draw plausible values
mod2 <- sirt::plausible.value.imputation.raschtype( f.yi.qk=f.yi.qk,
theta.list=theta.list, X=X, iter=10, burnin=5)
# linear regression
mod1 <- stats::lm( mod2$pvdraws[,1] ~ x+y )
summary(mod1)
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -0.01393 0.02655 -0.525 0.6
## x 0.35686 0.03739 9.544 <2e-16 ***
## y 0.53759 0.01872 28.718 <2e-16 ***
# true regression model
summary( stats::lm( theta ~ x + y ) )
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) 0.002931 0.026171 0.112 0.911
## x 0.359954 0.036864 9.764 <2e-16 ***
## y 0.509073 0.018456 27.584 <2e-16 ***
#############################################################################
# EXAMPLE 3: Classical test theory, heterogeneous regression variance
#############################################################################
set.seed(899)
n <- 5000 # number of students
x <- round( stats::runif( n, 0,1 ) )
y <- stats::rnorm(n)
# simulate true score theta
theta <- .4*x + .5 * y + stats::rnorm(n) * ( 1 - .4 * x )
# simulate observed score by adding measurement error
sig.e <- rep( sqrt(.40), n )
theta_obs <- theta + stats::rnorm( n, sd=sig.e)
# define theta grid for evaluation of density
theta.list <- mean(theta_obs) + stats::sd(theta_obs) * seq( - 5, 5, length=21)
# compute individual likelihood
f.yi.qk <- stats::dnorm( outer( theta_obs, theta.list, "-" ) / sig.e )
f.yi.qk <- f.yi.qk / rowSums(f.yi.qk)
# define covariates
X <- cbind( 1, x, y )
# draw plausible values (assuming variance homogeneity)
mod3a <- sirt::plausible.value.imputation.raschtype( f.yi.qk=f.yi.qk,
theta.list=theta.list, X=X, iter=10, burnin=5)
# draw plausible values (assuming variance heterogeneity)
# -> include predictor Z
mod3b <- sirt::plausible.value.imputation.raschtype( f.yi.qk=f.yi.qk,
theta.list=theta.list, X=X, Z=X, iter=10, burnin=5)
# investigate variance of theta conditional on x
res3 <- sapply( 0:1, FUN=function(vv){
c( stats::var(theta[x==vv]), stats::var(mod3b$pvdraw[x==vv,1]),
stats::var(mod3a$pvdraw[x==vv,1]))})
rownames(res3) <- c("true", "pv(hetero)", "pv(homog)" )
colnames(res3) <- c("x=0","x=1")
## > round( res3, 2 )
## x=0 x=1
## true 1.30 0.58
## pv(hetero) 1.29 0.55
## pv(homog) 1.06 0.77
## -> assuming heteroscedastic variances recovers true conditional variance
## End(Not run)
Plot Function for Objects of Class mcmc.sirt
Description
Plot function for objects of class mcmc.sirt
. These objects are generated
by: mcmc.2pno
, mcmc.2pnoh
,
mcmc.3pno.testlet
, mcmc.2pno.ml
Usage
## S3 method for class 'mcmc.sirt'
plot( x, layout=1, conflevel=0.9, round.summ=3,
lag.max=.1, col.smooth="red", lwd.smooth=2, col.ci="orange",
cex.summ=1, ask=FALSE, ...)
Arguments
x |
Object of class |
layout |
Layout type (1 or 2) |
conflevel |
Confidence level (only applies to layout=2) |
round.summ |
Number of digits to be rounded in summary (only applies to layout=2) |
lag.max |
Maximum lag for autocorrelation plot (only applies to layout=2) |
col.smooth |
Color of smooth trend in traceplot (only applies to layout=2) |
lwd.smooth |
Line width of smooth trend in traceplot (only applies to layout=2) |
col.ci |
Color for displaying confidence interval (only applies to layout=2) |
cex.summ |
Cex size for descriptive summary (only applies to layout=2) |
ask |
Ask for a new plot (only applies to layout=2) |
... |
Further arguments to be passed |
See Also
mcmc.2pno
, mcmc.2pnoh
,
mcmc.3pno.testlet
, mcmc.2pno.ml
Plot Method for Object of Class np.dich
Description
This function plots nonparametric item response functions estimated with np.dich.
Usage
## S3 method for class 'np.dich'
plot(x, b, infit=NULL, outfit=NULL,
nsize=100, askplot=TRUE, progress=TRUE, bands=FALSE,
plot.b=FALSE, shade=FALSE, shadecol="burlywood1", ...)
Arguments
x |
Object of class |
b |
Estimated item difficulty (threshold) |
infit |
Infit (optional) |
outfit |
Outfit (optional) |
nsize |
XXX |
askplot |
Ask for new plot? |
progress |
Display progress? |
bands |
Draw confidence bands? |
plot.b |
Plot difficulty parameter? |
shade |
Shade curves? |
shadecol |
Shade color |
... |
Further arguments to be passed |
See Also
For examples see np.dich
.
Polychoric Correlation
Description
This function estimates the polychoric correlation coefficient using maximum likelihood estimation (Olsson, 1979).
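For each item pair, the approach maximizes the multinomial likelihood of the observed contingency table, with cell probabilities given by bivariate normal rectangle probabilities. A minimal sketch with fixed thresholds, using mvtnorm for the rectangles (illustrative helper, not the internal Rcpp implementation):
loglik_polychoric <- function(rho, tau1, tau2, n){
  t1 <- c(-Inf, tau1, Inf)
  t2 <- c(-Inf, tau2, Inf)
  R <- matrix( c(1, rho, rho, 1), 2, 2 )
  ll <- 0
  for (i in seq_len(nrow(n))){
    for (j in seq_len(ncol(n))){
      # bivariate normal probability of the rectangle for cell (i,j)
      p <- mvtnorm::pmvnorm( lower=c(t1[i], t2[j]),
              upper=c(t1[i+1], t2[j+1]), corr=R )
      ll <- ll + n[i,j] * log( max( as.numeric(p), 1e-300 ) )
    }
  }
  ll
}
# maximize over rho, e.g.:
# stats::optimize( loglik_polychoric, c(-.99,.99), tau1=tau1, tau2=tau2,
#     n=n, maximum=TRUE )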
Usage
polychoric2(dat, maxiter=100, cor.smooth=TRUE, use_pbv=1, conv=1e-10,
rho_init=NULL, weights=NULL)
## exported Rcpp function
sirt_rcpp_polychoric2( dat, maxK, maxiter, use_pbv, conv, rho_init, weights)
Arguments
dat |
A dataset with integer values |
maxiter |
Maximum number of iterations |
cor.smooth |
An optional logical indicating whether the polychoric correlation matrix should be smoothed to ensure positive definiteness. |
use_pbv |
Integer indicating whether the pbv package is used for the computation of the bivariate normal distribution. |
conv |
Convergence criterion |
rho_init |
Optional matrix of initial values for polychoric correlations |
weights |
Optional vector of sampling weights |
maxK |
Maximum number of categories |
Value
A list with following entries
tau |
Matrix of thresholds |
rho |
Polychoric correlation matrix |
Nobs |
Sample size for every item pair |
maxcat |
Maximum number of categories per item |
References
Cox, D. R., & Wermuth, N. (1991). A simple approximation for bivariate and trivariate normal integrals. International Statistical Review, 59(2), 263-269.
Olsson, U. (1979). Maximum likelihood estimation of the polychoric correlation coefficient. Psychometrika, 44(4), 443-460. doi:10.1007/BF02296207
See Also
See the psych::polychoric
function in the psych package.
For estimating tetrachoric correlations see tetrachoric2
.
Examples
#############################################################################
# EXAMPLE 1: data.Students | activity scale
#############################################################################
data(data.Students, package="CDM")
dat <- data.Students[, paste0("act", 1:5 ) ]
# polychoric correlation from psych package
library(psych)
t0 <- psych::polychoric(dat)$rho
# Olsson method (maximum likelihood estimation)
t1 <- sirt::polychoric2(dat)$rho
# maximum absolute difference
max( abs( t0 - t1 ) )
## [1] 0.004102429
Parsing a Prior Model
Description
Parses a string specifying a prior model which is needed
for the prior
argument in LAM::amh
Usage
prior_model_parse(prior_model)
Arguments
prior_model |
String specifying the prior conforming to R syntax. |
Value
List with specified prior distributions for parameters
as needed for the prior
argument in LAM::amh
See Also
LAM::amh
Examples
#############################################################################
# EXAMPLE 1: Toy example prior distributions
#############################################################################
#*** define prior model as a string
prior_model <- "
# prior distributions means
mu1 ~ dnorm( NA, mean=0, sd=1 )
mu2 ~ dnorm(NA) # mean T2 and T3
# prior distribution standard deviation
sig1 ~ dunif(NA,0, max=10)
"
#*** convert priors into a list
res <- sirt::prior_model_parse( prior_model )
str(res)
## List of 3
## $ mu1 :List of 2
## ..$ : chr "dnorm"
## ..$ :List of 3
## .. ..$ NA : num NA
## .. ..$ mean: num 0
## .. ..$ sd : num 1
## $ mu2 :List of 2
## ..$ : chr "dnorm"
## ..$ :List of 1
## .. ..$ : num NA
## $ sig1:List of 2
## ..$ : chr "dunif"
## ..$ :List of 3
## .. ..$ NA : num NA
## .. ..$ NA : num 0
## .. ..$ max: num 10
Proportional Reduction of Mean Squared Error (PRMSE) for Subscale Scores
Description
This function estimates the proportional reduction of mean squared error (PRMSE) according to Haberman (Haberman 2008; Haberman, Sinharay & Puhan, 2008; see Meijer et al. 2017 for an overview).
Usage
prmse.subscores.scales(data, subscale)
Arguments
data |
An N x I data frame of item responses |
subscale |
Vector of labels corresponding to subscales |
Value
Matrix with columns corresponding to subscales. The symbol X denotes the subscale and Z the whole scale (see the Examples section for the structure of this matrix).
References
Haberman, S. J. (2008). When can subscores have value? Journal of Educational and Behavioral Statistics, 33, 204-229.
Haberman, S., Sinharay, S., & Puhan, G. (2008). Reporting subscores for institutions. British Journal of Mathematical and Statistical Psychology, 62, 79-95.
Meijer, R. R., Boeve, A. J., Tendeiro, J. N., Bosker, R. J., & Albers, C. J. (2017). The use of subscores in higher education: When is this useful? Frontiers in Psychology, 8.
See Also
See the subscore package for computing subscores and the PRMSE measures,
especially subscore::CTTsub
.
Examples
#############################################################################
# EXAMPLE 1: PRMSE Reading data data.read
#############################################################################
data( data.read )
p1 <- sirt::prmse.subscores.scales(data=data.read,
subscale=substring( colnames(data.read), 1,1 ) )
print( p1, digits=3 )
## A B C
## N 328.000 328.000 328.000
## nX 4.000 4.000 4.000
## M.X 2.616 2.811 3.253
## Var.X 1.381 1.059 1.107
## SD.X 1.175 1.029 1.052
## alpha.X 0.545 0.381 0.640
## [...]
## nZ 12.000 12.000 12.000
## M.Z 8.680 8.680 8.680
## Var.Z 5.668 5.668 5.668
## SD.Z 2.381 2.381 2.381
## alpha.Z 0.677 0.677 0.677
## [...]
## cor.TX_Z 0.799 0.835 0.684
## rmse.X 0.585 0.500 0.505
## rmse.Z 0.522 0.350 0.614
## rmse.XZ 0.495 0.350 0.478
## prmse.X 0.545 0.381 0.640
## prmse.Z 0.638 0.697 0.468
## prmse.XZ 0.674 0.697 0.677
#-> For scales A and B, the subscore does not achieve a lower RMSE than
# the whole scale, but for scale C the subscore-based RMSE is smaller than
# the RMSE of a prediction based on the whole scale.
Probabilistic Guttman Model
Description
This function estimates the probabilistic Guttman model which is a special case of an ordered latent trait model (Hanson, 2000; Proctor, 1970).
Usage
prob.guttman(dat, pid=NULL, guess.equal=FALSE, slip.equal=FALSE,
itemlevel=NULL, conv1=0.001, glob.conv=0.001, mmliter=500)
## S3 method for class 'prob.guttman'
summary(object,...)
## S3 method for class 'prob.guttman'
anova(object,...)
## S3 method for class 'prob.guttman'
logLik(object,...)
## S3 method for class 'prob.guttman'
IRT.irfprob(object,...)
## S3 method for class 'prob.guttman'
IRT.likelihood(object,...)
## S3 method for class 'prob.guttman'
IRT.posterior(object,...)
Arguments
dat |
An |
pid |
Optional vector of person identifiers |
guess.equal |
Should the same guessing parameter be estimated for all items? |
slip.equal |
Should the same slipping parameter be estimated for all items? |
itemlevel |
A vector of item levels of the Guttman scale for each item. |
conv1 |
Convergence criterion for item parameters |
glob.conv |
Global convergence criterion for the deviance |
mmliter |
Maximum number of iterations |
object |
Object of class |
... |
Further arguments to be passed |
Value
An object of class prob.guttman
person |
Estimated person parameters |
item |
Estimated item parameters |
theta.k |
Ability levels |
trait |
Estimated trait distribution |
ic |
Information criteria |
deviance |
Deviance |
iter |
Number of iterations |
itemdesign |
Specified allocation of items to trait levels |
References
Hanson, B. (2000). IRT parameter estimation using the EM algorithm. Technical Report.
Proctor, C. H. (1970). A probabilistic formulation and statistical analysis for Guttman scaling. Psychometrika, 35, 73-78.
Examples
#############################################################################
# EXAMPLE 1: Dataset Reading
#############################################################################
data(data.read)
dat <- data.read
#***
# Model 1: estimate probabilistic Guttman model
mod1 <- sirt::prob.guttman( dat )
summary(mod1)
#***
# Model 2: probabilistic Guttman model with equal guessing and slipping parameters
mod2 <- sirt::prob.guttman( dat, guess.equal=TRUE, slip.equal=TRUE)
summary(mod2)
#***
# Model 3: Guttman model with three a priori specified item levels
itemlevel <- rep(1,12)
itemlevel[ c(2,5,8,10,12) ] <- 2
itemlevel[ c(3,4,6) ] <- 3
mod3 <- sirt::prob.guttman( dat, itemlevel=itemlevel )
summary(mod3)
## Not run:
#***
# Model3m: estimate Model 3 in mirt
library(mirt)
# define four ordered latent classes
Theta <- scan(nlines=1)
0 0 0 1 0 0 1 1 0 1 1 1
Theta <- matrix( Theta, nrow=4, ncol=3,byrow=TRUE)
# define mirt model
I <- ncol(dat) # I=12
mirtmodel <- mirt::mirt.model("
# specify factors for each item level
C1=1,7,9,11
C2=2,5,8,10,12
C3=3,4,6
")
# get initial parameter values
mod.pars <- mirt::mirt(dat, model=mirtmodel, pars="values")
# redefine initial parameter values
mod.pars[ mod.pars$name=="d","value" ] <- -1
mod.pars[ mod.pars$name %in% paste0("a",1:3) & mod.pars$est,"value" ] <- 2
mod.pars
# define prior for latent class analysis
lca_prior <- function(Theta,Etable){
# number of latent Theta classes
TP <- nrow(Theta)
# prior in initial iteration
if ( is.null(Etable) ){ prior <- rep( 1/TP, TP ) }
# process Etable (this is correct for datasets without missing data)
if ( ! is.null(Etable) ){
# sum over correct and incorrect expected responses
prior <- ( rowSums(Etable[, seq(1,2*I,2)]) + rowSums(Etable[,seq(2,2*I,2)]) )/I
}
prior <- prior / sum(prior)
return(prior)
}
# estimate model in mirt
mod3m <- mirt::mirt(dat, mirtmodel, pars=mod.pars, verbose=TRUE,
technical=list( customTheta=Theta, customPriorFun=lca_prior) )
# correct number of estimated parameters
mod3m@nest <- as.integer(sum(mod.pars$est) + nrow(Theta)-1 )
# extract log-likelihood and compute AIC and BIC
mod3m@logLik
( AIC <- -2*mod3m@logLik+2*mod3m@nest )
( BIC <- -2*mod3m@logLik+log(mod3m@Data$N)*mod3m@nest )
# compare with information criteria from prob.guttman
mod3$ic
# model fit in mirt
mirt::M2(mod3m)
# extract coefficients
( cmod3m <- sirt::mirt.wrapper.coef(mod3m) )
# compare estimated distributions
round( cbind( "sirt"=mod3$trait$prob, "mirt"=mod3m@Prior[[1]] ), 5 )
## sirt mirt
## [1,] 0.13709 0.13765
## [2,] 0.30266 0.30303
## [3,] 0.15239 0.15085
## [4,] 0.40786 0.40846
# compare estimated item parameters
ipars <- data.frame( "guess.sirt"=mod3$item$guess,
"guess.mirt"=plogis( cmod3m$coef$d ) )
ipars$slip.sirt <- mod3$item$slip
ipars$slip.mirt <- 1-plogis( rowSums(cmod3m$coef[, c("a1","a2","a3","d") ] ) )
round( ipars, 4 )
## guess.sirt guess.mirt slip.sirt slip.mirt
## 1 0.7810 0.7804 0.1383 0.1382
## 2 0.4513 0.4517 0.0373 0.0368
## 3 0.3203 0.3200 0.0747 0.0751
## 4 0.3009 0.3007 0.3082 0.3087
## 5 0.5776 0.5779 0.1800 0.1798
## 6 0.3758 0.3759 0.3047 0.3051
## 7 0.7262 0.7259 0.0625 0.0623
## [...]
#***
# Model 4: Monotone item response function estimated in mirt
# define four ordered latent classes
Theta <- scan(nlines=1)
0 0 0 1 0 0 1 1 0 1 1 1
Theta <- matrix( Theta, nrow=4, ncol=3,byrow=TRUE)
# define mirt model
I <- ncol(dat) # I=12
mirtmodel <- mirt::mirt.model("
# specify factors for each item level
C1=1-12
C2=1-12
C3=1-12
")
# get initial parameter values
mod.pars <- mirt::mirt(dat, model=mirtmodel, pars="values")
# redefine initial parameter values
mod.pars[ mod.pars$name=="d","value" ] <- -1
mod.pars[ mod.pars$name %in% paste0("a",1:3) & mod.pars$est,"value" ] <- .6
# set lower bound to zero to ensure monotonicity
mod.pars[ mod.pars$name %in% paste0("a",1:3),"lbound" ] <- 0
mod.pars
# estimate model in mirt
mod4 <- mirt::mirt(dat, mirtmodel, pars=mod.pars, verbose=TRUE,
technical=list( customTheta=Theta, customPriorFun=lca_prior) )
# correct number of estimated parameters
mod4@nest <- as.integer(sum(mod.pars$est) + nrow(Theta)-1 )
# extract coefficients
cmod4 <- sirt::mirt.wrapper.coef(mod4)
cmod4
# compute item response functions
cmod4c <- cmod4$coef[, c("d", "a1", "a2", "a3" ) ]
probs4 <- t( apply( cmod4c, 1, FUN=function(ll){
plogis(cumsum(as.numeric(ll))) } ) )
matplot( 1:4, t(probs4), type="b", pch=1:I)
## End(Not run)
Estimation of the Q_3 Statistic (Yen, 1984)
Description
This function estimates the Q_3 statistic according to Yen (1984). The statistic Q_3 is calculated for every item pair (i,j) as the correlation between item residuals after fitting the Rasch model.
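For the Rasch model the computation can be sketched in a few lines (complete data assumed; illustrative helper, not the package's internal code):
q3_sketch <- function(dat, theta, b){
  P <- stats::plogis( outer(theta, b, "-") )  # expected item responses
  resid <- as.matrix(dat) - P                 # item residuals
  stats::cor(resid)                           # I x I matrix of Q3 statistics
}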
Usage
Q3(dat, theta, b, progress=TRUE)
Arguments
dat |
An N x I data frame of dichotomous item responses |
theta |
Vector of length N with person parameter estimates (e.g., WLEs) |
b |
Vector of length I with item difficulties |
progress |
Should iteration progress be displayed? |
Value
A list with following entries
q3.matrix |
An I x I matrix of Q_3 statistics |
q3.long |
The Q_3 statistics of all item pairs arranged in long format |
expected |
An N x I matrix of expected item responses |
residual |
An N x I matrix of item residuals |
Q3.stat |
Vector with descriptive statistics of Q_3 |
References
Yen, W. M. (1984). Effects of local item dependence on the fit and equating performance of the three-parameter logistic model. Applied Psychological Measurement, 8, 125-145.
See Also
For the estimation of the average Q_3
statistic within testlets see
Q3.testlet
.
For modeling testlet effects see mcmc.3pno.testlet
.
For handling local dependencies in IRT models see
rasch.copula2
, rasch.pml3
or
rasch.pairwise.itemcluster
.
Examples
#############################################################################
# EXAMPLE 1: data.read. The 12 items are arranged in 4 testlets
#############################################################################
data(data.read)
# estimate the Rasch model
mod <- sirt::rasch.mml2( data.read)
# estimate WLEs
mod.wle <- sirt::wle.rasch( dat=data.read, b=mod$item$b )
# calculate Yen's Q3 statistic
mod.q3 <- sirt::Q3( dat=data.read, theta=mod.wle$theta, b=mod$item$b )
## Yen's Q3 Statistic based on an estimated theta score
## *** 12 Items | 66 item pairs
## *** Q3 Descriptives
## M SD Min 10% 25% 50% 75% 90% Max
## -0.085 0.110 -0.261 -0.194 -0.152 -0.107 -0.051 0.041 0.412
# plot Q3 statistics
I <- ncol(data.read)
image( 1:I, 1:I, mod.q3$q3.matrix, col=gray( 1 - (0:32)/32),
xlab="Item", ylab="Item")
abline(v=c(5,9)) # borders for testlets
abline(h=c(5,9))
## Not run:
# obtain Q3 statistic from modelfit.sirt function which is based on the
# posterior distribution of theta and not on observed values
fitmod <- sirt::modelfit.sirt( mod )
# extract Q3 statistic
q3stat <- fitmod$itempairs$Q3
## > summary(q3stat)
## Min. 1st Qu. Median Mean 3rd Qu. Max.
## -0.21760 -0.11590 -0.07280 -0.05545 -0.01220 0.44710
## > sd(q3stat)
## [1] 0.1101451
## End(Not run)
Q_3 Statistic of Yen (1984) for Testlets
Description
This function calculates the average Q_3
statistic (Yen, 1984) within and
between testlets.
Usage
Q3.testlet(q3.res, testlet.matrix, progress=TRUE)
Arguments
q3.res |
An object generated by Q3 |
testlet.matrix |
A matrix with two columns. The first column contains names of the testlets and the second names of the items. See the examples for the definition of such matrices. |
progress |
Logical indicating whether computation progress should be displayed. |
Value
A list with following entries
testlet.q3 |
Data frame with average Q_3 statistics within testlets |
testlet.q3.korr |
Matrix of average Q_3 statistics between testlets |
References
Yen, W. M. (1984). Effects of local item dependence on the fit and equating performance of the three-parameter logistic model. Applied Psychological Measurement, 8, 125-145.
See Also
For estimating all Q_3
statistics between item pairs use
Q3
.
Examples
#############################################################################
# EXAMPLE 1: data.read. The 12 items are arranged in 4 testlets
#############################################################################
data(data.read)
# estimate the Rasch model
mod <- sirt::rasch.mml2( data.read)
mod$item
# estimate WLEs
mod.wle <- sirt::wle.rasch( dat=data.read, b=mod$item$b )
# Yen's Q3 statistic
mod.q3 <- sirt::Q3( dat=data.read, theta=mod.wle$theta, b=mod$item$b )
# Yen's Q3 statistic with testlets
items <- colnames(data.read)
testlet.matrix <- cbind( substring( items,1,1), items )
mod.testletq3 <- sirt::Q3.testlet( q3.res=mod.q3,testlet.matrix=testlet.matrix)
mod.testletq3
Calculation of Quasi Monte Carlo Integration Points
Description
This function calculates integration nodes based on the multivariate normal distribution with zero mean vector and identity covariance matrix. See Pan and Thompson (2007) and Gonzales et al. (2006) for details.
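The construction can be sketched as transforming quasi-random uniform numbers into standard normal deviates through the quantile function (see also the Note below on sfsmisc::QUnif):
qmc_nodes_sketch <- function(snodes, ndim){
  u <- sfsmisc::QUnif( n=snodes, min=0, max=1, p=ndim )  # Halton-type uniforms
  stats::qnorm(u)   # map to standard normal integration nodes
}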
Usage
qmc.nodes(snodes, ndim)
Arguments
snodes |
Number of integration nodes |
ndim |
Number of dimensions |
Value
theta |
A matrix of integration points |
Note
This function uses the
sfsmisc::QUnif
function from
the sfsmisc package.
References
Gonzalez, J., Tuerlinckx, F., De Boeck, P., & Cools, R. (2006). Numerical integration in logistic-normal models. Computational Statistics & Data Analysis, 51, 1535-1548.
Pan, J., & Thompson, R. (2007). Quasi-Monte Carlo estimation in generalized linear mixed models. Computational Statistics & Data Analysis, 51, 5765-5775.
Examples
## some toy examples
# 5 nodes on one dimension
qmc.nodes( snodes=5, ndim=1 )
## [,1]
## [1,] 0.0000000
## [2,] -0.3863753
## [3,] 0.8409238
## [4,] -0.8426682
## [5,] 0.3850568
# 7 nodes on two dimensions
qmc.nodes( snodes=7, ndim=2 )
## [,1] [,2]
## [1,] 0.00000000 -0.43072730
## [2,] -0.38637529 0.79736332
## [3,] 0.84092380 -1.73230641
## [4,] -0.84266815 -0.03840544
## [5,] 0.38505683 1.51466109
## [6,] -0.00122394 -0.86704605
## [7,] 1.35539115 0.33491073
Running ConQuest From Within R
Description
The function R2conquest
runs the IRT software ConQuest
(Wu, Adams, Wilson & Haldane, 2007) from within R.
Other functions are utility functions for reading item parameters, plausible values or person-item maps.
Usage
R2conquest(dat, path.conquest, conquest.name="console", converge=0.001,
deviancechange=1e-04, iter=800, nodes=20, minnode=-6, maxnode=6,
show.conquestoutput=FALSE, name="rasch", pid=1:(nrow(dat)), wgt=NULL, X=NULL,
set.constraints=NULL, model="item", regression=NULL,
itemcodes=seq(0,max(dat,na.rm=TRUE)), constraints=NULL, digits=5, onlysyntax=FALSE,
qmatrix=NULL, import.regression=NULL, anchor.regression=NULL,
anchor.covariance=NULL, pv=TRUE, designmatrix=NULL, only.calibration=FALSE,
init_parameters=NULL, n_plausible=10, persons.elim=TRUE, est.wle=TRUE,
save.bat=TRUE, use.bat=FALSE, read.output=TRUE, ignore.pid=FALSE)
## S3 method for class 'R2conquest'
summary(object, ...)
# read all terms in a show file or only some terms
read.show(showfile)
read.show.term(showfile, term)
# read regression parameters in a show file
read.show.regression(showfile)
# read unidimensional plausible values from a pv file
read.pv(pvfile, npv=5)
# read multidimensional plausible values
read.multidimpv(pvfile, ndim, npv=5)
# read person-item map
read.pimap(showfile)
Arguments
dat |
Data frame of item responses |
path.conquest |
Directory where the ConQuest executable file is located |
conquest.name |
Name of the ConQuest executable. |
converge |
Maximal change in parameters |
deviancechange |
Maximal change in deviance |
iter |
Maximum number of iterations |
nodes |
Number of nodes for integration |
minnode |
Minimum value of the discrete grid of theta nodes |
maxnode |
Maximum value of the discrete grid of theta nodes |
show.conquestoutput |
Show ConQuest run log file on console? |
name |
Name of the output files. The default is "rasch". |
pid |
Person identifier |
wgt |
Vector of person weights |
X |
Matrix of covariates for the latent regression model (e.g. gender, socioeconomic status, ..) or for the item design (e.g. raters, booklets, ...) |
set.constraints |
This is the set.constraints statement in ConQuest. It can be "cases", "items", or "none". |
model |
Definition of the model statement. It can be, for example, "item" (Rasch model) or "item+item*step" (partial credit model). |
regression |
The ConQuest regression statement (for example "hisei migra"; see Example 1) |
itemcodes |
Vector of valid codes for item responses. E.g., for partial credit data with at most 3 points it must be c(0,1,2,3). |
constraints |
Matrix of item parameter constraints. 1st column: Item names, 2nd column: Item parameters. It only works correctly for dichotomous data. |
digits |
Number of digits for covariates in the latent regression model |
onlysyntax |
Should only be ConQuest syntax generated? |
qmatrix |
Matrix of item loadings on dimensions in a multidimensional IRT model |
import.regression |
Name of a file with initial covariance parameters (follow the ConQuest specification rules!) |
anchor.regression |
Name of a file with anchored regression parameters |
anchor.covariance |
Name of a file with anchored covariance parameters (follow the ConQuest specification rules!) |
pv |
Draw plausible values? |
designmatrix |
Design matrix for item parameters (see the ConQuest manual) |
only.calibration |
Estimate only item parameters and not person parameters (no WLEs or plausible values are estimated)? |
init_parameters |
Name of a file with initial item parameters (follow the ConQuest specification rules!) |
n_plausible |
Number of plausible values |
persons.elim |
Eliminate persons with only missing item responses? |
est.wle |
Estimate the weighted likelihood estimate (WLE)? |
save.bat |
Save bat file? |
use.bat |
Run ConQuest from within R through a direct call of the generated bat file? |
read.output |
Should ConQuest output files be processed? Default is TRUE. |
ignore.pid |
Logical indicating whether person identifiers (pid) should be ignored |
object |
Object of class R2conquest |
showfile |
A ConQuest show file (shw file) |
term |
Name of the term to be extracted in the show file |
pvfile |
File with plausible values |
ndim |
Number of dimensions |
npv |
Number of plausible values |
... |
Further arguments to be passed |
Details
Consult the ConQuest manual (Wu et al., 2007) for specification details.
Value
A list with several entries
item |
Data frame with item parameters and item statistics |
person |
Data frame with person parameters |
shw.itemparameter |
ConQuest output table for item parameters |
shw.regrparameter |
ConQuest output table for regression parameters |
... |
More values |
References
Wu, M. L., Adams, R. J., Wilson, M. R. & Haldane, S. (2007). ACER ConQuest Version 2.0. Mulgrave. https://shop.acer.edu.au/acer-shop/group/CON3.
See Also
See also the eat package (https://r-forge.r-project.org/projects/eat/) for elaborate functionality of using ConQuest from within R. See also the conquestr package for another R wrapper to the ConQuest software (at least version 4 of ConQuest has to be installed).
See also the TAM package for similar (and even extended) functionality for specifying item response models.
Examples
## Not run:
# define ConQuest path
path.conquest <- "C:/Conquest/"
#############################################################################
# EXAMPLE 1: Dichotomous data (data.pisaMath)
#############################################################################
library(sirt)
data(data.pisaMath)
dat <- data.pisaMath$data
# select items
items <- colnames(dat)[ which( substring( colnames(dat), 1, 1)=="M" ) ]
#***
# Model 11: Rasch model
mod11 <- sirt::R2conquest(dat=dat[,items], path.conquest=path.conquest,
pid=dat$idstud, name="mod11")
summary(mod11)
# read show file
shw11 <- sirt::read.show( "mod11.shw" )
# read person-item map
pi11 <- sirt::read.pimap(showfile="mod11.shw")
#***
# Model 12: Rasch model with fixed item difficulties (from Model 11)
mod12 <- sirt::R2conquest(dat=dat[,items], path.conquest=path.conquest,
pid=dat$idstud, constraints=mod11$item[, c("item","itemdiff")],
name="mod12")
summary(mod12)
#***
# Model 13: Latent regression model with predictors female, hisei and migra
mod13a <- sirt::R2conquest(dat=dat[,items], path.conquest=path.conquest,
pid=dat$idstud, X=dat[, c("female", "hisei", "migra") ],
name="mod13a")
summary(mod13a)
# latent regression with a subset of predictors
mod13b <- sirt::R2conquest(dat=dat[,items], path.conquest=path.conquest,
pid=dat$idstud, X=dat[, c("female", "hisei", "migra") ],
regression="hisei migra", name="mod13b")
#***
# Model 14: Differential item functioning (female)
mod14 <- sirt::R2conquest(dat=dat[,items], path.conquest=path.conquest,
pid=dat$idstud, X=dat[, c("female"), drop=FALSE],
model="item+female+item*female", regression="", name="mod14")
#############################################################################
# EXAMPLE 2: Polytomous data (data.Students)
#############################################################################
library(CDM)
data(data.Students)
dat <- data.Students
# select items
items <- grep.vec( "act", colnames(dat) )$x
#***
# Model 21: Partial credit model
mod21 <- sirt::R2conquest(dat=dat[,items], path.conquest=path.conquest,
model="item+item*step", name="mod21")
#***
# Model 22: Rating scale model
mod22 <- sirt::R2conquest(dat=dat[,items], path.conquest=path.conquest,
model="item+step", name="mod22")
#***
# Model 23: Multidimensional model
items <- grep.vec( c("act", "sc" ), colnames(dat), "OR" )$x
qmatrix <- matrix( 0, nrow=length(items), ncol=2 )
qmatrix[1:5,1] <- 1
qmatrix[6:9,2] <- 1
mod23 <- sirt::R2conquest(dat=dat[,items], path.conquest=path.conquest,
model="item+item*step", qmatrix=qmatrix, name="mod23")
#############################################################################
# EXAMPLE 3: Multi facet models (data.ratings1)
#############################################################################
library(sirt)
data(data.ratings1)
dat <- data.ratings1
items <- paste0("k",1:5)
# use numeric rater IDs
raters <- as.numeric( substring( paste( dat$rater ), 3 ) )
#***
# Model 31: Rater model 'item+item*step+rater'
mod31 <- sirt::R2conquest(dat=dat[,items], path.conquest=path.conquest,
itemcodes=0:3, model="item+item*step+rater",
pid=dat$idstud, X=data.frame("rater"=raters),
regression="", name="mod31")
#***
# Model 32: Rater model 'item+item*step+rater+item*rater'
mod32 <- sirt::R2conquest(dat=dat[,items], path.conquest=path.conquest,
model="item+item*step+rater+item*rater",
pid=dat$idstud, X=data.frame("rater"=raters),
regression="", name="mod32")
## End(Not run)
Estimation of a NOHARM Analysis from within R
Description
This function enables the estimation of a NOHARM analysis (Fraser & McDonald, 1988; McDonald, 1982a, 1982b, 1997) from within R. NOHARM estimates a compensatory multidimensional factor analysis for dichotomous response data. Arguments of this function strictly follow the rules of the NOHARM manual (see Fraser & McDonald, 2012; Lee & Lee, 2016).
Usage
R2noharm(dat=NULL,pm=NULL, n=NULL, model.type, weights=NULL, dimensions=NULL,
guesses=NULL, noharm.path, F.pattern=NULL, F.init=NULL,
P.pattern=NULL, P.init=NULL, digits.pm=4, writename=NULL,
display.fit=5, dec=".", display=TRUE)
## S3 method for class 'R2noharm'
summary(object, logfile=NULL, ...)
Arguments
dat |
An N x I data frame of dichotomous item responses |
pm |
A matrix or a vector containing product-moment correlations |
n |
Sample size. This value needs to be provided only if pm is used as input. |
model.type |
Can be "CFA" (confirmatory factor analysis) or "EFA" (exploratory factor analysis) |
weights |
Optional vector of student weights |
dimensions |
Number of dimensions in exploratory factor analysis |
guesses |
An optional vector of fixed guessing parameters of length I (the number of items) |
noharm.path |
Local path where the NOHARM 4 command line 64-bit version is located. |
F.pattern |
Pattern matrix for the loading matrix F |
F.init |
Initial values for the loading matrix F |
P.pattern |
Pattern matrix for the covariance matrix P |
P.init |
Initial values for the covariance matrix P |
digits.pm |
Number of digits after decimal separator which are used for estimation |
writename |
Name for NOHARM input and output files |
display.fit |
How many digits (after decimal separator) should be used for printing results on the R console? |
dec |
Decimal separator: "." or "," |
display |
Display output? |
object |
Object of class R2noharm |
logfile |
File name if the summary should be sunk into a file |
... |
Further arguments to be passed |
Details
NOHARM estimates a multidimensional compensatory item response model with the probit link function \Phi. For the item response X_{pi} of person p on item i, the model equation is

P( X_{pi}=1 | \bold{\theta}_p )=c_i + ( 1 - c_i ) \Phi( f_{i0} + f_{i1} \theta_{p1} + \ldots + f_{iD} \theta_{pD} )

where F=(f_{id}) is the loading matrix and P the covariance matrix of \bold{\theta}_p. The guessing parameters c_i must be provided as fixed values. For the definition of the F and P matrices, please consult the NOHARM manual.
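To make the model equation concrete, here is a minimal R sketch (not part of the package; all names are illustrative) that evaluates the response probability of a single item:
# NOHARM item response function for one item:
# ci=fixed guessing parameter, fi=c(f_i0, f_i1, ..., f_iD), theta=ability vector
noharm_irf <- function( ci, fi, theta ){
    ci + ( 1 - ci ) * stats::pnorm( fi[1] + sum( fi[-1] * theta ) )
}
# a two-dimensional item with guessing parameter .2
noharm_irf( ci=.2, fi=c(-.4, 1.1, .3), theta=c(.5, -1) )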
This function needs the 64-bit command line version of NOHARM 4, which can be downloaded from one of the following sites (some links may have broken in the meantime):
http://noharm.niagararesearch.ca/nh4cldl.html
https://noharm.software.informer.com/4.0/
https://cehs.unl.edu/edpsych/software-urls-and-other-interesting-sites/
Value
A list with following entries
tanaka |
Tanaka index |
rmsr |
RMSR statistic |
N.itempair |
Sample sizes of pairwise item observations |
pm |
Product moment matrix |
weights |
Used student weights |
guesses |
Fixed guessing parameters |
residuals |
Residual covariance matrix |
final.constants |
Vector of final constants |
thresholds |
Threshold parameters |
uniquenesses |
Item uniquenesses |
loadings.theta |
Matrix of loadings in theta parametrization (common factor parametrization) |
factor.cor |
Covariance matrix of factors |
difficulties |
Item difficulties (for unidimensional models) |
discriminations |
Item discriminations (for unidimensional models) |
loadings |
Loading matrix (latent trait parametrization) |
model.type |
Used model type |
Nobs |
Number of observations |
Nitems |
Number of items |
modtype |
Model type according to the NOHARM specification (see NOHARM manual) |
F.init |
Initial loading matrix |
F.pattern |
Pattern loading matrix |
P.init |
Initial covariance matrix |
P.pattern |
Pattern covariance matrix |
dat |
Original data frame |
systime |
System time |
noharm.path |
Used NOHARM directory |
digits.pm |
Number of digits in product moment matrix |
dec |
Used decimal symbol |
display.fit |
Number of digits for fit display |
dimensions |
Number of dimensions |
chisquare |
Value of the \chi^2 statistic |
Nestpars |
Number of estimated parameters |
df |
Degrees of freedom |
chisquare_df |
Ratio \chi^2 / df |
rmsea |
RMSEA statistic |
p.chisquare |
Significance (p value) of the \chi^2 statistic |
Note
Errors often occur due to a wrong dec specification.
References
Fraser, C., & McDonald, R. P. (1988). NOHARM: Least squares item factor analysis. Multivariate Behavioral Research, 23, 267-269. doi:10.1207/s15327906mbr2302_9
Fraser, C., & McDonald, R. P. (2012). NOHARM 4 Manual.
http://noharm.niagararesearch.ca/nh4man/nhman.html
Lee, J. J., & Lee, M. K. (2016). An overview of the normal ogive harmonic analysis robust method (NOHARM) approach to item response theory. Tutorials in Quantitative Methods for Psychology, 12(1), 1-8. doi:10.20982/tqmp.12.1.p001
McDonald, R. P. (1982a). Linear versus nonlinear models in item response theory. Applied Psychological Measurement, 6(4), 379-396. doi:10.1177/014662168200600402
McDonald, R. P. (1982b). Unidimensional and multidimensional models for item response theory. I.R.T., C.A.T. conference, Minneapolis, 1982, Proceedings.
McDonald, R. P. (1997). Normal-ogive multidimensional model. In W. van der Linden & R. K. Hambleton (Eds.), Handbook of modern item response theory (pp. 257-269). New York: Springer. doi:10.1007/978-1-4757-2691-6
See Also
For estimating standard errors see R2noharm.jackknife
.
For EAP person parameter estimates see R2noharm.EAP
.
For an R implementation of the NOHARM model see noharm.sirt
.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Data data.noharm18 with 18 items
#############################################################################
# load data
data(data.noharm18)
dat <- data.noharm18
I <- ncol(dat) # number of items
# locate noharm.path
noharm.path <- "c:/NOHARM"
#****************************************
# Model 1: 1-dimensional Rasch model (1-PL model)
# estimate one factor variance
P.pattern <- matrix( 1, ncol=1, nrow=1 )
P.init <- P.pattern
# fix all entries in the loading matrix to 1
F.pattern <- matrix( 0, nrow=I, ncol=1 )
F.init <- 1 + 0*F.pattern
# estimate model
mod <- sirt::R2noharm( dat=dat, model.type="CFA",
F.pattern=F.pattern, F.init=F.init, P.pattern=P.pattern,
P.init=P.init, writename="ex1__1dim_1pl",
noharm.path=noharm.path, dec="," )
# summary
summary(mod, logfile="ex1__1dim_1pl__SUMMARY")
# jackknife
jmod <- sirt::R2noharm.jackknife( mod, jackunits=20 )
summary(jmod, logfile="ex1__1dim_1pl__JACKKNIFE")
# compute factor scores (EAPs)
emod <- sirt::R2noharm.EAP(mod)
#*****-----
# Model 1b: Include student weights in estimation
N <- nrow(dat)
weights <- stats::runif( N, 1, 5 )
mod1b <- sirt::R2noharm( dat=dat, model.type="CFA", weights=weights,
F.pattern=F.pattern, F.init=F.init, P.pattern=P.pattern,
P.init=P.init, writename="ex1__1dim_1pl_w",
noharm.path=noharm.path, dec="," )
summary(mod1b)
#****************************************
# Model 2: 1-dimensional 2PL Model
# set trait variance equal to 1
P.pattern <- matrix( 0, ncol=1, nrow=1 )
P.init <- 1+0*P.pattern
# loading matrix
F.pattern <- matrix( 1, nrow=I, ncol=1 )
F.init <- 1 + 0*F.pattern
mod <- sirt::R2noharm( dat=dat, model.type="CFA",
F.pattern=F.pattern, F.init=F.init, P.pattern=P.pattern,
P.init=P.init, writename="ex2__1dim_2pl",
noharm.path=noharm.path, dec="," )
summary(mod)
jmod <- sirt::R2noharm.jackknife( mod, jackunits=20 )
summary(jmod)
#****************************************
# Model 3: 1-dimensional 3PL Model with fixed guessing parameters
# set trait variance equal to 1
P.pattern <- matrix( 0, ncol=1, nrow=1 )
P.init <- 1+0*P.pattern
# loading matrix
F.pattern <- matrix( 1, nrow=I, ncol=1 )
F.init <- 1 + 0*F.pattern
# fix guessing parameters equal to .1 (for all items)
guesses <- rep( .1, I )
mod <- sirt::R2noharm( dat=dat, model.type="CFA",
F.pattern=F.pattern, F.init=F.init, P.pattern=P.pattern,
P.init=P.init, guesses=guesses,
writename="ex3__1dim_3pl", noharm.path=noharm.path, dec="," )
summary(mod)
jmod <- sirt::R2noharm.jackknife( mod, jackunits=20 )
summary(jmod)
#****************************************
# Model 4: 3-dimensional Rasch model
# estimate the full covariance matrix of the three factors
P.pattern <- matrix( 1, ncol=3, nrow=3 )
P.init <- .8*P.pattern
diag(P.init) <- 1
# fix all entries in the loading matrix to 1
F.init <- F.pattern <- matrix( 0, nrow=I, ncol=3 )
F.init[1:6,1] <- 1
F.init[7:12,2] <- 1
F.init[13:18,3] <- 1
mod <- sirt::R2noharm( dat=dat, model.type="CFA",
F.pattern=F.pattern, F.init=F.init, P.pattern=P.pattern,
P.init=P.init, writename="ex4__3dim_1pl",
noharm.path=noharm.path, dec="," )
# write output from R console in a file
summary(mod, logfile="ex4__3dim_1pl__SUMMARY.Rout")
jmod <- sirt::R2noharm.jackknife( mod, jackunits=20 )
summary(jmod)
# extract factor scores
emod <- sirt::R2noharm.EAP(mod)
#****************************************
# Model 5: 3-dimensional 2PL model
# fix factor variances to 1 and estimate factor covariances
P.pattern <- matrix( 1, ncol=3, nrow=3 )
diag(P.pattern) <- 0
P.init <- .8*P.pattern
diag(P.init) <- 1
# estimate item loadings on the respective dimensions
F.pattern <- matrix( 0, nrow=I, ncol=3 )
F.pattern[1:6,1] <- 1
F.pattern[7:12,2] <- 1
F.pattern[13:18,3] <- 1
F.init <- F.pattern
mod <- sirt::R2noharm( dat=dat, model.type="CFA",
F.pattern=F.pattern, F.init=F.init, P.pattern=P.pattern,
P.init=P.init, writename="ex5__3dim_2pl",
noharm.path=noharm.path, dec="," )
summary(mod)
# use 50 jackknife units with 4 persons within a unit
jmod <- sirt::R2noharm.jackknife( mod, jackunits=rep( 1:50, each=4 ) )
summary(jmod)
#****************************************
# Model 6: Exploratory Factor Analysis with 3 factors
mod <- sirt::R2noharm( dat=dat, model.type="EFA", dimensions=3,
writename="ex6__3dim_efa", noharm.path=noharm.path,dec=",")
summary(mod)
jmod <- sirt::R2noharm.jackknife( mod, jackunits=20 )
#############################################################################
# EXAMPLE 2: NOHARM manual Example A
#############################################################################
# See NOHARM manual: http://noharm.niagararesearch.ca/nh4man/nhman.html
# The following text and data is copied from this manual.
#
# In the first example, we demonstrate how to prepare the input for a 2-dimensional
# model using exploratory analysis. Data from a 9 item test were collected from
# 200 students and the 9x9 product-moment matrix of the responses was computed.
#
# Our hypothesis is for a 2-dimensional model with no guessing,
# i.e., all guesses are equal to zero. However, because we are unsure of any
# particular pattern for matrix F, we wish to prescribe an exploratory analysis, i.e.,
# set EX=1. Also, we will content ourselves with letting the program supply all
# initial values.
#
# We would like both the sample product-moment matrix and the residual matrix to
# be included in the output.
# scan product-moment matrix copied from the NOHARM manual
pm <- scan()
0.8967
0.2278 0.2356
0.6857 0.2061 0.7459
0.8146 0.2310 0.6873 0.8905
0.4505 0.1147 0.3729 0.4443 0.5000
0.7860 0.2080 0.6542 0.7791 0.4624 0.8723
0.2614 0.0612 0.2140 0.2554 0.1914 0.2800 0.2907
0.7549 0.1878 0.6236 0.7465 0.4505 0.7590 0.2756 0.8442
0.6191 0.1588 0.5131 0.6116 0.3845 0.6302 0.2454 0.6129 0.6879
ex2 <- sirt::R2noharm( pm=pm, n=200, model.type="EFA", dimensions=2,
noharm.path=noharm.path, writename="ex2_noharmExA", dec=",")
summary(ex2)
#############################################################################
# EXAMPLE 3: NOHARM manual Example B
#############################################################################
# See NOHARM manual: http://noharm.niagararesearch.ca/nh4man/nhman.html
# The following text and data is copied from this manual.
# Suppose we have the product-moment matrix of data from 125 students on 9 items.
# Our hypothesis is for 2 dimensions with simple structure. In this case,
# items 1 to 5 have coefficients of theta which are to be estimated for one
# latent trait but are to be fixed at zero for the other one.
# For the latent trait for which items 1 to 5 have zero coefficients,
# items 6 to 9 have coefficients which are to be estimated. For the other
# latent trait, items 6 to 9 will have zero coefficients.
# We also wish to estimate the correlation between the latent traits,
# so we prescribe P as a 2x2 correlation matrix.
#
# Our hypothesis prescribes that there was no guessing involved, i.e.,
# all guesses are equal to zero. For demonstration purposes,
# let us not have the program print out the sample product-moment matrix.
# Also let us not supply any starting values but, rather, use the defaults
# supplied by the program.
pm <- scan()
0.930
0.762 0.797
0.541 0.496 0.560
0.352 0.321 0.261 0.366
0.205 0.181 0.149 0.110 0.214
0.858 0.747 0.521 0.336 0.203 0.918
0.773 0.667 0.465 0.308 0.184 0.775 0.820
0.547 0.474 0.347 0.233 0.132 0.563 0.524 0.579
0.329 0.290 0.190 0.140 0.087 0.333 0.308 0.252 0.348
I <- 9 # number of items
# define loading matrix
F.pattern <- matrix(0,I,2)
F.pattern[1:5,1] <- 1
F.pattern[6:9,2] <- 1
F.init <- F.pattern
# define covariance matrix
P.pattern <- matrix(1,2,2)
diag(P.pattern) <- 0
P.init <- .5*P.pattern
diag(P.init) <- 1
ex3 <- sirt::R2noharm( pm=pm, n=125, model.type="CFA",
F.pattern=F.pattern, F.init=F.init, P.pattern=P.pattern,
P.init=P.init, writename="ex3_noharmExB",
noharm.path=noharm.path, dec="," )
summary(ex3)
#############################################################################
# EXAMPLE 4: NOHARM manual Example C
#############################################################################
data(data.noharmExC)
# See NOHARM manual: http://noharm.niagararesearch.ca/nh4man/nhman.html
# The following text and data is copied from this manual.
# In this example, suppose that from 300 respondents we have item
# responses scored dichotomously, 1 or 0, for 8 items.
#
# Our hypothesis is for a unidimensional model where all eight items
# have coefficients of theta which are to be estimated.
# Suppose that since the items were multiple choice with 5 options each,
# we set the fixed guesses all to 0.2 (not necessarily good reasoning!)
#
# Let's supply initial values for the coefficients of theta (F matrix)
# as .75 for items 1 to 4 and .6 for items 5 to 8.
I <- 8
guesses <- rep(.2,I)
F.pattern <- matrix(1,I,1)
F.init <- F.pattern
F.init[1:4,1] <- .75
F.init[5:8,1] <- .6
P.pattern <- matrix(0,1,1)
P.init <- 1 + 0 * P.pattern
ex4 <- sirt::R2noharm( dat=data.noharmExC, model.type="CFA",
guesses=guesses, F.pattern=F.pattern, F.init=F.init,
P.pattern=P.pattern, P.init=P.init, writename="ex3_noharmExC",
noharm.path=noharm.path, dec="," )
summary(ex4)
# modify F pattern matrix
# f11=f51 (since both have equal pattern values of 2),
# f21=f61 (since both have equal pattern values of 3),
# f31=f71 (since both have equal pattern values of 4),
# f41=f81 (since both have equal pattern values of 5).
F.pattern[ c(1,5) ] <- 2
F.pattern[ c(2,6) ] <- 3
F.pattern[ c(3,7) ] <- 4
F.pattern[ c(4,8) ] <- 5
F.init <- .5+0*F.init
ex4a <- sirt::R2noharm( dat=data.noharmExC, model.type="CFA",
guesses=guesses, F.pattern=F.pattern, F.init=F.init,
P.pattern=P.pattern, P.init=P.init, writename="ex3_noharmExC1",
noharm.path=noharm.path, dec="," )
summary(ex4a)
## End(Not run)
EAP Factor Score Estimation
Description
This function performs EAP factor score estimation of an item response model estimated with NOHARM.
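Computationally, EAP scoring reduces to a discrete posterior mean over the theta grid; a minimal sketch with a hypothetical individual likelihood (all numbers illustrative):
theta.k <- seq(-6, 6, len=21)
prior <- stats::dnorm(theta.k)                   # normal prior on the grid
like <- stats::dnorm( theta.k, mean=1, sd=.7 )   # hypothetical likelihood
post <- prior*like / sum(prior*like)             # discrete posterior
EAP <- sum( theta.k*post )                       # posterior mean
SE.EAP <- sqrt( sum( theta.k^2*post ) - EAP^2 )  # posterior standard deviation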
Usage
R2noharm.EAP(noharmobj, theta.k=seq(-6, 6, len=21), print.output=TRUE)
Arguments
noharmobj |
Object of class R2noharm or noharm.sirt |
theta.k |
Vector of discretized theta values on which the posterior is evaluated. This vector applies to all dimensions. |
print.output |
An optional logical indicating whether output should be displayed at the console |
Value
A list with following entries
person |
Data frame of person parameter EAP estimates and their corresponding standard errors |
theta |
Grid of multidimensional theta values where the posterior is evaluated. |
posterior |
Individual posterior distribution evaluated at the theta grid |
like |
Individual likelihood |
EAP.rel |
EAP reliabilities of all dimensions |
probs |
Item response probabilities evaluated at the theta grid |
See Also
For examples see R2noharm
and noharm.sirt
.
Jackknife Estimation of NOHARM Analysis
Description
This function performs a jackknife estimation of NOHARM analysis to get standard errors based on a replication method (see Christoffersson, 1977).
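The standard errors follow the usual jackknife formula; a minimal sketch under that assumption (the estimates are illustrative):
# jackknife standard error from J leave-one-unit-out estimates
jack_se <- function(est){
    J <- length(est)
    sqrt( (J-1)/J * sum( ( est - mean(est) )^2 ) )
}
jack_se( c(.52, .48, .55, .50) )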
Usage
R2noharm.jackknife(object, jackunits=NULL)
## S3 method for class 'R2noharm.jackknife'
summary(object, logfile=NULL, ...)
Arguments
object |
Object of class R2noharm (for the summary method, of class R2noharm.jackknife) |
jackunits |
A vector of integers or a number. If it is a number, then it refers to the number of jackknife units. If it is a vector of integers, then this vector defines the allocation of persons to jackknife units; integers correspond to row indexes in the data set. |
logfile |
File name if the summary should be sunk into a file |
... |
Further arguments to be passed |
Value
A list of lists with following entries:
partable |
Data frame with parameters |
se.pars |
List of estimated standard errors for all parameter estimates |
jackknife.pars |
List of jackknife estimates for all parameters |
u.jacknunits |
Unique jackknife elements |
References
Christoffersson, A. (1977). Two-step weighted least squares factor analysis of dichotomized variables. Psychometrika, 42, 433-438.
See Also
For examples see R2noharm.
Multidimensional IRT Copula Model
Description
This function handles local dependence by specifying copulas for residuals in multidimensional item response models for dichotomous item responses (Braeken, 2011; Braeken, Tuerlinckx & De Boeck, 2007; Schroeders, Robitzsch & Schipolowski, 2014). Estimation is allowed for item difficulties, item slopes and a generalized logistic link function (Stukel, 1988).
The function rasch.copula3
allows the estimation of multidimensional
models while rasch.copula2
only handles unidimensional models.
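The default boundary mixture copula weights the joint distribution under local independence against the one under maximal dependence by the dependency parameter delta; a minimal numeric sketch for two dichotomous items (values illustrative; compare the mirt implementation in Example 5):
P1 <- .7 ; P2 <- .6      # marginal success probabilities of two items
delta <- .4              # dependency parameter
F00.ind <- (1-P1)*(1-P2)                        # P(X1=0, X2=0) under independence
F00.dep <- min( 1-P1, 1-P2 )                    # under maximal dependence
F00.mix <- (1-delta)*F00.ind + delta*F00.dep    # boundary mixture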
Usage
rasch.copula2(dat, itemcluster, weights=NULL, copula.type="bound.mixt",
progress=TRUE, mmliter=1000, delta=NULL,
theta.k=seq(-4, 4, len=21), alpha1=0, alpha2=0,
numdiff.parm=1e-06, est.b=seq(1, ncol(dat)),
est.a=rep(1, ncol(dat)), est.delta=NULL, b.init=NULL, a.init=NULL,
est.alpha=FALSE, glob.conv=0.0001, alpha.conv=1e-04, conv1=0.001,
dev.crit=.2, increment.factor=1.01)
rasch.copula3(dat, itemcluster, dims=NULL, copula.type="bound.mixt",
progress=TRUE, mmliter=1000, delta=NULL,
theta.k=seq(-4, 4, len=21), alpha1=0, alpha2=0,
numdiff.parm=1e-06, est.b=seq(1, ncol(dat)),
est.a=rep(1, ncol(dat)), est.delta=NULL, b.init=NULL, a.init=NULL,
est.alpha=FALSE, glob.conv=0.0001, alpha.conv=1e-04, conv1=0.001,
dev.crit=.2, rho.init=.5, increment.factor=1.01)
## S3 method for class 'rasch.copula2'
summary(object, file=NULL, digits=3, ...)
## S3 method for class 'rasch.copula3'
summary(object, file=NULL, digits=3, ...)
## S3 method for class 'rasch.copula2'
anova(object,...)
## S3 method for class 'rasch.copula3'
anova(object,...)
## S3 method for class 'rasch.copula2'
logLik(object,...)
## S3 method for class 'rasch.copula3'
logLik(object,...)
## S3 method for class 'rasch.copula2'
IRT.likelihood(object,...)
## S3 method for class 'rasch.copula3'
IRT.likelihood(object,...)
## S3 method for class 'rasch.copula2'
IRT.posterior(object,...)
## S3 method for class 'rasch.copula3'
IRT.posterior(object,...)
Arguments
dat |
An N x I data frame of dichotomous item responses |
itemcluster |
An integer vector of length I which allocates each item to an item cluster; items with entry 0 do not belong to any cluster and are treated as locally independent |
weights |
Optional vector of sampling weights |
dims |
A vector indicating to which dimension an item is allocated. The default is that all items load on the first dimension. |
copula.type |
A character or a vector containing one of the following copula types: "bound.mixt" (boundary mixture, the default) or "cook.johnson" (Cook-Johnson copula); see the Examples. |
progress |
Print progress? Default is TRUE. |
mmliter |
Maximum number of iterations. |
delta |
An optional vector of starting values for the dependency parameters \delta |
theta.k |
Discretized trait distribution |
alpha1 |
Parameter \alpha_1 of the generalized logistic link function (Stukel, 1988) |
alpha2 |
Parameter \alpha_2 of the generalized logistic link function (Stukel, 1988) |
numdiff.parm |
Parameter for numerical differentiation |
est.b |
Integer vector of item difficulties to be estimated |
est.a |
Integer vector of item discriminations to be estimated |
est.delta |
Integer vector of length equal to the number of item clusters which indicates the \delta parameters to be estimated (an entry of 0 means fixed) |
b.init |
Initial values for item difficulties b |
a.init |
Initial values for item slopes a |
est.alpha |
Should both alpha parameters be estimated? Default is FALSE. |
glob.conv |
Convergence criterion for all parameters |
alpha.conv |
Maximal change in alpha parameters for convergence |
conv1 |
Maximal change in item parameters for convergence |
dev.crit |
Maximal change in the deviance. Default is .2. |
rho.init |
Initial value for off-diagonal elements in correlation matrix |
increment.factor |
A numeric value larger than one which controls the size of increments in iterations. To stabilize convergence, choose values 1.05 or 1.1 in some situations. |
object |
Object of class rasch.copula2 or rasch.copula3 |
file |
Optional file name for the summary output |
digits |
Number of digits after the decimal separator in the summary output |
... |
Further arguments to be passed |
Value
A list with following entries
N.itemclusters |
Number of item clusters |
item |
Estimated item parameters |
iter |
Number of iterations |
dev |
Deviance |
delta |
Estimated dependency parameters |
b |
Estimated item difficulties |
a |
Estimated item slopes |
mu |
Mean of the trait distribution |
sigma |
Standard deviation of the trait distribution |
alpha1 |
Estimated parameter \alpha_1 of the generalized logistic link function |
alpha2 |
Estimated parameter \alpha_2 of the generalized logistic link function |
ic |
Information criteria |
theta.k |
Discretized ability distribution |
pi.k |
Probabilities of the (fixed) discretized ability distribution |
deviance |
Deviance |
pattern |
Item response patterns with frequencies and posterior distribution |
person |
Data frame with person parameters |
datalist |
List of generated data frames during estimation |
EAP.rel |
Reliability of the EAP |
copula.type |
Type of copula |
summary.delta |
Summary of the estimated \delta parameters |
f.qk.yi |
Individual posterior |
f.yi.qk |
Individual likelihood |
... |
Further values |
References
Braeken, J. (2011). A boundary mixture approach to violations of conditional independence. Psychometrika, 76(1), 57-76. doi:10.1007/s11336-010-9190-4
Braeken, J., Kuppens, P., De Boeck, P., & Tuerlinckx, F. (2013). Contextualized personality questionnaires: A case for copulas in structural equation models for categorical data. Multivariate Behavioral Research, 48(6), 845-870. doi:10.1080/00273171.2013.827965
Braeken, J., & Tuerlinckx, F. (2009). Investigating latent constructs with item response models: A MATLAB IRTm toolbox. Behavior Research Methods, 41(4), 1127-1137.
Braeken, J., Tuerlinckx, F., & De Boeck, P. (2007). Copula functions for residual dependency. Psychometrika, 72(3), 393-411. doi:10.1007/s11336-007-9005-4
Schroeders, U., Robitzsch, A., & Schipolowski, S. (2014). A comparison of different psychometric approaches to modeling testlet structures: An example with C-tests. Journal of Educational Measurement, 51(4), 400-418. doi:10.1111/jedm.12054
Stukel, T. A. (1988). Generalized logistic models. Journal of the American Statistical Association, 83(402), 426-431. doi:10.1080/01621459.1988.10478613
See Also
For a summary see summary.rasch.copula2
.
For simulating locally dependent item responses see sim.rasch.dep
.
Person parameters estimates are obtained by person.parameter.rasch.copula
.
See rasch.mml2
for the generalized logistic link function.
See also Braeken and Tuerlinckx (2009) for alternative (and more expanded) copula models implemented in the MATLAB software. See https://ppw.kuleuven.be/okp/software/irtm/.
See Braeken, Kuppens, De Boeck and Tuerlinckx (2013) for an extension of the copula modeling approach to polytomous data.
Examples
#############################################################################
# EXAMPLE 1: Reading Data
#############################################################################
data(data.read)
dat <- data.read
# define item clusters
itemcluster <- rep( 1:3, each=4 )
# estimate Copula model
mod1 <- sirt::rasch.copula2( dat=dat, itemcluster=itemcluster)
## Not run:
# estimate Rasch model
mod2 <- sirt::rasch.copula2( dat=dat, itemcluster=itemcluster,
delta=rep(0,3), est.delta=rep(0,3) )
summary(mod1)
summary(mod2)
# estimate copula 2PL model
I <- ncol(dat)
mod3 <- sirt::rasch.copula2( dat=dat, itemcluster=itemcluster, est.a=1:I,
increment.factor=1.05)
summary(mod3)
#############################################################################
# EXAMPLE 2: 11 items nested within 2 item clusters (testlets)
#            with 2 and 3 dependent items, respectively, and 6 independent items
#############################################################################
set.seed(5698)
I <- 11 # number of items
n <- 3000 # number of persons
b <- seq(-2,2, len=I) # item difficulties
theta <- stats::rnorm( n, sd=1 ) # person abilities
# define item clusters
itemcluster <- rep(0,I)
itemcluster[ c(3,5 )] <- 1
itemcluster[c(2,4,9)] <- 2
# residual correlations
rho <- c( .7, .5 )
# simulate data
dat <- sirt::sim.rasch.dep( theta, b, itemcluster, rho )
colnames(dat) <- paste("I", seq(1,ncol(dat)), sep="")
# estimate Rasch copula model
mod1 <- sirt::rasch.copula2( dat, itemcluster=itemcluster )
summary(mod1)
# both item clusters have Cook-Johnson copula as dependency
mod1c <- sirt::rasch.copula2( dat, itemcluster=itemcluster,
copula.type="cook.johnson")
summary(mod1c)
# first item boundary mixture and second item Cook-Johnson copula
mod1d <- sirt::rasch.copula2( dat, itemcluster=itemcluster,
copula.type=c( "bound.mixt", "cook.johnson" ) )
summary(mod1d)
# compare result with Rasch model estimation in rasch.copula2
# delta must be set to zero
mod2 <- sirt::rasch.copula2( dat, itemcluster=itemcluster, delta=c(0,0),
est.delta=c(0,0) )
summary(mod2)
#############################################################################
# EXAMPLE 3: 12 items nested within 3 item clusters (testlets)
# Cluster 1 -> Items 1-4; Cluster 2 -> Items 6-9; Cluster 3 -> Items 10-12
#############################################################################
set.seed(967)
I <- 12 # number of items
n <- 450 # number of persons
b <- seq(-2,2, len=I) # item difficulties
b <- sample(b) # sample item difficulties
theta <- stats::rnorm( n, sd=1 ) # person abilities
# itemcluster
itemcluster <- rep(0,I)
itemcluster[ 1:4 ] <- 1
itemcluster[ 6:9 ] <- 2
itemcluster[ 10:12 ] <- 3
# residual correlations
rho <- c( .35, .25, .30 )
# simulate data
dat <- sirt::sim.rasch.dep( theta, b, itemcluster, rho )
colnames(dat) <- paste("I", seq(1,ncol(dat)), sep="")
# estimate Rasch copula model
mod1 <- sirt::rasch.copula2( dat, itemcluster=itemcluster )
summary(mod1)
# person parameter estimation assuming the Rasch copula model
pmod1 <- sirt::person.parameter.rasch.copula(raschcopula.object=mod1 )
# Rasch model estimation
mod2 <- sirt::rasch.copula2( dat, itemcluster=itemcluster,
delta=rep(0,3), est.delta=rep(0,3) )
summary(mod1)
summary(mod2)
#############################################################################
# EXAMPLE 4: Two-dimensional copula model
#############################################################################
set.seed(5698)
I <- 9
n <- 1500 # number of persons
b <- seq(-2,2, len=I) # item difficulties
theta0 <- stats::rnorm( n, sd=sqrt( .6 ) )
#*** Dimension 1
theta <- theta0 + stats::rnorm( n, sd=sqrt( .4 ) ) # person abilities
# itemcluster
itemcluster <- rep(0,I)
itemcluster[ c(3,5 )] <- 1
itemcluster[c(2,4,9)] <- 2
itemcluster1 <- itemcluster
# residual correlations
rho <- c( .7, .5 )
# simulate data
dat <- sirt::sim.rasch.dep( theta, b, itemcluster, rho )
colnames(dat) <- paste("A", seq(1,ncol(dat)), sep="")
dat1 <- dat
# estimate model of dimension 1
mod0a <- sirt::rasch.copula2( dat1, itemcluster=itemcluster1)
summary(mod0a)
#*** Dimension 2
theta <- theta0 + stats::rnorm( n, sd=sqrt( .8 ) ) # person abilities
# itemcluster
itemcluster <- rep(0,I)
itemcluster[ c(3,7,8 )] <- 1
itemcluster[c(4,6)] <- 2
itemcluster2 <- itemcluster
# residual correlations
rho <- c( .2, .4 )
# simulate data
dat <- sirt::sim.rasch.dep( theta, b, itemcluster, rho )
colnames(dat) <- paste("B", seq(1,ncol(dat)), sep="")
dat2 <- dat
# estimate model of dimension 2
mod0b <- sirt::rasch.copula2( dat2, itemcluster=itemcluster2)
summary(mod0b)
# both dimensions
dat <- cbind( dat1, dat2 )
itemcluster2 <- ifelse( itemcluster2 > 0, itemcluster2 + 2, 0 )
itemcluster <- c( itemcluster1, itemcluster2 )
dims <- rep( 1:2, each=I)
# estimate two-dimensional copula model
mod1 <- sirt::rasch.copula3( dat, itemcluster=itemcluster, dims=dims, est.a=dims,
theta.k=seq(-5,5,len=15) )
summary(mod1)
#############################################################################
# EXAMPLE 5: Subset of data Example 2
#############################################################################
set.seed(5698)
I <- 11 # number of items
n <- 3000 # number of persons
b <- seq(-2,2, len=I) # item difficulties
theta <- stats::rnorm( n, sd=1.3 ) # person abilities
# define item clusters
itemcluster <- rep(0,I)
itemcluster[ c(3,5)] <- 1
itemcluster[c(2,4,9)] <- 2
# residual correlations
rho <- c( .7, .5 )
# simulate data
dat <- sirt::sim.rasch.dep( theta, b, itemcluster, rho )
colnames(dat) <- paste("I", seq(1,ncol(dat)), sep="")
# select subdataset with only one dependent item cluster
item.sel <- scan( what="character", nlines=1 )
I1 I6 I7 I8 I10 I11 I3 I5
dat1 <- dat[,item.sel]
#******************
#*** Model 1a: estimate Copula model in sirt
itemcluster <- rep(0,8)
itemcluster[c(7,8)] <- 1
mod1a <- sirt::rasch.copula2( dat1, itemcluster=itemcluster )
summary(mod1a)
#******************
#*** Model 1b: estimate Copula model in mirt
library(mirt)
#*** redefine dataset for estimation in mirt
dat2 <- dat1[, itemcluster==0 ]
dat2 <- as.data.frame(dat2)
# combine items 3 and 5
dat2$C35 <- dat1[,"I3"] + 2*dat1[,"I5"]
table( dat2$C35, paste0( dat1[,"I3"],dat1[,"I5"]) )
#* define mirt model
mirtmodel <- mirt::mirt.model("
F=1-7
CONSTRAIN=(1-7,a1)
" )
#-- Copula function with two dependent items
# define item category function for pseudo-items like C35
P.copula2 <- function(par,Theta, ncat){
b1 <- par[1]
b2 <- par[2]
a1 <- par[3]
ldelta <- par[4]
P1 <- stats::plogis( a1*(Theta - b1 ) )
P2 <- stats::plogis( a1*(Theta - b2 ) )
Q1 <- 1-P1
Q2 <- 1-P2
# define vector-wise minimum function
minf2 <- function( x1, x2 ){
ifelse( x1 < x2, x1, x2 )
}
# distribution under independence
F00 <- Q1*Q2
F10 <- Q1*Q2 + P1*Q2
F01 <- Q1*Q2 + Q1*P2
F11 <- 1+0*Q1
F_ind <- c(F00,F10,F01,F11)
# distribution under maximal dependence
F00 <- minf2(Q1,Q2)
F10 <- Q2 #=minf2(1,Q2)
F01 <- Q1 #=minf2(Q1,1)
F11 <- 1+0*Q1 #=minf2(1,1)
F_dep <- c(F00,F10,F01,F11)
# compute mixture distribution
delta <- stats::plogis(ldelta)
F_tot <- (1-delta)*F_ind + delta * F_dep
# recalculate probabilities of mixture distribution
L1 <- length(Q1)
v1 <- 1:L1
F00 <- F_tot[v1]
F10 <- F_tot[v1+L1]
F01 <- F_tot[v1+2*L1]
F11 <- F_tot[v1+3*L1]
P00 <- F00
P10 <- F10 - F00
P01 <- F01 - F00
P11 <- 1 - F10 - F01 + F00
prob_tot <- c( P00, P10, P01, P11 )
return(prob_tot)
}
# create item
copula2 <- mirt::createItem(name="copula2", par=c(b1=0, b2=0.2, a1=1, ldelta=0),
est=c(TRUE,TRUE,TRUE,TRUE), P=P.copula2,
lbound=c(-Inf,-Inf,0,-Inf), ubound=c(Inf,Inf,Inf,Inf) )
# define item types
itemtype <- c( rep("2PL",6), "copula2" )
customItems <- list("copula2"=copula2)
# parameter table
mod.pars <- mirt::mirt(dat2, 1, itemtype=itemtype,
customItems=customItems, pars='values')
# estimate model
mod1b <- mirt::mirt(dat2, mirtmodel, itemtype=itemtype, customItems=customItems,
verbose=TRUE, pars=mod.pars,
technical=list(customTheta=as.matrix(seq(-4,4,len=21)) ) )
# estimated coefficients
cmod <- sirt::mirt.wrapper.coef(mod1b)$coef
# compare common item discrimination
round( c("sirt"=mod1a$item$a[1], "mirt"=cmod$a1[1] ), 4 )
## sirt mirt
## 1.2845 1.2862
# compare delta parameter
round( c("sirt"=mod1a$item$delta[7], "mirt"=stats::plogis( cmod$ldelta[7] ) ), 4 )
## sirt mirt
## 0.6298 0.6297
# compare thresholds a*b
dfr <- cbind( "sirt"=mod1a$item$thresh,
"mirt"=c(- cmod$d[-7],cmod$b1[7]*cmod$a1[1], cmod$b2[7]*cmod$a1[1]))
round(dfr,4)
## sirt mirt
## [1,] -1.9236 -1.9231
## [2,] -0.0565 -0.0562
## [3,] 0.3993 0.3996
## [4,] 0.8058 0.8061
## [5,] 1.5293 1.5295
## [6,] 1.9569 1.9572
## [7,] -1.1414 -1.1404
## [8,] -0.4005 -0.3996
## End(Not run)
Estimation of the Partial Credit Model using the Eigenvector Method
Description
This function performs the eigenvector approach to estimate item parameters, which is based on a pairwise comparison approach (Garner & Engelhard, 2002). No assumptions about person parameters are required for item parameter estimation. Statistical inference is performed by jackknifing. If a group identifier is provided, tests for differential item functioning are performed.
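A toy variant of the eigenvector idea (not the exact algorithm implemented here, which additionally uses the power powB and the adjustment adj_eps): build the matrix of pairwise win ratios and read centered item difficulties off its first eigenvector.
# B[i,j]=number of persons solving item i but failing item j;
# D[i,j]=B[j,i]/B[i,j] approximates exp( b_i - b_j )
evm_toy <- function( B, eps=.3 ){
    D <- ( t(B) + eps ) / ( B + eps )   # eps avoids division by zero
    v <- abs( eigen(D)$vectors[,1] )    # first (Perron) eigenvector
    b <- log(v)
    b - mean(b)                         # centered item difficulties
}
# applied to the paired comparison matrix of Example 2
B <- matrix( c(0,3,4,5, 1,0,3,3, 1,2,0,2, 1,1,1,0), 4, 4, byrow=TRUE )
evm_toy(B)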
Usage
rasch.evm.pcm(dat, jackunits=20, weights=NULL, pid=NULL,
group=NULL, powB=2, adj_eps=0.3, progress=TRUE )
## S3 method for class 'rasch.evm.pcm'
summary(object, digits=3, file=NULL, ...)
## S3 method for class 'rasch.evm.pcm'
coef(object,...)
## S3 method for class 'rasch.evm.pcm'
vcov(object,...)
Arguments
dat |
Data frame with dichotomous or polytomous item responses |
jackunits |
A number of Jackknife units (if an integer is provided as the argument value) or a vector in which the Jackknife units are already defined. |
weights |
Optional vector of sample weights |
pid |
Optional vector of person identifiers |
group |
Optional vector of group identifiers. In this case, item parameters are estimated separately for each group and tests for differential item functioning are performed. |
powB |
Power used in the computation of the paired comparison matrix B |
adj_eps |
Adjustment parameter for person parameter estimation |
progress |
An optional logical indicating whether progress should be displayed |
object |
Object of class |
digits |
Number of digits after decimals for rounding in |
file |
Optional file name if |
... |
Further arguments to be passed |
Value
A list with following entries
item |
Data frame with item parameters. The item parameter estimate is denoted by est, the jackknife estimate by est_jack. |
b |
Item threshold parameters |
person |
Data frame with person parameters obtained (MLE) |
B |
Paired comparison matrix |
D |
Transformed paired comparison matrix |
coef |
Vector of estimated coefficients |
vcov |
Covariance matrix of estimated item parameters |
JJ |
Number of jackknife units |
JJadj |
Reduced number of jackknife units |
powB |
Used power of comparison matrix |
maxK |
Maximum number of categories per item |
G |
Number of groups |
desc |
Some descriptives |
difstats |
Statistics for differential item functioning if a group identifier is provided |
References
Choppin, B. (1985). A fully conditional estimation procedure for Rasch Model parameters. Evaluation in Education, 9, 29-42.
Garner, M., & Engelhard, G. J. (2002). An eigenvector method for estimating item parameters of the dichotomous and polytomous Rasch models. Journal of Applied Measurement, 3, 107-128.
Wang, J., & Engelhard, G. (2014). A pairwise algorithm in R for rater-mediated assessments. Rasch Measurement Transactions, 28(1), 1457-1459.
See Also
See the pairwise package for the alternative row averaging approach of Choppin (1985) and Wang and Engelhard (2014) for an alternative R implementation.
Examples
#############################################################################
# EXAMPLE 1: Dataset Liking for Science
#############################################################################
data(data.liking.science)
dat <- data.liking.science
# estimate partial credit model using 10 Jackknife units
mod1 <- sirt::rasch.evm.pcm( dat, jackunits=10 )
summary(mod1)
## Not run:
# compare results with TAM
library(TAM)
mod2 <- TAM::tam.mml( dat )
r1 <- mod2$xsi$xsi
r1 <- r1 - mean(r1)
# item parameters are similar
dfr <- data.frame( "b_TAM"=r1, mod1$item[,c( "est","est_jack") ] )
round( dfr, 3 )
## b_TAM est est_jack
## 1 -2.496 -2.599 -2.511
## 2 0.687 0.824 1.030
## 3 -0.871 -0.975 -0.943
## 4 -0.360 -0.320 -0.131
## 5 -0.833 -0.970 -0.856
## 6 1.298 1.617 1.444
## 7 0.476 0.465 0.646
## 8 2.808 3.194 3.439
## 9 1.611 1.460 1.433
## 10 2.396 1.230 1.095
## [...]
# partial credit model in eRm package
miceadds::library_install("eRm")
mod3 <- eRm::PCM(X=dat)
summary(mod3)
eRm::plotINFO(mod3) # plot item and test information
eRm::plotICC(mod3) # plot ICCs
eRm::plotPImap(mod3) # plot person-item maps
#############################################################################
# EXAMPLE 2: Garner and Engelhard (2002) toy example dichotomous data
#############################################################################
dat <- scan()
1 0 1 1 1 1 0 0 1 0 0 0 0 1 1 1 1 1 1 0
1 1 0 1 1 1 1 1 1 0 1 0 1 1 1 1 1 1 0 0
dat <- matrix( dat, 10, 4, byrow=TRUE)
colnames(dat) <- paste0("I", 1:4 )
# estimate Rasch model with no jackknifing
mod1 <- sirt::rasch.evm.pcm( dat, jackunits=0 )
# paired comparison matrix
mod1$B
## I1_Cat1 I2_Cat1 I3_Cat1 I4_Cat1
## I1_Cat1 0 3 4 5
## I2_Cat1 1 0 3 3
## I3_Cat1 1 2 0 2
## I4_Cat1 1 1 1 0
#############################################################################
# EXAMPLE 3: Garner and Engelhard (2002) toy example polytomous data
#############################################################################
dat <- scan()
2 2 1 1 1 2 1 2 0 0 1 0 0 0 0 0 1 1 2 0 1 2 2 1 1
2 2 0 2 1 2 2 1 1 0 1 0 1 0 0 2 1 2 2 2 2 1 0 0 1
dat <- matrix( dat, 10, 5, byrow=TRUE)
colnames(dat) <- paste0("I", 1:5 )
# estimate partial credit model with no jackknifing
mod1 <- sirt::rasch.evm.pcm( dat, jackunits=0, powB=3 )
# paired comparison matrix
mod1$B
## I1_Cat1 I1_Cat2 I2_Cat1 I2_Cat2 I3_Cat1 I3_Cat2 I4_Cat1 I4_Cat2 I5_Cat1 I5_Cat2
## I1_Cat1 0 0 2 0 1 1 2 1 2 1
## I1_Cat2 0 0 0 3 2 2 2 2 2 3
## I2_Cat1 1 0 0 0 1 1 2 0 2 1
## I2_Cat2 0 1 0 0 1 2 0 3 1 3
## I3_Cat1 1 1 1 1 0 0 1 2 3 1
## I3_Cat2 0 1 0 2 0 0 1 1 1 1
## I4_Cat1 0 1 0 0 0 2 0 0 1 2
## I4_Cat2 1 0 0 2 1 1 0 0 1 1
## I5_Cat1 0 1 0 1 2 1 1 2 0 0
## I5_Cat2 0 0 0 1 0 0 0 0 0 0
#############################################################################
# EXAMPLE 4: Partial credit model for dataset data.mg from CDM package
#############################################################################
library(CDM)
data(data.mg,package="CDM")
dat <- data.mg[, paste0("I",1:11) ]
#*** Model 1: estimate partial credit model
mod1 <- sirt::rasch.evm.pcm( dat )
# item parameters
round( mod1$b, 3 )
## Cat1 Cat2 Cat3
## I1 -1.537 NA NA
## I2 -2.360 NA NA
## I3 -0.574 NA NA
## I4 -0.971 -2.086 NA
## I5 -0.104 0.201 NA
## I6 0.470 0.806 NA
## I7 -1.027 0.756 1.969
## I8 0.897 NA NA
## I9 0.766 NA NA
## I10 0.069 NA NA
## I11 -1.122 1.159 2.689
#*** Model 2: estimate PCM with pairwise package
miceadds::library_install("pairwise")
mod2 <- pairwise::pair(daten=dat)
summary(mod2)
plot(mod2)
# compute standard errors
semod2 <- pairwise::pairSE(daten=dat, nsample=20)
semod2
#############################################################################
# EXAMPLE 5: Differential item functioning for dataset data.mg
#############################################################################
library(CDM)
data(data.mg,package="CDM")
dat <- data.mg[ data.mg$group %in% c(2,3,11), ]
# define items
items <- paste0("I",1:11)
# estimate model
mod1 <- sirt::rasch.evm.pcm( dat[,items], weights=dat$weight, group=dat$group )
summary(mod1)
#############################################################################
# EXAMPLE 6: Differential item functioning for Rasch model
#############################################################################
# simulate some data
set.seed(9776)
N <- 1000 # number of persons
I <- 10 # number of items
# simulate data for first group
b <- seq(-1.5,1.5,len=I)
dat1 <- sirt::sim.raschtype( stats::rnorm(N), b )
# simulate data for second group
b1 <- b
b1[4] <- b1[4] + .5 # introduce DIF for fourth item
dat2 <- sirt::sim.raschtype( stats::rnorm(N,mean=.3), b1 )
dat <- rbind(dat1, dat2 )
group <- rep( 1:2, each=N )
# estimate model
mod1 <- sirt::rasch.evm.pcm( dat, group=group )
summary(mod1)
## End(Not run)
Joint Maximum Likelihood (JML) Estimation of the Rasch Model
Description
This function estimates the Rasch model using joint maximum likelihood estimation (Linacre, 1994). The PROX algorithm (Linacre, 1994) is used for the generation of starting values of item parameters.
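A minimal sketch of the PROX idea for item starting values (normal approximation; the assumed person variance of 1 is for illustration only):
# PROX: expand observed item logits by sqrt( 1 + var(theta)/2.9 )
p.i <- c(.80, .65, .50, .35, .20)    # illustrative proportions correct
theta.var <- 1                       # assumed person variance
b <- sqrt( 1 + theta.var/2.9 )*log( (1-p.i)/p.i )
b <- b - mean(b)                     # centered item difficulty starting values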
Usage
rasch.jml(dat, method="MLE", b.init=NULL, constraints=NULL, weights=NULL,
center="persons", glob.conv=10^(-6), conv1=1e-05, conv2=0.001, progress=TRUE,
bsteps=4, thetasteps=2, wle.adj=0, jmliter=100, prox=TRUE,
proxiter=30, proxconv=0.01, dp=NULL, theta.init=NULL, calc.fit=TRUE,
prior_sd=NULL)
## S3 method for class 'rasch.jml'
summary(object, digits=3, ...)
Arguments
dat |
An N x I data frame of dichotomous item responses |
method |
Method for estimating person parameters during JML iterations. Can be "MLE" (the default) or "WLE" (Warm, 1989). |
b.init |
Initial values of item difficulties |
constraints |
Optional matrix or data frame with two columns. The first column contains integer item indexes or item names, the second column the fixed item difficulties. |
weights |
Person sample weights. Default is NULL (equal weights for all persons) |
center |
Character indicating whether persons ("persons") or items ("items") should be centered for identification |
glob.conv |
Global convergence criterion with respect to the log-likelihood function |
conv1 |
Convergence criterion for estimation of item parameters |
conv2 |
Convergence criterion for estimation of person parameters |
progress |
Display progress? Default is TRUE. |
bsteps |
Number of steps for b parameter estimation |
thetasteps |
Number of steps for theta parameter estimation |
wle.adj |
Score adjustment for WLE estimation |
jmliter |
Number of maximal iterations during JML estimation |
prox |
Should the PROX algorithm (see rasch.prox) be used for initial iterations? |
proxiter |
Number of maximal PROX iterations |
proxconv |
Convergence criterion for PROX iterations |
dp |
Object created by the data preparation function (for internal use) |
theta.init |
Initial person parameter estimate |
calc.fit |
Should item fit statistics be calculated? |
prior_sd |
Optional value for standard deviation of prior distribution for ability values if penalized JML should be utilized |
object |
Object of class |
digits |
Number of digits used for rounding |
... |
Further arguments to be passed |
Details
The estimation is known to produce biased item parameters for a fixed (finite) number of items. In the literature (Linacre, 1994), a simple bias correction formula is proposed and included in the value item$itemdiff.correction of this function. If I denotes the number of items, then the correction factor is \frac{I-1}{I}.
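For example, with I=11 items every JML difficulty is shrunken by the factor 10/11; a one-line illustration with hypothetical estimates:
I <- 11
b.JML <- seq( -2, 2, length=I )    # hypothetical JML item difficulties
b.JMLcorr <- b.JML*(I-1)/I         # bias-corrected difficulties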
Value
A list with following entries
item |
Estimated item parameters |
person |
Estimated person parameters |
method |
Person parameter estimation method |
dat |
Original data frame |
deviance |
Deviance |
data.proc |
Processed data frames excluding persons with extreme scores |
dp |
Value of the data preparation step (for internal reuse) |
References
Linacre, J. M. (1994). Many-Facet Rasch Measurement. Chicago: MESA Press.
Warm, T. A. (1989). Weighted likelihood estimation of ability in the item response theory. Psychometrika, 54, 427-450.
See Also
Get a summary with summary.rasch.jml
.
See rasch.prox
for the PROX algorithm as initial iterations.
For a bias correction of the JML method try rasch.jml.jackknife1
.
JML estimation can also be conducted with the TAM
(TAM::tam.jml
)
and immer (immer::immer_jml
)
packages.
See also marginal maximum likelihood estimation with rasch.mml2
or the R package ltm.
Examples
#############################################################################
# EXAMPLE 1: Simulated data from the Rasch model
#############################################################################
set.seed(789)
N <- 500 # number of persons
I <- 11 # number of items
b <- seq( -2, 2, length=I )
dat <- sirt::sim.raschtype( stats::rnorm( N, mean=.5 ), b )
colnames(dat) <- paste( "I", 1:I, sep="")
# JML estimation of the Rasch model (centering persons)
mod1 <- sirt::rasch.jml( dat )
summary(mod1)
# JML estimation of the Rasch model (centering items)
mod1b <- sirt::rasch.jml( dat, center="items" )
summary(mod1b)
# MML estimation with rasch.mml2 function
mod2 <- sirt::rasch.mml2( dat )
summary(mod2)
# Pairwise method of Fischer
mod3 <- sirt::rasch.pairwise( dat )
summary(mod3)
# JML estimation in TAM
## Not run:
library(TAM)
mod4 <- TAM::tam.jml( resp=dat )
#******
# item parameter constraints in JML estimation
# fix item difficulties: b[4]=-.76 and b[6]=.10
constraints <- matrix( cbind( 4, -.76,
6, .10 ),
ncol=2, byrow=TRUE )
mod6 <- sirt::rasch.jml( dat, constraints=constraints )
summary(mod6)
# For constrained item parameters, it is not obvious
# how to calculate a 'right correction' of item parameter bias
## End(Not run)
Bias Correction of Item Parameters for Joint Maximum Likelihood Estimation in the Rasch model
Description
This function computes an analytical bias correction for the Rasch model according to the method of Arellano and Hahn (2007).
Usage
rasch.jml.biascorr(jmlobj,itemfac=NULL)
Arguments
jmlobj |
An object which is the output of the rasch.jml function |
itemfac |
Number of items which are used for bias correction. By default it is the average number of item responses per person. |
Value
A list with following entries
b.biascorr |
Matrix of item difficulty estimates. The columns b.analytcorr1 and b.analytcorr2 contain the analytically bias-corrected estimates according to Method 1 and Method 2 (see the Example). |
b.bias1 |
Estimated bias by Method 1 |
b.bias2 |
Estimated bias by Method 2 |
itemfac |
Number of items which are used as the factor for bias correction |
References
Arellano, M., & Hahn, J. (2007). Understanding bias in nonlinear panel models: Some recent developments. In R. Blundell, W. Newey & T. Persson (Eds.): Advances in Economics and Econometrics, Ninth World Congress, Cambridge University Press.
See Also
See rasch.jml.jackknife1
for bias correction based on
Jackknife.
See also the bife R package for analytical bias corrections.
Examples
#############################################################################
# EXAMPLE 1: Dataset Reading
#############################################################################
data(data.read)
dat <- data.read
# estimate Rasch model
mod <- sirt::rasch.jml( data.read )
# JML with analytical bias correction
res1 <- sirt::rasch.jml.biascorr( jmlobj=mod )
print( res1$b.biascorr, digits=3 )
## b.JML b.JMLcorr b.analytcorr1 b.analytcorr2
## 1 -2.0086 -1.8412 -1.908 -1.922
## 2 -1.1121 -1.0194 -1.078 -1.088
## 3 -0.0718 -0.0658 -0.150 -0.127
## 4 0.5457 0.5002 0.393 0.431
## 5 -0.9504 -0.8712 -0.937 -0.936
## [...]
Jackknifing the IRT Model Estimated by Joint Maximum Likelihood (JML)
Description
Jackknife estimation is an alternative to other ad hoc proposed methods for bias correction (Hahn & Newey, 2004).
Usage
rasch.jml.jackknife1(jmlobj)
Arguments
jmlobj |
Output of the rasch.jml function |
Details
Note that items are used for jackknifing (Hahn & Newey, 2004).
By default, all I
items in the data frame are used as
jackknife units.
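Schematically, the procedure refits the model with each item left out in turn; a hedged sketch (the actual function additionally combines the replicates into bias-corrected estimates and standard errors):
## Not run:
data(data.read)
dat <- data.read
# leave-one-item-out refits (schematic)
jack_list <- lapply( seq_len(ncol(dat)), function(ii){
    sirt::rasch.jml( dat[, -ii], progress=FALSE )$item
} )
## End(Not run)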
Value
A list with following entries
item |
A data frame with item parameters (columns b.JML, b.JMLcorr, b.jack, b.jackse and b.JMLse; see the Example) |
jack.itemdiff |
A matrix containing all item difficulties obtained by Jackknife |
References
Hahn, J., & Newey, W. (2004). Jackknife and analytical bias reduction for nonlinear panel models. Econometrica, 72, 1295-1319.
See Also
For JML estimation rasch.jml
.
For analytical bias correction methods see rasch.jml.biascorr
.
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Simulated data from the Rasch model
#############################################################################
set.seed(7655)
N <- 5000 # number of persons
I <- 11 # number of items
b <- seq( -2, 2, length=I )
dat <- sirt::sim.raschtype( rnorm( N ), b )
colnames(dat) <- paste( "I", 1:I, sep="")
# estimate the Rasch model with JML
mod <- sirt::rasch.jml(dat)
summary(mod)
# re-estimate the Rasch model using Jackknife
mod2 <- sirt::rasch.jml.jackknife1( mod )
##
## Joint Maximum Likelihood Estimation
## Jackknife Estimation
## 11 Jackknife Units are used
## |--------------------PROGRESS--------------------|
## |------------------------------------------------|
##
## N p b.JML b.JMLcorr b.jack b.jackse b.JMLse
## I1 4929 0.853 -2.345 -2.131 -2.078 0.079 0.045
## I2 4929 0.786 -1.749 -1.590 -1.541 0.075 0.039
## I3 4929 0.723 -1.298 -1.180 -1.144 0.065 0.036
## I4 4929 0.657 -0.887 -0.806 -0.782 0.059 0.035
## I5 4929 0.576 -0.420 -0.382 -0.367 0.055 0.033
## I6 4929 0.492 0.041 0.038 0.043 0.054 0.033
## I7 4929 0.409 0.502 0.457 0.447 0.056 0.034
## I8 4929 0.333 0.939 0.854 0.842 0.058 0.035
## I9 4929 0.264 1.383 1.257 1.229 0.065 0.037
## I10 4929 0.210 1.778 1.617 1.578 0.071 0.040
## I11 4929 0.154 2.266 2.060 2.011 0.077 0.044
#-> Item parameters obtained by jackknife seem to be acceptable.
## End(Not run)
Multidimensional Latent Class 1PL and 2PL Model
Description
This function estimates the multidimensional latent class Rasch (1PL) and 2PL model (Bartolucci, 2007; Bartolucci, Montanari & Pandolfi, 2012) for dichotomous data which emerges from the original latent class model (Goodman, 1974) and a multidimensional IRT model.
Usage
rasch.mirtlc(dat, Nclasses=NULL, modeltype="LC", dimensions=NULL,
group=NULL, weights=rep(1,nrow(dat)), theta.k=NULL, ref.item=NULL,
distribution.trait=FALSE, range.b=c(-8,8), range.a=c(.2, 6 ),
progress=TRUE, glob.conv=10^(-5), conv1=10^(-5), mmliter=1000,
mstep.maxit=3, seed=0, nstarts=1, fac.iter=.35)
## S3 method for class 'rasch.mirtlc'
summary(object,...)
## S3 method for class 'rasch.mirtlc'
anova(object,...)
## S3 method for class 'rasch.mirtlc'
logLik(object,...)
## S3 method for class 'rasch.mirtlc'
IRT.irfprob(object,...)
## S3 method for class 'rasch.mirtlc'
IRT.likelihood(object,...)
## S3 method for class 'rasch.mirtlc'
IRT.posterior(object,...)
## S3 method for class 'rasch.mirtlc'
IRT.modelfit(object,...)
## S3 method for class 'IRT.modelfit.rasch.mirtlc'
summary(object,...)
Arguments
dat |
An N x I data frame of dichotomous item responses |
Nclasses |
Number of latent classes. If the trait vector (or matrix) theta.k is provided, then the number of latent classes is determined by theta.k. |
modeltype |
Model type. "LC" is the latent class model, "MLC1" the (multidimensional) latent class Rasch model, and "MLC2" the corresponding model with additionally estimated item slopes. |
dimensions |
Vector of dimension integers which allocate items to dimensions. |
group |
A group identifier for multiple group estimation |
weights |
Vector of sample weights |
theta.k |
A grid of theta values can be specified if theta should not be estimated. In the one-dimensional case, it must be a vector; in the D-dimensional case, a matrix. |
ref.item |
An optional vector of integers which indicate the items whose intercept and slope are fixed at 0 and 1, respectively. |
distribution.trait |
A type of the assumed theta distribution can be specified. One alternative is a normal distribution ("normal"). |
range.b |
Range of item difficulties which are allowed for estimation |
range.a |
Range of item slopes which are allowed for estimation |
progress |
Display progress? Default is |
glob.conv |
Global relative deviance convergence criterion |
conv1 |
Item parameter convergence criterion |
mmliter |
Maximum number of iterations |
mstep.maxit |
Maximum number of iterations within an M step |
seed |
Set random seed for latent class estimation. A seed can be specified. If the seed is negative, then the function will generate a random seed. |
nstarts |
If a positive integer is provided, then the estimation is repeated with this number of random starting values and the best solution (smallest deviance) is retained. |
fac.iter |
A parameter between 0 and 1 to control the maximum increment in each iteration. The larger the parameter the more increments will become smaller from iteration to iteration. |
object |
Object of class rasch.mirtlc |
... |
Further arguments to be passed |
Details
The multidimensional latent class Rasch model (Bartolucci, 2007)
is an item response model which combines ideas from
latent class analysis and item response models with continuous variables.
With modeltype="MLC2"
the following D
-dimensional
item response model is estimated
logit P(X_{pi}=1 | \theta_p )=a_i \theta_{pcd}- b_i
Besides the item thresholds b_i
and item slopes a_i
,
for a prespecified number of latent classes c=1,\ldots,C
a set of C
D
-dimensional \{\theta_{cd} \}_{cd}
vectors are estimated.
These vectors represent the locations of latent classes. If the user
provides a grid of theta distribution theta.k
as an argument in
rasch.mirtlc
, then the ability distribution is fixed.
In the unidimensional Rasch model with I items, (I+1)/2 (if I is odd) or I/2 + 1 (if I is even) trait location parameters are identified (see De Leeuw & Verhelst, 1986; Lindsay et al., 1991; for a review see Formann, 2007).
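A quick arithmetic check of this identifiability rule (plain R, with an illustrative number of items):

I <- 12                                    # number of items (illustrative)
if ( I %% 2==1 ){ (I+1)/2 } else { I/2 + 1 }
## [1] 7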
Value
A list with following entries
pjk |
Item probabilities evaluated at discretized ability distribution |
rprobs |
Item response probabilities like in |
pi.k |
Estimated trait distribution |
theta.k |
Discretized ability distribution |
item |
Estimated item parameters |
trait |
Estimated ability distribution ( |
mean.trait |
Estimated mean of ability distribution |
sd.trait |
Estimated standard deviation of ability distribution |
skewness.trait |
Estimated skewness of ability distribution |
cor.trait |
Estimated correlation between abilities (only applies for multidimensional models) |
ic |
Information criteria |
D |
Number of dimensions |
G |
Number of groups |
deviance |
Deviance |
ll |
Log-likelihood |
Nclasses |
Number of classes |
modeltype |
Used model type |
estep.res |
Result from E step: |
dat |
Original data frame |
devL |
Vector of deviances if multiple random starts were conducted |
seedL |
Vector of seeds if multiple random starts were conducted |
iter |
Number of iterations |
Note
For the estimation of latent class models, rerunning the model with different starting values (different random seeds) is recommended.
References
Bartolucci, F. (2007). A class of multidimensional IRT models for testing unidimensionality and clustering items. Psychometrika, 72(2), 141-157. doi:10.1007/s11336-005-1376-9
Bartolucci, F., Montanari, G. E., & Pandolfi, S. (2012). Dimensionality of the latent structure and item selection via latent class multidimensional IRT models. Psychometrika, 77(4), 782-802. doi:10.1007/s11336-012-9278-0
De Leeuw, J., & Verhelst, N. (1986). Maximum likelihood estimation in generalized Rasch models. Journal of Educational and Behavioral Statistics, 11(3), 183-196. doi:10.3102/10769986011003183
Formann, A. K. (2007). (Almost) Equivalence between conditional and mixture maximum likelihood estimates for some models of the Rasch type. In M. von Davier & C. H. Carstensen (Eds.), Multivariate and Mixture Distribution Rasch Models (pp. 177-189). New York: Springer. doi:10.1007/978-0-387-49839-3_11
Goodman, L. A. (1974). Exploratory latent structure analysis using both identifiable and unidentifiable models. Biometrika, 61(2), 215-231. doi:10.1093/biomet/61.2.215
Lindsay, B., Clogg, C. C., & Grego, J. (1991). Semiparametric estimation in the Rasch model and related exponential response models, including a simple latent class model for item analysis. Journal of the American Statistical Association, 86(413), 96-107. doi:10.1080/01621459.1991.10475008
Xu, X., & von Davier, M. (2008). Fitting the structured general diagnostic model to NAEP data. ETS Research Report ETS RR-08-27. Princeton, ETS. doi:10.1002/j.2333-8504.2008.tb02113.x
See Also
See also the CDM::gdm
function in the CDM package.
For an assessment of global model fit see modelfit.sirt
.
The estimation of the multidimensional latent class item response model for polytomous data can be conducted with the MultiLCIRT package. Latent class analysis can be carried out with the poLCA and randomLCA packages.
Examples
#############################################################################
# EXAMPLE 1: Reading data
#############################################################################
data( data.read )
dat <- data.read
#***************
# latent class models
# latent class model with 1 class
mod1 <- sirt::rasch.mirtlc( dat, Nclasses=1 )
summary(mod1)
# latent class model with 2 classes
mod2 <- sirt::rasch.mirtlc( dat, Nclasses=2 )
summary(mod2)
## Not run:
# latent class model with 3 classes
mod3 <- sirt::rasch.mirtlc( dat, Nclasses=3, seed=-30 )
summary(mod3)
# extract individual likelihood
lmod3 <- IRT.likelihood(mod3)
str(lmod3)
# extract likelihood value
logLik(mod3)
# extract item response functions
IRT.irfprob(mod3)
# compare models 1, 2 and 3
anova(mod2,mod3)
IRT.compareModels(mod1,mod2,mod3)
# absolute and relative model fit
smod2 <- IRT.modelfit(mod2)
smod3 <- IRT.modelfit(mod3)
summary(smod2)
IRT.compareModels(smod2,smod3)
# latent class model with 4 classes and 3 starts with different seeds
mod4 <- sirt::rasch.mirtlc( dat, Nclasses=4,seed=-30, nstarts=3 )
# display different solutions
sort(mod4$devL)
summary(mod4)
# latent class multiple group model
# define group identifier
group <- rep( 1, nrow(dat))
group[ 1:150 ] <- 2
mod5 <- sirt::rasch.mirtlc( dat, Nclasses=3, group=group )
summary(mod5)
#*************
# Unidimensional IRT models with ordered trait
# 1PL model with 3 classes
mod11 <- sirt::rasch.mirtlc( dat, Nclasses=3, modeltype="MLC1", mmliter=30)
summary(mod11)
# 1PL model with 11 classes
mod12 <- sirt::rasch.mirtlc( dat, Nclasses=11,modeltype="MLC1", mmliter=30)
summary(mod12)
# 1PL model with 11 classes and fixed specified theta values
mod13 <- sirt::rasch.mirtlc( dat, modeltype="MLC1",
theta.k=seq( -4, 4, len=11 ), mmliter=100)
summary(mod13)
# 1PL model with fixed theta values and normal distribution
mod14 <- sirt::rasch.mirtlc( dat, modeltype="MLC1", mmliter=30,
theta.k=seq( -4, 4, len=11 ), distribution.trait="normal")
summary(mod14)
# 1PL model with a smoothed trait distribution (up to 3 moments)
mod15 <- sirt::rasch.mirtlc( dat, modeltype="MLC1", mmliter=30,
theta.k=seq( -4, 4, len=11 ), distribution.trait="smooth3")
summary(mod15)
# 2PL with 3 classes
mod16 <- sirt::rasch.mirtlc( dat, Nclasses=3, modeltype="MLC2", mmliter=30 )
summary(mod16)
# 2PL with fixed theta and smoothed distribution
mod17 <- sirt::rasch.mirtlc( dat, theta.k=seq(-4,4,len=12), mmliter=30,
modeltype="MLC2", distribution.trait="smooth4" )
summary(mod17)
# 1PL multiple group model with 8 classes
# define group identifier
group <- rep( 1, nrow(dat))
group[ 1:150 ] <- 2
mod21 <- sirt::rasch.mirtlc( dat, Nclasses=8, modeltype="MLC1", group=group )
summary(mod21)
#***************
# multidimensional latent class IRT models
# define vector of dimensions
dimensions <- rep( 1:3, each=4 )
# 3-dimensional model with 8 classes and seed 145
mod31 <- sirt::rasch.mirtlc( dat, Nclasses=8, mmliter=30,
modeltype="MLC1", seed=145, dimensions=dimensions )
summary(mod31)
# try the model above with different starting values
mod31s <- sirt::rasch.mirtlc( dat, Nclasses=8,
modeltype="MLC1", seed=-30, nstarts=30, dimensions=dimensions )
summary(mod31s)
# estimation with fixed theta vectors
#=> 4^3=216 classes
theta.k <- seq(-4, 4, len=6 )
theta.k <- as.matrix( expand.grid( theta.k, theta.k, theta.k ) )
mod32 <- sirt::rasch.mirtlc( dat, dimensions=dimensions,
theta.k=theta.k, modeltype="MLC1" )
summary(mod32)
# 3-dimensional 2PL model
mod33 <- sirt::rasch.mirtlc( dat, dimensions=dimensions, theta.k=theta.k, modeltype="MLC2")
summary(mod33)
#############################################################################
# EXAMPLE 2: Skew trait distribution
#############################################################################
set.seed(789)
N <- 1000 # number of persons
I <- 20 # number of items
theta <- sqrt( exp( stats::rnorm( N ) ) )
theta <- theta - mean(theta )
# calculate skewness of theta distribution
mean( theta^3 ) / stats::sd(theta)^3
# simulate item responses
dat <- sirt::sim.raschtype( theta, b=seq(-2,2,len=I ) )
# normal distribution
mod1 <- sirt::rasch.mirtlc( dat, theta.k=seq(-4,4,len=15), modeltype="MLC1",
distribution.trait="normal", mmliter=30)
# allow for skew distribution with smoothed distribution
mod2 <- sirt::rasch.mirtlc( dat, theta.k=seq(-4,4,len=15), modeltype="MLC1",
distribution.trait="smooth3", mmliter=30)
# nonparametric distribution
mod3 <- sirt::rasch.mirtlc( dat, theta.k=seq(-4,4,len=15), modeltype="MLC1", mmliter=30)
summary(mod1)
summary(mod2)
summary(mod3)
#############################################################################
# EXAMPLE 3: Stouffer-Toby dataset data.si02 with 5 items
#############################################################################
data(data.si02)
dat <- data.si02$data
weights <- data.si02$weights # extract weights
# Model 1: 2 classes Rasch model
mod1 <- sirt::rasch.mirtlc( dat, Nclasses=2, modeltype="MLC1", weights=weights,
ref.item=4, nstarts=5)
summary(mod1)
# Model 2: 3 classes Rasch model: not all parameters are identified
mod2 <- sirt::rasch.mirtlc( dat, Nclasses=3, modeltype="MLC1", weights=weights,
ref.item=4, nstarts=5)
summary(mod2)
# Model 3: Latent class model with 2 classes
mod3 <- sirt::rasch.mirtlc( dat, Nclasses=2, modeltype="LC", weights=weights, nstarts=5)
summary(mod3)
# Model 4: Rasch model with normal distribution
mod4 <- sirt::rasch.mirtlc( dat, modeltype="MLC1", weights=weights,
theta.k=seq( -6, 6, len=21 ), distribution.trait="normal", ref.item=4)
summary(mod4)
## End(Not run)
#############################################################################
# EXAMPLE 4: 5 classes, 3 dimensions and 27 items
#############################################################################
set.seed(979)
I <- 9
N <- 5000
b <- seq( - 1.5, 1.5, len=I)
b <- rep(b,3)
# define class locations
theta.k <- c(-3.0, -4.1, -2.8, 1.7, 2.3, 1.8,
0.2, 0.4, -0.1, 2.6, 0.1, -0.9, -1.1,-0.7, 0.9 )
Nclasses <- 5
theta.k0 <- theta.k <- matrix( theta.k, Nclasses, 3, byrow=TRUE )
pi.k <- c(.20,.25,.25,.10,.15)
theta <- theta.k[ rep( 1:Nclasses, round(N*pi.k) ), ]
dimensions <- rep( 1:3, each=I)
# simulate item responses
dat <- matrix( NA, nrow=N, ncol=I*3)
for (ii in 1:(3*I) ){
dat[,ii] <- 1 * ( stats::runif(N) < stats::plogis( theta[,dimensions[ii]] - b[ii]))
}
colnames(dat) <- paste0( rep( LETTERS[1:3], each=I ), 1:(3*I) )
# estimate model
mod1 <- sirt::rasch.mirtlc( dat, Nclasses=Nclasses, dimensions=dimensions,
modeltype="MLC1", ref.item=c(5,14,23), glob.conv=.0005, conv1=.0005)
round( cbind( mod1$theta.k, mod1$pi.k ), 3 )
## [,1] [,2] [,3] [,4]
## [1,] -2.776 -3.791 -2.667 0.250
## [2,] -0.989 -0.605 0.957 0.151
## [3,] 0.332 0.418 -0.046 0.246
## [4,] 2.601 0.171 -0.854 0.101
## [5,] 1.791 2.330 1.844 0.252
cbind( theta.k, pi.k )
## pi.k
## [1,] -3.0 -4.1 -2.8 0.20
## [2,] 1.7 2.3 1.8 0.25
## [3,] 0.2 0.4 -0.1 0.25
## [4,] 2.6 0.1 -0.9 0.10
## [5,] -1.1 -0.7 0.9 0.15
# plot class locations
plot( 1:3, mod1$theta.k[1,], xlim=c(1,3), ylim=c(-5,3), col=1, pch=1, type="n",
axes=FALSE, xlab="Dimension", ylab="Location")
axis(1, 1:3 ) ; axis(2) ; axis(4)
for (cc in 1:Nclasses){ # cc <- 1
lines(1:3, mod1$theta.k[cc,], col=cc, lty=cc )
points(1:3, mod1$theta.k[cc,], col=cc, pch=cc )
}
## Not run:
#------
# estimate model with gdm function in CDM package
library(CDM)
# define Q-matrix
Qmatrix <- matrix(0,3*I,3)
Qmatrix[ cbind( 1:(3*I), rep(1:3, each=I) ) ] <- 1
set.seed(9176)
# random starting values for theta locations
theta.k <- matrix( 2*stats::rnorm(5*3), 5, 3 )
colnames(theta.k) <- c("Dim1","Dim2","Dim3")
# try possibly different starting values
# estimate model in CDM
b.constraint <- cbind( c(5,14,23), 1, 0 )
mod2 <- CDM::gdm( dat, theta.k=theta.k, b.constraint=b.constraint, skillspace="est",
irtmodel="1PL", Qmatrix=Qmatrix)
summary(mod2)
#------
# estimate model with MultiLCIRT package
miceadds::library_install("MultiLCIRT")
# define matrix to allocate each item to one dimension
multi1 <- matrix( 1:(3*I), nrow=3, byrow=TRUE )
# define reference items in item-dimension allocation matrix
multi1[ 1, c(1,5) ] <- c(5,1)
multi1[ 2, c(10,14) - 9 ] <- c(14,10)
multi1[ 3, c(19,23) - 18 ] <- c(23,19)
# Rasch model with 5 latent classes (random start: start=1)
mod3 <- MultiLCIRT::est_multi_poly(S=dat,k=5, # k=5 ability levels
start=1,link=1,multi=multi1,tol=10^-5,
output=TRUE, disp=TRUE, fort=TRUE)
# estimated location points and class probabilities in MultiLCIRT
cbind( t( mod3$Th ), mod3$piv )
# compare results with rasch.mirtlc
cbind( mod1$theta.k, mod1$pi.k )
# simulated data parameters
cbind( theta.k0, pi.k )
#----
# estimate model with customized input in mirt
library(mirt)
#-- define Theta design matrix for 5 classes
Theta <- diag(5)
Theta <- cbind( Theta, Theta, Theta )
r1 <- rownames(Theta) <- paste0("C",1:5)
colnames(Theta) <- c( paste0(r1, "D1"), paste0(r1, "D2"), paste0(r1, "D3") )
## C1D1 C2D1 C3D1 C4D1 C5D1 C1D2 C2D2 C3D2 C4D2 C5D2 C1D3 C2D3 C3D3 C4D3 C5D3
## C1 1 0 0 0 0 1 0 0 0 0 1 0 0 0 0
## C2 0 1 0 0 0 0 1 0 0 0 0 1 0 0 0
## C3 0 0 1 0 0 0 0 1 0 0 0 0 1 0 0
## C4 0 0 0 1 0 0 0 0 1 0 0 0 0 1 0
## C5 0 0 0 0 1 0 0 0 0 1 0 0 0 0 1
#-- define mirt model
I <- ncol(dat) # I=27
mirtmodel <- mirt::mirt.model("
C1D1=1-9 \n C2D1=1-9 \n C3D1=1-9 \n C4D1=1-9 \n C5D1=1-9
C1D2=10-18 \n C2D2=10-18 \n C3D2=10-18 \n C4D2=10-18 \n C5D2=10-18
C1D3=19-27 \n C2D3=19-27 \n C3D3=19-27 \n C4D3=19-27 \n C5D3=19-27
CONSTRAIN=(1-9,a1),(1-9,a2),(1-9,a3),(1-9,a4),(1-9,a5),
(10-18,a6),(10-18,a7),(10-18,a8),(10-18,a9),(10-18,a10),
(19-27,a11),(19-27,a12),(19-27,a13),(19-27,a14),(19-27,a15)
")
#-- get initial parameter values
mod.pars <- mirt::mirt(dat, model=mirtmodel, pars="values")
#-- redefine initial parameter values
# set all d parameters initially to zero
ind <- which( ( mod.pars$name=="d" ) )
mod.pars[ ind,"value" ] <- 0
# fix item difficulties of reference items to zero
mod.pars[ ind[ c(5,14,23) ], "est"] <- FALSE
mod.pars[ind,]
# initial item parameters of cluster locations (a1,...,a15)
ind <- which( ( mod.pars$name %in% paste0("a", c(1,6,11) ) ) & ( mod.pars$est ) )
mod.pars[ind,"value"] <- -2
ind <- which( ( mod.pars$name %in% paste0("a", c(1,6,11)+1 ) ) & ( mod.pars$est ) )
mod.pars[ind,"value"] <- -1
ind <- which( ( mod.pars$name %in% paste0("a", c(1,6,11)+2 ) ) & ( mod.pars$est ) )
mod.pars[ind,"value"] <- 0
ind <- which( ( mod.pars$name %in% paste0("a", c(1,6,11)+3 ) ) & ( mod.pars$est ) )
mod.pars[ind,"value"] <- 1
ind <- which( ( mod.pars$name %in% paste0("a", c(1,6,11)+4 ) ) & ( mod.pars$est ) )
mod.pars[ind,"value"] <- 0
#-- define prior for latent class analysis
lca_prior <- function(Theta,Etable){
TP <- nrow(Theta)
if ( is.null(Etable) ){ prior <- rep( 1/TP, TP ) }
if ( ! is.null(Etable) ){
prior <- ( rowSums(Etable[, seq(1,2*I,2)]) + rowSums(Etable[,seq(2,2*I,2)]) )/I
}
prior <- prior / sum(prior)
return(prior)
}
#-- estimate model in mirt
mod4 <- mirt::mirt(dat, mirtmodel, pars=mod.pars, verbose=TRUE,
technical=list( customTheta=Theta, customPriorFun=lca_prior,
MAXQUAD=1E20) )
# correct number of estimated parameters
mod4@nest <- as.integer(sum(mod.pars$est) + nrow(Theta)-1 )
# extract coefficients
cmod4 <- sirt::mirt.wrapper.coef(mod4)
# estimated item difficulties
dfr <- data.frame( "sim"=b, "mirt"=-cmod4$coef$d, "sirt"=mod1$item$thresh )
round( dfr, 4 )
## sim mirt sirt
## 1 -1.500 -1.3782 -1.3382
## 2 -1.125 -1.0059 -0.9774
## 3 -0.750 -0.6157 -0.6016
## 4 -0.375 -0.2099 -0.2060
## 5 0.000 0.0000 0.0000
## 6 0.375 0.5085 0.4984
## 7 0.750 0.8661 0.8504
## 8 1.125 1.3079 1.2847
## 9 1.500 1.5891 1.5620
## [...]
#-- reordering estimated latent clusters to make solutions comparable
#* extract estimated cluster locations from sirt
order.sirt <- c(1,5,3,4,2) # sort(order.sirt)
round(mod1$trait[,1:3],3)
dfr <- data.frame( "sim"=theta.k, mod1$trait[order.sirt,1:3] )
colnames(dfr)[4:6] <- paste0("sirt",1:3)
#* extract estimated cluster locations from mirt
c4 <- cmod4$coef[, paste0("a",1:15) ]
c4 <- apply( c4,2, FUN=function(ll){ ll[ ll!=0 ][1] } )
trait.loc <- matrix(c4,5,3)
order.mirt <- c(1,4,3,5,2) # sort(order.mirt)
dfr <- cbind( dfr, trait.loc[ order.mirt, ] )
colnames(dfr)[7:9] <- paste0("mirt",1:3)
# compare estimated cluster locations
round(dfr,3)
## sim.1 sim.2 sim.3 sirt1 sirt2 sirt3 mirt1 mirt2 mirt3
## 1 -3.0 -4.1 -2.8 -2.776 -3.791 -2.667 -2.856 -4.023 -2.741
## 5 1.7 2.3 1.8 1.791 2.330 1.844 1.817 2.373 1.869
## 3 0.2 0.4 -0.1 0.332 0.418 -0.046 0.349 0.421 -0.051
## 4 2.6 0.1 -0.9 2.601 0.171 -0.854 2.695 0.166 -0.876
## 2 -1.1 -0.7 0.9 -0.989 -0.605 0.957 -1.009 -0.618 0.962
#* compare estimated cluster sizes
dfr <- data.frame( "sim"=pi.k, "sirt"=mod1$pi.k[order.sirt,1],
"mirt"=mod4@Prior[[1]][ order.mirt] )
round(dfr,4)
## sim sirt mirt
## 1 0.20 0.2502 0.2500
## 2 0.25 0.2522 0.2511
## 3 0.25 0.2458 0.2494
## 4 0.10 0.1011 0.0986
## 5 0.15 0.1507 0.1509
#############################################################################
# EXAMPLE 5: Dataset data.si04 from Bartolucci et al. (2012)
#############################################################################
data(data.si04)
# define reference items
ref.item <- c(7,17,25,44,64)
dimensions <- data.si04$itempars$dim
# estimate a Rasch latent class with 9 classes
mod1 <- sirt::rasch.mirtlc( data.si04$data, Nclasses=9, dimensions=dimensions,
modeltype="MLC1", ref.item=ref.item, glob.conv=.005, conv1=.005,
nstarts=1, mmliter=200 )
# compare estimated distribution with simulated distribution
round( cbind( mod1$theta.k, mod1$pi.k ), 4 ) # estimated
## [,1] [,2] [,3] [,4] [,5] [,6]
## [1,] -3.6043 -5.1323 -5.3022 -6.8255 -4.3611 0.1341
## [2,] 0.2083 -2.7422 -2.8754 -5.3416 -2.5085 0.1573
## [3,] -2.8641 -4.0272 -5.0580 -0.0340 -0.9113 0.1163
## [4,] -0.3575 -2.0081 -1.7431 1.2992 -0.1616 0.0751
## [5,] 2.9329 0.3662 -1.6516 -3.0284 0.1844 0.1285
## [6,] 1.5092 -2.0461 -4.3093 1.0481 1.0806 0.1094
## [7,] 3.9899 3.1955 -4.0010 1.8879 2.2988 0.1460
## [8,] 4.3062 0.7080 -1.2324 1.4351 2.0893 0.1332
## [9,] 5.0855 4.1214 -0.9141 2.2744 1.5314 0.0000
round(d2,4) # simulated (d2: true class locations and class probabilities)
## class A B C D E pi
## [1,] 1 -3.832 -5.399 -5.793 -7.042 -4.511 0.1323
## [2,] 2 -2.899 -4.217 -5.310 -0.055 -0.915 0.1162
## [3,] 3 -0.376 -2.137 -1.847 1.273 -0.078 0.0752
## [4,] 4 0.208 -2.934 -3.011 -5.526 -2.511 0.1583
## [5,] 5 1.536 -2.137 -4.606 1.045 1.143 0.1092
## [6,] 6 2.042 -0.573 -0.404 -4.331 -1.044 0.0471
## [7,] 7 3.853 0.841 -2.993 -2.746 0.803 0.0822
## [8,] 8 4.204 3.296 -4.328 1.892 2.419 0.1453
## [9,] 9 4.466 0.700 -1.334 1.439 2.161 0.1343
## End(Not run)
Estimation of the Generalized Logistic Item Response Model, Ramsay's Quotient Model, Nonparametric Item Response Model, Pseudo-Likelihood Estimation and a Missing Data Item Response Model
Description
This function employs marginal maximum likelihood estimation
of item response models for dichotomous data.
First, the Rasch type model (generalized
item response model) can be estimated. The generalized logistic
link function (Stukel, 1988) can be estimated or fixed for conducting
IRT with link functions other than the logistic one. The four-parameter
logistic item response model is a special case of this model
(Loken & Rulison, 2010). Second, Ramsay's quotient model (Ramsay, 1989)
can be estimated by specifying irtmodel="ramsay.qm"
.
Third, quite general item response functions can be estimated
in a nonparametric framework (Rossi, Wang & Ramsay, 2002).
Fourth, pseudo-likelihood estimation for fractional item responses can be
conducted for Rasch type models. Fifth, a simple two-dimensional
missing data item response model (irtmodel='missing1'
;
Mislevy & Wu, 1996) can be estimated.
See Details for more explanations.
Usage
rasch.mml2( dat, theta.k=seq(-6,6,len=21), group=NULL, weights=NULL,
constraints=NULL, glob.conv=10^(-5), parm.conv=10^(-4), mitermax=4,
mmliter=1000, progress=TRUE, fixed.a=rep(1,ncol(dat)),
fixed.c=rep(0,ncol(dat)), fixed.d=rep(1,ncol(dat)),
fixed.K=rep(3,ncol(dat)), b.init=NULL, est.a=NULL, est.b=NULL,
est.c=NULL, est.d=NULL, min.b=-99, max.b=99, min.a=-99, max.a=99,
min.c=0, max.c=1, min.d=0, max.d=1, prior.b=NULL, prior.a=NULL, prior.c=NULL,
prior.d=NULL, est.K=NULL, min.K=1, max.K=20, min.delta=-20, max.delta=20,
beta.init=NULL, min.beta=-8, pid=1:(nrow(dat)), trait.weights=NULL, center.trait=TRUE,
center.b=FALSE, alpha1=0, alpha2=0,est.alpha=FALSE, equal.alpha=FALSE,
designmatrix=NULL, alpha.conv=parm.conv, numdiff.parm=0.00001,
numdiff.alpha.parm=numdiff.parm, distribution.trait="normal", Qmatrix=NULL,
variance.fixed=NULL, variance.init=NULL,
mu.fixed=cbind(seq(1,ncol(Qmatrix)),rep(0,ncol(Qmatrix))),
irtmodel="raschtype", npformula=NULL, npirt.monotone=TRUE,
use.freqpatt=is.null(group), delta.miss=0, est.delta=rep(NA,ncol(dat)),
nimps=0, ... )
## S3 method for class 'rasch.mml'
summary(object, file=NULL, ...)
## S3 method for class 'rasch.mml'
plot(x, items=NULL, xlim=NULL, main=NULL, ...)
## S3 method for class 'rasch.mml'
anova(object,...)
## S3 method for class 'rasch.mml'
logLik(object,...)
## S3 method for class 'rasch.mml'
IRT.irfprob(object,...)
## S3 method for class 'rasch.mml'
IRT.likelihood(object,...)
## S3 method for class 'rasch.mml'
IRT.posterior(object,...)
## S3 method for class 'rasch.mml'
IRT.modelfit(object,...)
## S3 method for class 'rasch.mml'
IRT.expectedCounts(object,...)
## S3 method for class 'IRT.modelfit.rasch.mml'
summary(object,...)
Arguments
dat |
An |
theta.k |
Optional vector of discretized theta values. For multidimensional
IRT models with |
group |
Vector of integers with group identifiers in multiple group estimation.
Multiple group estimation does not work for |
weights |
Optional vector of person weights (sample weights). |
constraints |
Constraints on |
glob.conv |
Convergence criterion for deviance |
parm.conv |
Convergence criterion for item parameters |
mitermax |
Maximum number of iterations in the M step. This argument only applies to the estimation of the |
mmliter |
Maximum number of iterations |
progress |
Should progress be displayed at the console? |
fixed.a |
Fixed or initial |
fixed.c |
Fixed or initial |
fixed.d |
Fixed or initial |
fixed.K |
Fixed or initial |
b.init |
Initial |
est.a |
Vector of integers which indicate which |
est.b |
Vector of integers which indicate which |
est.c |
Vector of integers which indicate which |
est.d |
Vector of integers which indicate which |
min.b |
Minimal |
max.b |
Maximal |
min.a |
Minimal |
max.a |
Maximal |
min.c |
Minimal |
max.c |
Maximal |
min.d |
Minimal |
max.d |
Maximal |
prior.b |
Optional prior distribution for |
prior.a |
Optional prior distribution for |
prior.c |
Optional prior distribution for |
prior.d |
Optional prior distribution for |
est.K |
Vector of integers which indicate which |
min.K |
Minimal |
max.K |
Maximal |
min.delta |
Minimal |
max.delta |
Maximal |
beta.init |
Optional vector of initial |
min.beta |
Minimum |
pid |
Optional vector of person identifiers |
trait.weights |
Optional vector of trait weights for fixing the trait distribution. |
center.trait |
Should the trait distribution be centered |
center.b |
An optional logical indicating whether |
alpha1 |
Fixed or initial |
alpha2 |
Fixed or initial |
est.alpha |
Should |
equal.alpha |
Estimate |
designmatrix |
Design matrix for item difficulties |
alpha.conv |
Convergence criterion for |
numdiff.parm |
Parameter for numerical differentiation |
numdiff.alpha.parm |
Parameter for numerical differentiation for |
distribution.trait |
Assumed trait distribution. The default is the normal distribution (distribution.trait="normal"). |
Qmatrix |
The Q-matrix |
variance.fixed |
Matrix for fixing covariance matrix (See Examples) |
variance.init |
Optional initial covariance matrix |
mu.fixed |
Matrix for fixing mean vector (See Examples) |
irtmodel |
Specify estimable IRT models: |
npformula |
A string or a vector which contains R formula objects for specifying
the item response function. For example, |
npirt.monotone |
Should nonparametrically estimated item response functions
be monotone? The default is TRUE. |
use.freqpatt |
A logical indicating whether frequencies of item response patterns should be used. The default is is.null(group). |
delta.miss |
Missingness parameter |
est.delta |
Vector with indices indicating the |
nimps |
Number of imputed datasets of item responses |
object |
Object of class |
x |
Object of class |
items |
Vector of integer or item names which should be plotted |
xlim |
Specification for |
main |
Title of the plot |
file |
Optional file name for summary output |
... |
Further arguments to be passed |
Details
The item response function of the generalized item response model (irtmodel="raschtype"; Stukel, 1988) can be written as

P( X_{pi}=1 | \theta_{pd} )=c_i + (d_i - c_i ) g_{\alpha_1, \alpha_2} [ a_i ( \theta_{pd} - b_i ) ]

where g_{\alpha_1, \alpha_2} is the generalized logistic link function depending on the parameters \alpha_1 and \alpha_2. For the most important link functions the specifications are (Stukel, 1988):

logistic link function: \alpha_1=0 and \alpha_2=0
probit link function: \alpha_1=0.165 and \alpha_2=0.165
loglog link function: \alpha_1=-0.037 and \alpha_2=0.62
cloglog link function: \alpha_1=0.62 and \alpha_2=-0.037

See pgenlogis for the exact transformation formulas of these link functions.
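These four parameter configurations can be inspected numerically with pgenlogis; a small sketch that simply evaluates the function at the parameter values listed above:

x <- seq( -3, 3, len=7 )
round( sirt::pgenlogis( x, alpha1=0, alpha2=0 ), 3 )          # logistic link
round( sirt::pgenlogis( x, alpha1=0.165, alpha2=0.165 ), 3 )  # probit-like link
round( sirt::pgenlogis( x, alpha1=-0.037, alpha2=0.62 ), 3 )  # loglog-like link
round( sirt::pgenlogis( x, alpha1=0.62, alpha2=-0.037 ), 3 )  # cloglog-like link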
A D-dimensional model can also be specified, but it only allows for between-item dimensionality (each item loads on one and only one dimension). Setting c_i=0, d_i=1 and a_i=1 for all items i, an additive item response model

P( X_{pi}=1 | \theta_p )=g_{\alpha_1, \alpha_2} ( \theta_p - b_i )

is estimated.
Ramsay's quotient model (irtmodel="ramsay.qm") uses the item response function

P( X_{pi}=1 | \theta_p )=\frac{ \exp(\theta_p / b_i)}{ K_i + \exp (\theta_p / b_i )}
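This quotient model response function can be evaluated directly; a minimal sketch with illustrative parameter values (K_i=3 mirrors the default of fixed.K; qm_irf is a hypothetical helper, not a package function):

# Ramsay's quotient model response function
qm_irf <- function( theta, b_i, K_i ){
    exp( theta / b_i ) / ( K_i + exp( theta / b_i ) )
}
qm_irf( theta=seq( -2, 2, len=5 ), b_i=1.3, K_i=3 )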
Quite general unidimensional item response models can be estimated in a nonparametric framework (irtmodel="npirt"). The response functions are a linear combination of transformed \theta values:

logit[ P( X_{pi}=1 | \theta_p ) ]=Y_\theta \beta

where Y_\theta is a design matrix of \theta and \beta are item parameters to be estimated. The term Y_\theta \beta can be specified in the R formula framework (see Example 3, Model 3c).
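The design matrix Y_\theta implied by an npformula can be inspected with standard R tools; a small sketch for the B-spline basis from Example 3, Model 3c (the theta grid and df value are illustrative):

theta <- seq( -3, 3, len=15 )    # theta grid
# design matrix Y_theta corresponding to npformula="y~bs(theta,df=3)"
Y_theta <- stats::model.matrix( ~ splines::bs( theta, df=3 ) )
dim( Y_theta )    # 15 grid points; intercept plus 3 basis columns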
Pseudo-likelihood estimation can be conducted for fractional item response data as input (i.e., some item responses x_{pi} have values between 0 and 1). The pseudo-likelihood L_p for person p is then defined as

L_p=\prod_i P_i ( \theta_p )^{x_{pi}} [1-P_i ( \theta_p )]^{(1-x_{pi})}

Note that for dichotomous responses this term corresponds to the ordinary likelihood. See Example 7.
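A worked one-person illustration of this pseudo-likelihood term (plain R, illustrative values):

x_p <- c( 1, 0, .25, .70 )      # fractional item responses of person p
P_p <- c( .80, .30, .45, .60 )  # model-implied probabilities P_i(theta_p)
prod( P_p^x_p * ( 1 - P_p )^( 1 - x_p ) )   # pseudo-likelihood L_p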
A special two-dimensional missing data item response model (irtmodel="missing1") is implemented according to Mislevy and Wu (1996). Besides a unidimensional ability \theta_p, an individual response propensity \xi_p is proposed. We define item responses X_{pi} and response indicators R_{pi} which indicate whether the item responses X_{pi} are observed. Denoting the logistic function by \Psi, the item response model for ability is defined as

P( X_{pi}=1 | \theta_p, \xi_p )=P( X_{pi}=1 | \theta_p )=\Psi( a_i (\theta_p - b_i ))

We also define a measurement model for the response indicators R_{pi} which depends on the item response X_{pi} itself:

P( R_{pi}=1 | X_{pi}=k, \theta_p, \xi_p )=P( R_{pi}=1 | X_{pi}=k, \xi_p )=\Psi ( \xi_p - \beta_i - k \delta_i ) \quad \mbox{for} \quad k=0,1

If \delta_i=0, the probability of responding to an item is independent of the incompletely observed item response X_{pi}; this is an item response model with nonignorable missingness (Holman & Glas, 2005; see also Pohl, Graefe & Rose, 2014). If \delta_i is a large negative number (e.g. \delta_i=-100), then P( R_{pi}=1 | X_{pi}=1, \theta_p, \xi_p )=1 and, as a consequence, P( X_{pi}=1 | R_{pi}=0, \theta_p, \xi_p )=0, which is equivalent to treating all missing item responses as incorrect. The missingness parameter \delta_i can be fixed by the user and studied in a sensitivity analysis under different missing-not-at-random assumptions, or it can be estimated via the est.delta argument.
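The role of \delta_i can be illustrated by evaluating the response indicator model directly (plain R, with illustrative parameter values):

xi_p <- 0.5; beta_i <- -0.2
# response probabilities P(R=1 | X=k) for k=0 and k=1 with delta_i=-1
stats::plogis( xi_p - beta_i - (0:1) * (-1) )
# a large negative delta_i forces P(R=1 | X=1) towards 1
stats::plogis( xi_p - beta_i - 1 * (-100) )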
Value
A list with following entries
dat |
Original data frame |
item |
Estimated item parameters in the generalized item response model |
item2 |
Estimated item parameters for Ramsay's quotient model |
trait.distr |
Discretized ability distribution points and probabilities |
mean.trait |
Estimated mean vector |
sd.trait |
Estimated standard deviations |
skewness.trait |
Estimated skewnesses |
deviance |
Deviance |
pjk |
Estimated probabilities of item correct evaluated at |
rprobs |
Item response probabilities like in |
person |
Person parameter estimates: mode ( |
pid |
Person identifier |
ability.est.pattern |
Response pattern estimates |
f.qk.yi |
Individual posterior distribution |
f.yi.qk |
Individual likelihood |
fixed.a |
Estimated |
fixed.c |
Estimated |
G |
Number of groups |
alpha1 |
Estimated |
alpha2 |
Estimated |
se.b |
Standard error of |
se.a |
Standard error of |
se.c |
Standard error of |
se.d |
Standard error of |
se.alpha |
Standard error of |
se.K |
Standard error of |
iter |
Number of iterations |
reliability |
EAP reliability |
irtmodel |
Type of estimated item response model |
D |
Number of dimensions |
mu |
Mean vector (for multidimensional models) |
Sigma.cov |
Covariance matrix (for multidimensional models) |
theta.k |
Grid of discretized ability distributions |
trait.weights |
Fixed vector of probabilities for the ability distribution |
pi.k |
Trait distribution |
ic |
Information criteria |
esttype |
Estimation type: |
... |
Note
Multiple group estimation is not possible for Ramsay's quotient model and multidimensional models.
References
Holman, R., & Glas, C. A. (2005). Modelling non-ignorable missing-data mechanisms with item response theory models. British Journal of Mathematical and Statistical Psychology, 58(1), 1-17. doi:10.1348/000711005X47168
Loken, E., & Rulison, K. L. (2010). Estimation of a four-parameter item response theory model. British Journal of Mathematical and Statistical Psychology, 63(3), 509-525. doi:10.1348/000711009X474502
Mislevy, R. J., & Wu, P. K. (1996). Missing responses and IRT ability estimation: Omits, choice, time limits, and adaptive testing. ETS Research Report ETS RR-96-30. Princeton, ETS. doi:10.1002/j.2333-8504.1996.tb01708.x
Pohl, S., Graefe, L., & Rose, N. (2014). Dealing with omitted and not-reached items in competence tests: Evaluating approaches accounting for missing responses in item response theory models. Educational and Psychological Measurement, 74(3), 423-452. doi:10.1177/0013164413504926
Ramsay, J. O. (1989). A comparison of three simple test theory models. Psychometrika, 54, 487-499. doi:10.1007/BF02294631
Rossi, N., Wang, X., & Ramsay, J. O. (2002). Nonparametric item response function estimates with the EM algorithm. Journal of Educational and Behavioral Statistics, 27(3), 291-317. doi:10.3102/10769986027003291
Stukel, T. A. (1988). Generalized logistic models. Journal of the American Statistical Association, 83(402), 426-431. doi:10.1080/01621459.1988.10478613
van der Maas, H. J. L., Molenaar, D., Maris, G., Kievit, R. A., & Borsboom, D. (2011). Cognitive psychology meets psychometric theory: On the relation between process models for decision making and latent variable models for individual differences. Psychological Review, 118(2), 339-356. doi:10.1037/a0022749
See Also
Simulate the generalized logistic Rasch model with sim.raschtype
.
Simulate Ramsay's quotient model with sim.qm.ramsay
.
Simulate locally dependent item response data using sim.rasch.dep
.
For an assessment of global model fit see modelfit.sirt
.
See CDM::itemfit.sx2
for item fit
statistics.
Examples
#############################################################################
# EXAMPLE 1: Reading dataset
#############################################################################
library(CDM)
data(data.read)
dat <- data.read
I <- ncol(dat) # number of items
# Rasch model
mod1 <- sirt::rasch.mml2( dat )
summary(mod1)
plot( mod1 ) # plot all items
# title 'Rasch model', display curves from -3 to 3 only for items 1, 5 and 8
plot(mod1, main="Rasch model Items 1, 5 and 8", xlim=c(-3,3), items=c(1,5,8) )
# Rasch model with constraints on item difficulties
# set item parameters of A1 and C3 equal to -2
constraints <- data.frame( c("A1","C3"), c(-2,-2) )
mod1a <- sirt::rasch.mml2( dat, constraints=constraints)
summary(mod1a)
# estimate equal item parameters for 1st and 11th item
est.b <- 1:I
est.b[11] <- 1
mod1b <- sirt::rasch.mml2( dat, est.b=est.b )
summary(mod1b)
# estimate Rasch model with skew trait distribution
mod1c <- sirt::rasch.mml2( dat, distribution.trait="smooth3")
summary(mod1c)
# 2PL model
mod2 <- sirt::rasch.mml2( dat, est.a=1:I )
summary(mod2)
plot(mod2) # plot 2PL item response curves
# extract individual likelihood
llmod2 <- IRT.likelihood(mod2)
str(llmod2)
## Not run:
library(CDM)
# model comparisons
CDM::IRT.compareModels(mod1, mod1c, mod2 )
anova(mod1,mod2)
# assess model fit
smod1 <- IRT.modelfit(mod1)
smod2 <- IRT.modelfit(mod2)
IRT.compareModels(smod1, smod2)
# set some bounds for a and b parameters
mod2a <- sirt::rasch.mml2( dat, est.a=1:I, min.a=.7, max.a=2, min.b=-2 )
summary(mod2a)
# 3PL model
mod3 <- sirt::rasch.mml2( dat, est.a=1:I, est.c=1:I,
mmliter=400 # maximal 400 iterations
)
summary(mod3)
# 3PL model with fixed guessing parameters of .25 and equal slopes
mod4 <- sirt::rasch.mml2( dat, fixed.c=rep( .25, I ) )
summary(mod4)
# 3PL model with equal guessing parameters for all items
mod5 <- sirt::rasch.mml2( dat, est.c=rep(1, I ) )
summary(mod5)
# difficulty + guessing model
mod6 <- sirt::rasch.mml2( dat, est.c=1:I )
summary(mod6)
# 4PL model
mod7 <- sirt::rasch.mml2( dat, est.a=1:I, est.c=1:I, est.d=1:I,
min.d=.95, max.c=.25)
# set minimal d and maximal c parameter to .95 and .25
summary(mod7)
# 4PL model with prior distributions
mod7b <- sirt::rasch.mml2( dat, est.a=1:I, est.c=1:I, est.d=1:I, prior.a=c(1,2),
prior.c=c(5,17), prior.d=c(20,2) )
summary(mod7b)
# constrained 4PL model
# equal slope, guessing and slipping parameters
mod8 <- sirt::rasch.mml2( dat,est.c=rep(1,I), est.d=rep(1,I) )
summary(mod8)
# estimation of an item response model with a uniform theta distribution
theta.k <- seq( 0.01, .99, len=20 )
trait.weights <- rep( 1/length(theta.k), length(theta.k) )
mod9 <- sirt::rasch.mml2( dat, theta.k=theta.k, trait.weights=trait.weights,
normal.trait=FALSE, est.a=1:12 )
summary(mod9)
#############################################################################
# EXAMPLE 2: Longitudinal data
#############################################################################
data(data.long)
dat <- data.long[,-1]
# define Q loading matrix
Qmatrix <- matrix( 0, 12, 2 )
Qmatrix[1:6,1] <- 1 # T1 items
Qmatrix[7:12,2] <- 1 # T2 items
# define restrictions on item difficulties
est.b <- c(1,2,3,4,5,6, 3,4,5,6,7,8)
mu.fixed <- cbind(1,0)
# set first mean to 0 for identification reasons
# Model 1: 2-dimensional Rasch model
mod1 <- sirt::rasch.mml2( dat, Qmatrix=Qmatrix, mitermax=4,
est.b=est.b, mu.fixed=mu.fixed, mmliter=30 )
summary(mod1)
plot(mod1)
## Plot function is only applicable for unidimensional models
## End(Not run)
#############################################################################
# EXAMPLE 3: One group, estimation of alpha parameter in the generalized logistic model
#############################################################################
# simulate theta values
set.seed(786)
N <- 1000 # number of persons
theta <- stats::rnorm( N, sd=1.5 ) # N persons with SD 1.5
b <- seq( -2, 2, len=15)
# simulate data
dat <- sirt::sim.raschtype( theta=theta, b=b, alpha1=0, alpha2=-0.3 )
# estimating alpha parameters
mod1 <- sirt::rasch.mml2( dat, est.alpha=TRUE, mmliter=30 )
summary(mod1)
plot(mod1)
## Not run:
# fixed alpha parameters
mod1b <- sirt::rasch.mml2( dat, est.alpha=FALSE, alpha1=0, alpha2=-.3 )
summary(mod1b)
# estimation with equal alpha parameters
mod1c <- sirt::rasch.mml2( dat, est.alpha=TRUE, equal.alpha=TRUE )
summary(mod1c)
# Ramsay QM
mod2a <- sirt::rasch.mml2( dat, irtmodel="ramsay.qm" )
summary(mod2a)
## End(Not run)
# Ramsay QM with estimated K parameters
mod2b <- sirt::rasch.mml2( dat, irtmodel="ramsay.qm", est.K=1:15, mmliter=30)
summary(mod2b)
plot(mod2b)
## Not run:
# nonparametric estimation of monotone item response curves
mod3a <- sirt::rasch.mml2( dat, irtmodel="npirt", mmliter=100,
theta.k=seq( -3, 3, len=10) ) # evaluations at 10 theta grid points
# nonparametric ICC of first 4 items
round( t(mod3a$pjk)[1:4,], 3 )
summary(mod3a)
plot(mod3a)
# nonparametric IRT estimation without monotonicity assumption
mod3b <- sirt::rasch.mml2( dat, irtmodel="npirt", mmliter=10,
theta.k=seq( -3, 3, len=10), npirt.monotone=FALSE)
plot(mod3b)
# B-Spline estimation of ICCs
library(splines)
mod3c <- sirt::rasch.mml2( dat, irtmodel="npirt",
npformula="y~bs(theta,df=3)", theta.k=seq(-3,3,len=15) )
summary(mod3c)
round( t(mod3c$pjk)[1:6,], 3 )
plot(mod3c)
# estimation of quadratic item response functions: ~ theta + I( theta^2)
mod3d <- sirt::rasch.mml2( dat, irtmodel="npirt",
npformula="y~theta + I(theta^2)" )
summary(mod3d)
plot(mod3d)
# estimation of a stepwise ICC function
# ICCs are constant on the theta domains: [-Inf,-1], [-1,1], [1,Inf]
mod3e <- sirt::rasch.mml2( dat, irtmodel="npirt",
npformula="y~I(theta>-1 )+I(theta>1)" )
summary(mod3e)
plot(mod3e, xlim=c(-2.5,2.5) )
# 2PL model
mod4 <- sirt::rasch.mml2( dat, est.a=1:15)
summary(mod4)
#############################################################################
# EXAMPLE 4: Two groups, estimation of generalized logistic model
#############################################################################
# simulate generalized logistic Rasch model in two groups
set.seed(8765)
N1 <- 1000 # N1=1000 persons in group 1
N2 <- 500 # N2=500 persons in group 2
dat1 <- sirt::sim.raschtype( theta=stats::rnorm( N1, sd=1.5 ), b=b,
alpha1=-0.3, alpha2=0)
dat2 <- sirt::sim.raschtype( theta=stats::rnorm( N2, mean=-.5, sd=.75),
b=b, alpha1=-0.3, alpha2=0)
dat1 <- rbind( dat1, dat2 )
group <- c( rep(1,N1), rep(2,N2))
mod1 <- sirt::rasch.mml2( dat1, parm.conv=.0001, group=group, est.alpha=TRUE )
summary(mod1)
#############################################################################
# EXAMPLE 5: Multidimensional model
#############################################################################
#***
# (1) simulate data
set.seed(785)
library(mvtnorm)
N <- 500
theta <- mvtnorm::rmvnorm( N,mean=c(0,0), sigma=matrix( c(1.45,.5,.5,1.7), 2, 2 ))
I <- 10
# 10 items load on the first dimension
p1 <- stats::plogis( outer( theta[,1], seq( -2, 2, len=I ), "-" ) )
resp1 <- 1 * ( p1 > matrix( stats::runif( N*I ), nrow=N, ncol=I ) )
# 10 items load on the second dimension
p1 <- stats::plogis( outer( theta[,2], seq( -2, 2, len=I ), "-" ) )
resp2 <- 1 * ( p1 > matrix( stats::runif( N*I ), nrow=N, ncol=I ) )
#Combine the two sets of items into one response matrix
resp <- cbind(resp1,resp2)
colnames(resp) <- paste("I", 1:(2*I), sep="")
dat <- resp
# define Q-matrix
Qmatrix <- matrix( 0, 2*I, 2 )
Qmatrix[1:I,1] <- 1
Qmatrix[1:I+I,2] <- 1
#***
# (2) estimation of models
# 2-dimensional Rasch model
mod1 <- sirt::rasch.mml2( dat, Qmatrix=Qmatrix )
summary(mod1)
# 2-dimensional 2PL model
mod2 <- sirt::rasch.mml2( dat, Qmatrix=Qmatrix, est.a=1:(2*I) )
summary(mod2)
# estimation with some fixed variances and covariances
# set variance of 1st dimension to 1 and
# covariance to zero
variance.fixed <- matrix( cbind(c(1,1), c(1,2), c(1,0)),
byrow=FALSE, ncol=3 )
mod3 <- sirt::rasch.mml2( dat, Qmatrix=Qmatrix, variance.fixed=variance.fixed )
summary(mod3)
# constraints on item difficulties
# useful for example in longitudinal linking
est.b <- c( 1:I, 1:I )
# equal indices correspond to equally estimated item parameters
mu.fixed <- cbind( 1, 0 )
mod4 <- sirt::rasch.mml2( dat, Qmatrix=Qmatrix, est.b=est.b, mu.fixed=mu.fixed )
summary(mod4)
#############################################################################
# EXAMPLE 6: Two booklets with same items but with item context effects.
# Therefore, item slopes and item difficulties are assumed to be shifted in the
# second design group.
#############################################################################
#***
# simulate data
set.seed(987)
I <- 10 # number of items
# define person design groups 1 and 2
n1 <- 700
n2 <- 1500
# item difficulties group 1
b1 <- seq(-1.5,1.5,length=I)
# item slopes group 1
a1 <- rep(1, I)
# simulate data group 1
dat1 <- sirt::sim.raschtype( stats::rnorm(n1), b=b1, fixed.a=a1 )
colnames(dat1) <- paste0("I", 1:I, "des1" )
# group 2
b2 <- b1 - .15
a2 <- 1.1*a1
# Item parameters are slightly transformed in the second group
# compared to the first group. This indicates possible item context effects.
# simulate data group 2
dat2 <- sirt::sim.raschtype( stats::rnorm(n2), b=b2, fixed.a=a2 )
colnames(dat2) <- paste0("I", 1:I, "des2" )
# define joint dataset
dat <- matrix( NA, nrow=n1+n2, ncol=2*I)
colnames(dat) <- c( colnames(dat1), colnames(dat2) )
dat[ 1:n1, 1:I ] <- dat1
dat[ n1 + 1:n2, I + 1:I ] <- dat2
# define group identifier
group <- c( rep(1,n1), rep(2,n2) )
#***
# Model 1: Rasch model two groups
itemindex <- rep( 1:I, 2 )
mod1 <- sirt::rasch.mml2( dat, group=group, est.b=itemindex )
summary(mod1)
#***
# Model 2: two item slope groups and designmatrix for intercepts
designmatrix <- matrix( 0, 2*I, I+1)
designmatrix[ ( 1:I )+ I,1:I] <- designmatrix[1:I,1:I] <- diag(I)
designmatrix[ ( 1:I )+ I,I+1] <- 1
mod2 <- sirt::rasch.mml2( dat, est.a=rep(1:2,each=I), designmatrix=designmatrix )
summary(mod2)
#############################################################################
# EXAMPLE 7: PIRLS dataset with missing responses
#############################################################################
data(data.pirlsmissing)
items <- grep( "R31", colnames(data.pirlsmissing), value=TRUE )
I <- length(items)
dat <- data.pirlsmissing
#****
# Model 1: recode missing responses as missing (missing are ignorable)
# data recoding
dat1 <- dat
dat1[ dat1==9 ] <- NA
# estimate Rasch model
mod1 <- sirt::rasch.mml2( dat1[,items], weights=dat$studwgt, group=dat$country )
summary(mod1)
## Mean=0 0.341 -0.134 0.219
## SD=1.142 1.166 1.197 0.959
#****
# Model 2: recode missing responses as wrong
# data recoding
dat2 <- dat
dat2[ dat2==9 ] <- 0
# estimate Rasch model
mod2 <- sirt::rasch.mml2( dat2[,items], weights=dat$studwgt, group=dat$country )
summary(mod2)
## Mean=0 0.413 -0.172 0.446
## SD=1.199 1.263 1.32 0.996
#****
# Model 3: recode missing responses as rho * P_i( theta ) and
# apply pseudo-log-likelihood estimation
# Missing item responses are predicted by the model implied probability
# P_i( theta ) where theta is the ability estimate when ignoring missings (Model 1)
# and rho is an adjustment parameter. rho=0 is equivalent to Model 2 (treating
# missing as wrong) and rho=1 is equivalent to Model 1 (treating missing as ignorable).
# data recoding
dat3 <- dat
# simulate theta estimate from posterior distribution
theta <- stats::rnorm( nrow(dat3), mean=mod1$person$EAP, sd=mod1$person$SE.EAP )
rho <- .3 # define a rho parameter value of .3
for (ii in items){
ind <- which( dat[,ii]==9 )
dat3[ind,ii] <- rho*stats::plogis( theta[ind] - mod1$item$b[ which( items==ii ) ] )
}
# estimate Rasch model
mod3 <- sirt::rasch.mml2( dat3[,items], weights=dat$studwgt, group=dat$country )
summary(mod3)
## Mean=0 0.392 -0.153 0.38
## SD=1.154 1.209 1.246 0.973
#****
# Model 4: simulate missing responses as rho * P_i( theta )
# The definition is the same as in Model 3. But it is now assumed
# that the missing responses are 'latent responses'.
set.seed(789)
# data recoding
dat4 <- dat
# simulate theta estimate from posterior distribution
theta <- stats::rnorm( nrow(dat4), mean=mod1$person$EAP, sd=mod1$person$SE.EAP )
rho <- .3 # define a rho parameter value of .3
for (ii in items){
ind <- which( dat[,ii]==9 )
p3 <- rho*stats::plogis( theta[ind] - mod1$item$b[ which( items==ii ) ] )
dat4[ ind, ii ] <- 1*( stats::runif( length(ind), 0, 1 ) < p3)
}
# estimate Rasch model
mod4 <- sirt::rasch.mml2( dat4[,items], weights=dat$studwgt, group=dat$country )
summary(mod4)
## Mean=0 0.396 -0.156 0.382
## SD=1.16 1.216 1.253 0.979
#****
# Model 5: recode missing responses for multiple choice items with four alternatives
# to 1/4 and apply pseudo-log-likelihood estimation.
# Missings for constructed response items are treated as incorrect.
# data recoding
dat5 <- dat
items_mc <- items[ substring( items, 7,7)=="M" ]
items_cr <- items[ substring( items, 7,7)=="C" ]
for (ii in items_mc){
ind <- which( dat[,ii]==9 )
dat5[ind,ii] <- 1/4
}
for (ii in items_cr){
ind <- which( dat[,ii]==9 )
dat5[ind,ii] <- 0
}
# estimate Rasch model
mod5 <- sirt::rasch.mml2( dat5[,items], weights=dat$studwgt, group=dat$country )
summary(mod5)
## Mean=0 0.411 -0.165 0.435
## SD=1.19 1.245 1.293 0.995
#*** For the following analyses, we ignore sample weights and the
# country grouping.
data(data.pirlsmissing)
items <- grep( "R31", colnames(data.pirlsmissing), value=TRUE )
dat <- data.pirlsmissing
dat1 <- dat
dat1[ dat1==9 ] <- 0
#*** Model 6: estimate item difficulties assuming incorrect missing data treatment
mod6 <- sirt::rasch.mml2( dat1[,items], mmliter=50 )
summary(mod6)
#*** Model 7: reestimate model with constrained item difficulties
I <- length(items)
constraints <- cbind( 1:I, mod6$item$b )
mod7 <- sirt::rasch.mml2( dat1[,items], constraints=constraints)
summary(mod7)
#*** Model 8: score all missing responses as missing items
dat2 <- dat[,items]
dat2[ dat2==9 ] <- NA
mod8 <- sirt::rasch.mml2( dat2, constraints=constraints, mu.fixed=NULL )
summary(mod8)
#*** Model 9: estimate missing data model 'missing1' assuming a missingness
# parameter delta.miss of zero
dat2 <- dat[,items] # note that missing item responses must be defined by 9
mod9 <- sirt::rasch.mml2( dat2, constraints=constraints, irtmodel="missing1",
theta.k=seq(-5,5,len=10), delta.miss=0, mitermax=4, mu.fixed=NULL )
summary(mod9)
#*** Model 10: estimate missing data model with a large negative missing delta parameter
#=> This model is equivalent to treating missing responses as wrong
mod10 <- sirt::rasch.mml2( dat2, constraints=constraints, irtmodel="missing1",
theta.k=seq(-5, 5, len=10), delta.miss=-10, mitermax=4, mmliter=200,
mu.fixed=NULL )
summary(mod10)
#*** Model 11: choose a missingness delta parameter of -1
mod11 <- sirt::rasch.mml2( dat2, constraints=constraints, irtmodel="missing1",
theta.k=seq(-5, 5, len=10), delta.miss=-1, mitermax=4,
mmliter=200, mu.fixed=NULL )
summary(mod11)
#*** Model 12: estimate joint delta parameter
mod12 <- sirt::rasch.mml2( dat2, irtmodel="missing1", mu.fixed=cbind( c(1,2), 0 ),
theta.k=seq(-8, 8, len=10), delta.miss=0, mitermax=4,
mmliter=30, est.delta=rep(1,I) )
summary(mod12)
#*** Model 13: estimate delta parameter in item groups defined by item format
est.delta <- 1 + 1 * ( substring( colnames(dat2),7,7 )=="M" )
mod13 <- sirt::rasch.mml2( dat2, irtmodel="missing1", mu.fixed=cbind( c(1,2), 0 ),
theta.k=seq(-8, 8, len=10), delta.miss=0, mitermax=4,
mmliter=30, est.delta=est.delta )
summary(mod13)
#*** Model 14: estimate item specific delta parameter
mod14 <- sirt::rasch.mml2( dat2, irtmodel="missing1", mu.fixed=cbind( c(1,2), 0 ),
theta.k=seq(-8, 8, len=10), delta.miss=0, mitermax=4,
mmliter=30, est.delta=1:I )
summary(mod14)
#############################################################################
# EXAMPLE 8: Comparison of different models for polytomous data
#############################################################################
data(data.Students, package="CDM")
head(data.Students)
dat <- data.Students[, paste0("act",1:5) ]
I <- ncol(dat)
#**************************************************
#*** Model 1: Partial Credit Model (PCM)
#*** Model 1a: PCM in TAM
mod1a <- TAM::tam.mml( dat )
summary(mod1a)
#*** Model 1b: PCM in sirt
mod1b <- sirt::rm.facets( dat )
summary(mod1b)
#*** Model 1c: PCM in mirt
mod1c <- mirt::mirt( dat, 1, itemtype=rep("Rasch",I), verbose=TRUE )
print(mod1c)
#**************************************************
#*** Model 2: Sequential Model (SM): Equal Loadings
#*** Model 2a: SM in sirt
dat1 <- CDM::sequential.items(dat)
resp <- dat1$dat.expand
iteminfo <- dat1$iteminfo
# fit model
mod2a <- sirt::rasch.mml2( resp )
summary(mod2a)
#**************************************************
#*** Model 3: Sequential Model (SM): Different Loadings
#*** Model 3a: SM in sirt
mod3a <- sirt::rasch.mml2( resp, est.a=iteminfo$itemindex )
summary(mod3a)
#**************************************************
#*** Model 4: Generalized partial credit model (GPCM)
#*** Model 4a: GPCM in TAM
mod4a <- TAM::tam.mml.2pl( dat, irtmodel="GPCM")
summary(mod4a)
#**************************************************
#*** Model 5: Graded response model (GRM)
#*** Model 5a: GRM in mirt
mod5a <- mirt::mirt( dat, 1, itemtype=rep("graded",I), verbose=TRUE)
print(mod5a)
# model comparison
logLik(mod1a);logLik(mod1b);mod1c@logLik # PCM
logLik(mod2a) # SM (Rasch)
logLik(mod3a) # SM (GPCM)
logLik(mod4a) # GPCM
mod5a@logLik # GRM
## End(Not run)
Pairwise Estimation Method of the Rasch Model
Description
This function estimates the Rasch model with a minimum chi-square estimation method (see Fischer, 2007, p. 544), which is a pairwise conditional likelihood estimation approach.
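The pairwise logic can be sketched in a few lines of plain R (an illustration of the underlying log-odds identity, not the algorithm implemented in this function): under the Rasch model, the log odds of the two discordant response patterns of an item pair estimate the difficulty difference.

set.seed(1)
theta <- stats::rnorm(2000)
b <- c( -0.5, 0.4 )    # two illustrative item difficulties
dat <- sapply( b, function(bb){ 1*( stats::runif(2000) < stats::plogis( theta - bb ) ) } )
n12 <- sum( dat[,1]==1 & dat[,2]==0 )   # item 1 solved, item 2 failed
n21 <- sum( dat[,1]==0 & dat[,2]==1 )   # item 2 solved, item 1 failed
log( n21 / n12 )    # approximates b[1] - b[2]=-0.9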
Usage
rasch.pairwise(dat, weights=NULL, conv=1e-04, maxiter=3000, progress=TRUE,
b.init=NULL, zerosum=FALSE, power=1, direct_optim=TRUE)
## S3 method for class 'rasch.pairwise'
summary(object, digits=3, file=NULL, ...)
Arguments
dat |
An |
weights |
Optional vector of sampling weights |
conv |
Convergence criterion |
maxiter |
Maximum number of iterations |
progress |
Display iteration progress? |
b.init |
An optional vector of length |
zerosum |
Optional logical indicating whether item difficulties should be centered in each iteration. The default is that no centering is conducted. |
power |
Power used for computing pairwise response probabilities, as in the row averaging approach |
direct_optim |
Logical indicating whether least squares criterion
function should be minimized with |
object |
Object of class |
digits |
Number of digits after decimal for rounding |
file |
Optional file name for summary output |
... |
Further arguments to be passed |
Value
An object of class rasch.pairwise
with following entries
b |
Item difficulties |
eps |
Exponentiated item difficulties, i.e. |
iter |
Number of iterations |
conv |
Convergence criterion |
dat |
Original data frame |
freq.ij |
Frequency table of all item pairs |
item |
Summary table of item parameters |
References
Fischer, G. H. (2007). Rasch models. In C. R. Rao and S. Sinharay (Eds.), Handbook of Statistics, Vol. 26 (pp. 515-585). Amsterdam: Elsevier.
See Also
See summary.rasch.pairwise
for a summary.
A slightly different implementation of this conditional pairwise method
is implemented in
rasch.pairwise.itemcluster
.
Pairwise marginal likelihood estimation (also labeled as pseudolikelihood
estimation) can be conducted with rasch.pml3
.
Examples
#############################################################################
# EXAMPLE 1: Reading data set | pairwise estimation Rasch model
#############################################################################
data(data.read)
dat <- data.read
#*** Model 1: no constraint on item difficulties
mod1 <- sirt::rasch.pairwise(dat)
summary(mod1)
#*** Model 2: sum constraint on item difficulties
mod2 <- sirt::rasch.pairwise(dat, zerosum=TRUE)
summary(mod2)
## Not run:
#** obtain standard errors by bootstrap
mod2$item$b # extract item difficulties
# Bootstrap of item difficulties
boot_pw <- function(data, indices ){
dd <- data[ indices, ] # bootstrap of indices
mod <- sirt::rasch.pairwise( dat=dd, zerosum=TRUE, progress=FALSE)
return(mod$item$b)
}
set.seed(986)
library(boot)
bmod2 <- boot::boot(data=dat, statistic=boot_pw, R=999 )
print(bmod2)
summary(bmod2)
# quantiles for bootstrap sample (and confidence interval)
apply(bmod2$t, 2, stats::quantile, probs=c(.025, .5, .975) )
## End(Not run)
Pairwise Estimation of the Rasch Model for Locally Dependent Items
Description
This function uses pairwise conditional likelihood estimation for estimating item parameters in the Rasch model.
Usage
rasch.pairwise.itemcluster(dat, itemcluster=NULL, b.fixed=NULL, weights=NULL,
conv=1e-05, maxiter=3000, progress=TRUE, b.init=NULL, zerosum=FALSE)
Arguments
dat |
An |
itemcluster |
Optional integer vector of itemcluster (see Examples). Different integers correspond to different item clusters. No item cluster is set as default. |
b.fixed |
Matrix for fixing item parameters. The first column contains the item (number or name), the second column the fixed parameter value. |
weights |
Optional vector of sampling weights |
conv |
Convergence criterion in maximal absolute parameter change |
maxiter |
Maximal number of iterations |
progress |
A logical which displays progress. Default is |
b.init |
Vector of initial item difficulty estimates. Default is |
zerosum |
Optional logical indicating whether item difficulties should be centered in each iteration. The default is that no centering is conducted. |
Details
This is an adaptation of the algorithm of van der Linden and Eggen (1986). Only item pairs from different item clusters are taken into account for item difficulty estimation. Therefore, the problem of locally dependent items within each itemcluster is (almost) eliminated (see Examples below) because contributions of local dependencies do not appear in the pairwise likelihood terms. In detail, the estimation rests on observed frequency tables of items i and j and therefore on the conditional probabilities

\frac{P(X_i=x, X_j=y)}{P(X_i + X_j=1 )} \quad \mbox{with} \quad x,y=0,1 \quad \mbox{and} \quad x+y=1

If for some item pair (i,j) a higher positive (or negative) correlation is expected (i.e., a deviation from local independence), then this pair is removed from the estimation. Clearly, there is a loss in precision, but item parameters can be less biased.
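Which item pairs enter the pairwise likelihood can be sketched as follows (illustrative plain R; the function performs this selection internally):

# itemcluster as in Example 1 below; 0 denotes unclustered items
itemcluster <- c( 1,1,1,1, 0, 2,2,2,2, 3,3,3 )
pairs <- t( utils::combn( length(itemcluster), 2 ) )
# keep pairs with at least one unclustered item or items from different clusters
keep <- ( itemcluster[ pairs[,1] ]==0 ) | ( itemcluster[ pairs[,2] ]==0 ) |
            ( itemcluster[ pairs[,1] ] !=itemcluster[ pairs[,2] ] )
sum( keep ) / nrow( pairs )   # share of item pairs retained (51 of 66)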
Value
Object of class rasch.pairwise
with elements
b |
Vector of item difficulties |
item |
Data frame of item parameters ( |
Note
No standard errors are provided by this function. Use resampling methods for conducting statistical inference.
Formulas for asymptotic standard errors of this pairwise estimation method are described in Zwinderman (1995).
References
van der Linden, W. J., & Eggen, T. J. H. M. (1986). An empirical Bayes approach to item banking. Research Report 86-6, University of Twente.
Zwinderman, A. H. (1995). Pairwise parameter estimation in Rasch models. Applied Psychological Measurement, 19, 369-375.
See Also
rasch.pairwise
, summary.rasch.pairwise
,
Pairwise marginal likelihood estimation (also labeled as pseudolikelihood
estimation) can be conducted with rasch.pml3
.
Other estimation methods are implemented in rasch.copula2
or
rasch.mml2
.
For simulation of locally dependent data see sim.rasch.dep
.
Examples
#############################################################################
# EXAMPLE 1: Example with locally dependent items
# 12 Items: Cluster 1 -> Items 1,...,4
# Cluster 2 -> Items 6,...,9
# Cluster 3 -> Items 10,11,12
#############################################################################
set.seed(7896)
I <- 12 # number of items
n <- 5000 # number of persons
b <- seq(-2,2, len=I) # item difficulties
bsamp <- b <- sample(b) # sample item difficulties
theta <- stats::rnorm( n, sd=1 ) # person abilities
# itemcluster
itemcluster <- rep(0,I)
itemcluster[ 1:4 ] <- 1
itemcluster[ 6:9 ] <- 2
itemcluster[ 10:12 ] <- 3
# residual correlations
rho <- c( .55, .25, .45 )
# simulate data
dat <- sirt::sim.rasch.dep( theta, b, itemcluster, rho )
colnames(dat) <- paste("I", seq(1,ncol(dat)), sep="")
# estimation with pairwise Rasch model
mod3 <- sirt::rasch.pairwise( dat )
summary(mod3)
# use item cluster in rasch pairwise estimation
mod <- sirt::rasch.pairwise.itemcluster( dat=dat, itemcluster=itemcluster )
summary(mod)
## Not run:
# Rasch MML estimation
mod4 <- sirt::rasch.mml2( dat )
summary(mod4)
# Rasch Copula estimation
mod5 <- sirt::rasch.copula2( dat, itemcluster=itemcluster )
summary(mod5)
# compare different item parameter estimates
M1 <- cbind( "true.b"=bsamp, "b.rasch"=mod4$item$b, "b.rasch.copula"=mod5$item$thresh,
"b.rasch.pairwise"=mod3$b, "b.rasch.pairwise.cluster"=mod$b )
# center item difficulties
M1 <- scale( M1, scale=FALSE )
round( M1, 3 )
round( apply( M1, 2, stats::sd ), 3 )
# Below the output of the example is presented.
# The rasch.pairwise.itemcluster is pretty close to the estimate in the Rasch copula model.
## > round( M1, 3 )
## true.b b.rasch b.rasch.copula b.rasch.pairwise b.rasch.pairwise.cluster
## I1 0.545 0.561 0.526 0.628 0.524
## I2 -0.182 -0.168 -0.174 -0.121 -0.156
## I3 -0.909 -0.957 -0.867 -0.971 -0.899
## I4 -1.636 -1.726 -1.625 -1.765 -1.611
## I5 1.636 1.751 1.648 1.694 1.649
## I6 0.909 0.892 0.836 0.898 0.827
## I7 -2.000 -2.134 -2.020 -2.051 -2.000
## I8 -1.273 -1.355 -1.252 -1.303 -1.271
## I9 -0.545 -0.637 -0.589 -0.581 -0.598
## I10 1.273 1.378 1.252 1.308 1.276
## I11 0.182 0.241 0.226 0.109 0.232
## I12 2.000 2.155 2.039 2.154 2.026
## > round( apply( M1, 2, sd ), 3 )
## true.b b.rasch b.rasch.copula
## 1.311 1.398 1.310
## b.rasch.pairwise b.rasch.pairwise.cluster
## 1.373 1.310
# set item parameters of first item to 0 and of second item to -0.7
b.fixed <- cbind( c(1,2), c(0,-.7) )
mod5 <- sirt::rasch.pairwise.itemcluster( dat=dat, b.fixed=b.fixed,
itemcluster=itemcluster )
# difference between estimations 'mod' and 'mod5'
dfr <- cbind( mod5$item$b, mod$item$b )
plot( mod5$item$b, mod$item$b, pch=16)
apply( dfr, 1, diff)
## End(Not run)
Pairwise Marginal Likelihood Estimation for the Probit Rasch Model
Description
This function estimates unidimensional 1PL and 2PL models with the probit link using pairwise marginal maximum likelihood estimation (PMML; Renard, Molenberghs & Geys, 2004). Item pairs within an itemcluster can be excluded from the pairwise likelihood (argument itemcluster). Alternatively, a residual error structure with itemclusters can be modeled (argument error.corr).
Usage
rasch.pml3(dat, est.b=seq(1, ncol(dat)), est.a=rep(0,ncol(dat)),
est.sigma=TRUE, itemcluster=NULL, weight=rep(1, nrow(dat)), numdiff.parm=0.001,
b.init=NULL, a.init=NULL, sigma.init=NULL, error.corr=0*diag( 1, ncol(dat) ),
err.constraintM=NULL, err.constraintV=NULL, glob.conv=10^(-6), conv1=10^(-4),
pmliter=300, progress=TRUE, use.maxincrement=TRUE )
## S3 method for class 'rasch.pml'
summary(object,...)
Arguments
dat |
An |
est.b |
Vector of integers of length |
est.a |
Vector of integers of length |
est.sigma |
Should sigma (the trait standard deviation) be estimated?
The default is |
itemcluster |
Optional vector of length |
weight |
Optional vector of person weights |
numdiff.parm |
Step parameter for numerical differentiation |
b.init |
Initial or fixed item difficulty |
a.init |
Initial or fixed item slopes |
sigma.init |
Initial or fixed trait standard deviation |
error.corr |
An optional |
err.constraintM |
An optional |
err.constraintV |
An optional |
glob.conv |
Global convergence criterion |
conv1 |
Convergence criterion for model parameters |
pmliter |
Maximum number of iterations |
progress |
Display progress? |
use.maxincrement |
Optional logical indicating whether increments of
slope parameters should be controlled in size across iterations.
The default is |
object |
Object of class |
... |
Further arguments to be passed |
Details
The probit item response model can be estimated with this function:

P( X_{pi}=1 | \theta_p )=\Phi( a_i \theta_p - b_i ) \quad, \quad \theta_p \sim N( 0, \sigma^2 )

where \Phi denotes the standard normal distribution function. This model can also be expressed as a latent variable model that assumes a latent response tendency X_{pi}^\ast, where the observed response X_{pi} equals 1 if X_{pi}^\ast > 0 and zero otherwise. If \epsilon_{pi} is standard normally distributed, then

X_{pi}^{\ast}=a_i \theta_p - b_i + \epsilon_{pi}

An arbitrary pattern of residual correlations between \epsilon_{pi} and \epsilon_{pj} for item pairs i and j can be imposed using the error.corr argument. Linear constraints Me=v on the residual correlations e=( Cov( \epsilon_{pi}, \epsilon_{pj} ) )_{ij} (in vectorized form) can be specified using the arguments err.constraintM (matrix M) and err.constraintV (vector v). The estimation is described in Neuhaus (1996).
For the pseudolikelihood information criterion (PLIC) see Stanford and Raftery (2002).
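As a small worked illustration of such a constraint: with three items, the vectorized residual correlations are e=( e_{12}, e_{13}, e_{23} ). Choosing M=(1,1,1) and v=0 in Me=v imposes e_{12} + e_{13} + e_{23}=0, i.e. the residual correlations sum to zero (compare Models 5 and 6 in the Examples).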
Value
A list with following entries:
item |
Data frame with estimated item parameters |
iter |
Number of iterations |
deviance |
Pseudolikelihood multiplied by minus 2 |
b |
Estimated item difficulties |
sigma |
Estimated standard deviation |
dat |
Original dataset |
ic |
Data frame with information criteria (sample size,
number of estimated parameters, pseudolikelihood
information criterion |
link |
Used link function (only probit is permitted) |
itempairs |
Estimated statistics of item pairs |
error.corr |
Estimated error correlation matrix |
eps.corr |
Vectorized error correlation matrix |
omega.rel |
Reliability of the sum score according to Green and Yang (2009). If some item pairs are excluded in the estimation, the residual correlation for these item pairs is assumed to be zero. |
... |
Note
This function needs the combinat package.
References
Green, S. B., & Yang, Y. (2009). Reliability of summed item scores using structural equation modeling: An alternative to coefficient alpha. Psychometrika, 74, 155-167.
Neuhaus, W. (1996). Optimal estimation under linear constraints. Astin Bulletin, 26, 233-245.
Renard, D., Molenberghs, G., & Geys, H. (2004). A pairwise likelihood approach to estimation in multilevel probit models. Computational Statistics & Data Analysis, 44, 649-667.
Stanford, D. C., & Raftery, A. E. (2002). Approximate Bayes factors for image segmentation: The pseudolikelihood information criterion (PLIC). IEEE Transactions on Pattern Analysis and Machine Intelligence, 24, 1517-1520.
See Also
Get a summary of rasch.pml3 with summary.rasch.pml. For simulation of locally dependent items see sim.rasch.dep. For pairwise conditional likelihood estimation see rasch.pairwise or rasch.pairwise.itemcluster. For an assessment of global model fit see modelfit.sirt.
Examples
#############################################################################
# EXAMPLE 1: Reading data set
#############################################################################
data(data.read)
dat <- data.read
#******
# Model 1: Rasch model with PML estimation
mod1 <- sirt::rasch.pml3( dat )
summary(mod1)
#******
# Model 2: Excluding item pairs with local dependence
# from bivariate composite likelihood
itemcluster <- rep( 1:3, each=4)
mod2 <- sirt::rasch.pml3( dat, itemcluster=itemcluster )
summary(mod2)
## Not run:
#*****
# Model 3: Modelling error correlations:
# joint residual correlations for each itemcluster
error.corr <- diag(1,ncol(dat))
for ( ii in 1:3){
ind.ii <- which( itemcluster==ii )
error.corr[ ind.ii, ind.ii ] <- ii
}
# estimate the model with error correlations
mod3 <- sirt::rasch.pml3( dat, error.corr=error.corr )
summary(mod3)
#****
# Model 4: model separate residual correlations
I <- ncol(error.corr)
error.corr1 <- matrix( 1:(I*I), ncol=I )
error.corr <- error.corr1 * ( error.corr > 0 )
# estimate the model with error correlations
mod4 <- sirt::rasch.pml3( dat, error.corr=error.corr )
summary(mod4)
#****
# Model 5: assume equal item difficulties:
# b_1=b_7 and b_2=b_12
# fix item difficulty of the 6th item to .1
est.b <- 1:I
est.b[7] <- 1; est.b[12] <- 2 ; est.b[6] <- 0
b.init <- rep( 0, I ) ; b.init[6] <- .1
mod5 <- sirt::rasch.pml3( dat, est.b=est.b, b.init=b.init)
summary(mod5)
#****
# Model 6: estimate three item slope groups
est.a <- rep(1:3, each=4 )
mod6 <- sirt::rasch.pml3( dat, est.a=est.a, est.sigma=0)
summary(mod6)
#############################################################################
# EXAMPLE 2: PISA reading
#############################################################################
data(data.pisaRead)
dat <- data.pisaRead$data
# select items
dat <- dat[, substring(colnames(dat),1,1)=="R" ]
#******
# Model 1: Rasch model with PML estimation
mod1 <- sirt::rasch.pml3( as.matrix(dat) )
## Trait SD (Logit Link) : 1.419
#******
# Model 2: Model correlations within testlets
error.corr <- diag(1,ncol(dat))
testlets <- paste( data.pisaRead$item$testlet )
itemcluster <- match( testlets, unique(testlets ) )
for ( ii in 1:(length(unique(testlets))) ){
ind.ii <- which( itemcluster==ii )
error.corr[ ind.ii, ind.ii ] <- ii
}
# estimate the model with error correlations
mod2 <- sirt::rasch.pml3( dat, error.corr=error.corr )
## Trait SD (Logit Link) : 1.384
#****
# Model 3: model separate residual correlations
I <- ncol(error.corr)
error.corr1 <- matrix( 1:(I*I), ncol=I )
error.corr <- error.corr1 * ( error.corr > 0 )
# estimate the model with error correlations
mod3 <- sirt::rasch.pml3( dat, error.corr=error.corr )
## Trait SD (Logit Link) : 1.384
#############################################################################
# EXAMPLE 3: 10 locally independent items
#############################################################################
#**********
# simulate some data
set.seed(554)
N <- 500 # persons
I <- 10 # items
theta <- stats::rnorm(N,sd=1.3 ) # trait SD of 1.3
b <- seq(-2, 2, length=I) # item difficulties
# simulate data from the Rasch model
dat <- sirt::sim.raschtype( theta=theta, b=b )
# estimation with rasch.pml and probit link
mod1 <- sirt::rasch.pml3( dat )
summary(mod1)
# estimation with rasch.mml2 function
mod2 <- sirt::rasch.mml2( dat )
# estimate item parameters in five groups of two items each
est.b <- rep( 1:(I/2), each=2 )
mod3 <- sirt::rasch.pml3( dat, est.b=est.b )
summary(mod3)
# compare parameter estimates
summary(mod1)
summary(mod2)
summary(mod3)
#############################################################################
# EXAMPLE 4: 11 items and 2 item clusters with 2 and 3 items
#############################################################################
set.seed(5698)
I <- 11 # number of items
n <- 5000 # number of persons
b <- seq(-2,2, len=I) # item difficulties
theta <- stats::rnorm( n, sd=1 ) # person abilities
# itemcluster
itemcluster <- rep(0,I)
itemcluster[c(3,5)] <- 1
itemcluster[c(2,4,9)] <- 2
# residual correlations
rho <- c( .7, .5 )
# simulate data (under the logit link)
dat <- sirt::sim.rasch.dep( theta, b, itemcluster, rho )
colnames(dat) <- paste("I", seq(1,ncol(dat)), sep="")
#***
# Model 1: estimation using the Rasch model (with probit link)
mod1 <- sirt::rasch.pml3( dat )
#***
# Model 2: estimation when pairs of locally dependent items are eliminated
mod2 <- sirt::rasch.pml3( dat, itemcluster=itemcluster)
#***
# Model 3: Positive correlations within testlets
est.corrs <- diag( 1, I )
est.corrs[ c(3,5), c(3,5) ] <- 2
est.corrs[ c(2,4,9), c(2,4,9) ] <- 3
mod3 <- sirt::rasch.pml3( dat, error.corr=est.corrs )
#***
# Model 4: Negative correlations between testlets
est.corrs <- diag( 1, I )
est.corrs[ c(3,5), c(2,4,9) ] <- 2
est.corrs[ c(2,4,9), c(3,5) ] <- 2
mod4 <- sirt::rasch.pml3( dat, error.corr=est.corrs )
#***
# Model 5: sum constraint of zero within and between testlets
est.corrs <- matrix( 1:(I*I), I, I )
cluster2 <- c(2,4,9)
est.corrs[ setdiff( 1:I, c(cluster2)), ] <- 0
est.corrs[, setdiff( 1:I, c(cluster2)) ] <- 0
# define an error constraint matrix
itempairs0 <- mod4$itempairs
IP <- nrow(itempairs0)
err.constraint <- matrix( 0, IP, 1 )
err.constraint[ ( itempairs0$item1 %in% cluster2 )
& ( itempairs0$item2 %in% cluster2 ), 1 ] <- 1
# set sum of error covariances to 1.2
err.constraintV <- matrix(3*.4,1,1)
mod5 <- sirt::rasch.pml3( dat, error.corr=est.corrs,
err.constraintM=err.constraint, err.constraintV=err.constraintV)
#****
# Model 6: Constraint on sum of all correlations
est.corrs <- matrix( 1:(I*I), I, I )
# define an error constraint matrix
itempairs0 <- mod4$itempairs
IP <- nrow(itempairs0)
# define two side conditions
err.constraint <- matrix( 0, IP, 2 )
err.constraintV <- matrix( 0, 2, 1)
# sum of all correlations is zero
err.constraint[, 1 ] <- 1
err.constraintV[1,1] <- 0
# sum of items cluster c(1,2,3) is 0
cluster2 <- c(1,2,3)
err.constraint[ ( itempairs0$item1 %in% cluster2 )
& ( itempairs0$item2 %in% cluster2 ), 2 ] <- 1
err.constraintV[2,1] <- 0
mod6 <- sirt::rasch.pml3( dat, error.corr=est.corrs,
err.constraintM=err.constraint, err.constraintV=err.constraintV)
summary(mod6)
#############################################################################
# EXAMPLE 5: 10 Items: Cluster 1 -> Items 1,2
# Cluster 2 -> Items 3,4,5; Cluster 3 -> Items 7,8,9
#############################################################################
set.seed(7650)
I <- 10 # number of items
n <- 5000 # number of persons
b <- seq(-2,2, len=I) # item difficulties
bsamp <- b <- sample(b) # sample item difficulties
theta <- stats::rnorm( n, sd=1 ) # person abilities
# define itemcluster
itemcluster <- rep(0,I)
itemcluster[ 1:2 ] <- 1
itemcluster[ 3:5 ] <- 2
itemcluster[ 7:9 ] <- 3
# define residual correlations
rho <- c( .55, .35, .45)
# simulate data
dat <- sirt::sim.rasch.dep( theta, b, itemcluster, rho )
colnames(dat) <- paste("I", seq(1,ncol(dat)), sep="")
#***
# Model 1: residual correlation (equal within item clusters)
# define a matrix of integers for estimating error correlations
error.corr <- diag(1,ncol(dat))
for ( ii in 1:3){
ind.ii <- which( itemcluster==ii )
error.corr[ ind.ii, ind.ii ] <- ii
}
# estimate the model
mod1 <- sirt::rasch.pml3( dat, error.corr=error.corr )
#***
# Model 2: residual correlation (different within item clusters)
# define again a matrix of integers for estimating error correlations
error.corr <- diag(1,ncol(dat))
for ( ii in 1:3){
ind.ii <- which( itemcluster==ii )
error.corr[ ind.ii, ind.ii ] <- ii
}
I <- ncol(error.corr)
error.corr1 <- matrix( 1:(I*I), ncol=I )
error.corr <- error.corr1 * ( error.corr > 0 )
# estimate the model
mod2 <- sirt::rasch.pml3( dat, error.corr=error.corr )
#***
# Model 3: eliminate item pairs within itemclusters for PML estimation
mod3 <- sirt::rasch.pml3( dat, itemcluster=itemcluster )
#***
# Model 4: Rasch model ignoring dependency
mod4 <- sirt::rasch.pml3( dat )
# compare different models
summary(mod1)
summary(mod2)
summary(mod3)
summary(mod4)
## End(Not run)
PROX Estimation Method for the Rasch Model
Description
This function estimates the Rasch model using the PROX algorithm (cited in Wright & Stone, 1999).
Usage
rasch.prox(dat, dat.resp=1 - is.na(dat), freq=rep(1,nrow(dat)),
conv=0.001, maxiter=30, progress=FALSE)
Arguments
dat |
An |
dat.resp |
An |
freq |
A vector of frequencies (or weights) of all rows in data frame |
conv |
Convergence criterion for item parameters |
maxiter |
Maximum number of iterations |
progress |
Display progress? |
Value
A list with following entries
b |
Estimated item difficulties |
theta |
Estimated person abilities |
iter |
Number of iterations |
sigma.i |
Item standard deviations |
sigma.n |
Person standard deviations |
References
Wright, B. D., & Stone, M. H. (1999). Measurement Essentials. Wilmington: Wide Range.
Examples
#############################################################################
# EXAMPLE 1: PROX data.read
#############################################################################
data(data.read)
mod <- sirt::rasch.prox( data.read )
mod$b # item difficulties
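# A brief comparison sketch: PROX versus MML item difficulties
mod2 <- sirt::rasch.mml2( data.read )
cbind( prox=mod$b, mml=mod2$item$b )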
Estimation of the Rasch Model with Variational Approximation
Description
This function estimates the Rasch model by the estimation method of variational approximation (Rijmen & Vomlel, 2008).
Usage
rasch.va(dat, globconv=0.001, maxiter=1000)
Arguments
dat |
Data frame with dichotomous item responses |
globconv |
Convergence criterion for item parameters |
maxiter |
Maximal number of iterations |
Value
A list with following entries:
sig |
Standard deviation of the trait |
item |
Data frame with item parameters |
xsi.ij |
Data frame with variational parameters |
mu.i |
Vector with individual means |
sigma2.i |
Vector with individual variances |
References
Rijmen, F., & Vomlel, J. (2008). Assessing the performance of variational methods for mixed logistic regression models. Journal of Statistical Computation and Simulation, 78, 765-779.
Examples
#############################################################################
# EXAMPLE 1: Rasch model
#############################################################################
set.seed(8706)
N <- 5000
I <- 20
dat <- sirt::sim.raschtype( stats::rnorm(N,sd=1.3), b=seq(-2,2,len=I) )
# estimation via variational approximation
mod1 <- sirt::rasch.va(dat)
# estimation via marginal maximum likelihood
mod2 <- sirt::rasch.mml2(dat)
# estimation via joint maximum likelihood
mod3 <- sirt::rasch.jml(dat)
# compare sigma
round( c( mod1$sig, mod2$sd.trait ), 3 )
## [1] 1.222 1.314
# compare b
round( cbind( mod1$item$b, mod2$item$b, mod3$item$itemdiff), 3 )
## [,1] [,2] [,3]
## [1,] -1.898 -1.967 -2.090
## [2,] -1.776 -1.841 -1.954
## [3,] -1.561 -1.618 -1.715
## [4,] -1.326 -1.375 -1.455
## [5,] -1.121 -1.163 -1.228
Estimation of Reliability for Confirmatory Factor Analyses Based on Dichotomous Data
Description
This function estimates a model based reliability using confirmatory factor analysis (Green & Yang, 2009).
Usage
reliability.nonlinearSEM(facloadings, thresh, resid.cov=NULL, cor.factors=NULL)
Arguments
facloadings |
Matrix of factor loadings |
thresh |
Vector of thresholds |
resid.cov |
Matrix of residual covariances |
cor.factors |
Optional matrix of covariances (correlations) between factors. The default is a diagonal matrix with variances of 1. |
Value
A list. The reliability is the list element omega.rel
Note
This function needs the mvtnorm package.
References
Green, S. B., & Yang, Y. (2009). Reliability of summed item scores using structural equation modeling: An alternative to coefficient alpha. Psychometrika, 74, 155-167.
See Also
This function is used in greenyang.reliability.
Examples
#############################################################################
# EXAMPLE 1: Reading data set
#############################################################################
data(data.read)
dat <- data.read
I <- ncol(dat)
# define item clusters
itemcluster <- rep( 1:3, each=4)
error.corr <- diag(1,ncol(dat))
for ( ii in 1:3){
ind.ii <- which( itemcluster==ii )
error.corr[ ind.ii, ind.ii ] <- ii
}
# estimate the model with error correlations
mod1 <- sirt::rasch.pml3( dat, error.corr=error.corr)
summary(mod1)
# extract item parameters
thresh <- - matrix( mod1$item$a * mod1$item$b, I, 1 )
A <- matrix( mod1$item$a * mod1$item$sigma, I, 1 )
# extract estimated correlation matrix
corM <- mod1$eps.corrM
# compute standardized factor loadings
facA <- 1 / sqrt( A^2 + 1 )
resvar <- 1 - facA^2
covM <- outer( sqrt(resvar[,1]), sqrt(resvar[,1] ) ) * corM
facloadings <- A *facA
# estimate reliability
rel1 <- sirt::reliability.nonlinearSEM( facloadings=facloadings, thresh=thresh,
resid.cov=covM)
rel1$omega.rel
Creates Group-Wise Item Response Dataset
Description
Creates group-wise item response dataset.
Usage
resp_groupwise(resp, group, items_group)
Arguments
resp |
Dataset with item responses |
group |
Vector of group identifiers |
items_group |
List containing vectors of groups for each item which should be made group-specific |
Value
Dataset
Examples
## Not run:
#############################################################################
# EXAMPLE 1: Toy dataset
#############################################################################
library(CDM)
library(TAM)
data(data.ex11, package="TAM")
dat <- data.ex11
dat[ dat==9 ] <- 0
resp <- dat[,-1]
# group labels
booklets <- sort( unique(paste(dat$booklet)))
#- fit initial model
mod0 <- TAM::tam.mml( resp, group=dat$booklet)
summary(mod0)
# fit statistics
fmod <- CDM::IRT.RMSD(mod0)
stat <- abs(fmod$MD[,-1])
stat[ is.na( fmod$RMSD[,2:4] ) ] <- NA
thresh <- .01
round(stat,3)
# define a list of groups for group-specific items
items_group <- apply( stat, 1, FUN=function(ll){
v1 <- booklets[ which( ll > thresh ) ]
v1[ ! is.na(v1) ] } )
#- create extended response dataset
dat2 <- sirt::resp_groupwise(resp=resp, group=paste(dat$booklet), items_group=items_group)
colSums( ! is.na(dat2) )
#- fit model for extended response dataset
mod2 <- TAM::tam.mml( dat2, group=dat$booklet)
summary(mod2)
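#*** Minimal toy sketch (hypothetical data): 'items_group' is assumed to be a
# named list that maps an item to the groups for which a group-specific
# copy of that item is created
resp_toy <- data.frame( I1=c(1,0,1,0), I2=c(0,0,1,1) )
group_toy <- c("g1","g1","g2","g2")
items_group_toy <- list( I1=c("g1","g2") )
dat3 <- sirt::resp_groupwise( resp=resp_toy, group=group_toy,
             items_group=items_group_toy )
colnames(dat3)   # inspect the added group-wise item columns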
## End(Not run)
Inverse Gamma Distribution in Prior Sample Size Parameterization
Description
Random draws and density of the inverse gamma distribution parameterized in prior sample size n0 and prior variance var0 (see Gelman et al., 2014).
Usage
rinvgamma2(n, n0, var0)
dinvgamma2(x, n0, var0)
Arguments
n |
Number of draws for inverse gamma distribution |
n0 |
Prior sample size |
var0 |
Prior variance |
x |
Vector with numeric values for density evaluation |
Value
A vector containing random draws or density values
References
Gelman, A., Carlin, J. B., Stern, H. S., Dunson, D. B., Vehtari, A., & Rubin, D. B. (2014). Bayesian data analysis (Vol. 3). Boca Raton, FL, USA: Chapman & Hall/CRC.
See Also
MCMCpack::rinvgamma, stats::rgamma, MCMCpack::dinvgamma, stats::dgamma
Examples
#############################################################################
# EXAMPLE 1: Inverse gamma distribution
#############################################################################
# prior sample size of 100 and prior variance of 1.5
n0 <- 100
var0 <- 1.5
# 100 random draws
y1 <- sirt::rinvgamma2( n=100, n0, var0 )
summary(y1)
graphics::hist(y1)
# density y at grid x
x <- seq( 0, 2, len=100 )
y <- sirt::dinvgamma2( x, n0, var0 )
graphics::plot( x, y, type="l")
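# Hedged cross-check of the parameterization (an assumption, not stated in
# this documentation): n0 and var0 are taken to correspond to the scaled
# inverse chi-square distribution, i.e. IG(shape=n0/2, scale=n0*var0/2)
# (Gelman et al., 2014). If X ~ IG(shape, scale), then 1/X ~ Gamma(shape, rate=scale):
shape <- n0/2 ; scale <- n0*var0/2
y2 <- stats::dgamma( 1/x, shape=shape, rate=scale ) / x^2
graphics::lines( x, y2, col=2, lty=2 )   # should overlay the curve above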
Rater Facets Models with Item/Rater Intercepts and Slopes
Description
This function estimates the unidimensional rater facets model (Linacre, 1994) and an extension to slopes (see Details; Robitzsch & Steinfeld, 2018). The estimation is conducted by an EM algorithm employing marginal maximum likelihood.
Usage
rm.facets(dat, pid=NULL, rater=NULL, Qmatrix=NULL, theta.k=seq(-9, 9, len=30),
est.b.rater=TRUE, est.a.item=FALSE, est.a.rater=FALSE, rater_item_int=FALSE,
est.mean=FALSE, tau.item.fixed=NULL, a.item.fixed=NULL, b.rater.fixed=NULL,
a.rater.fixed=NULL, b.rater.center=2, a.rater.center=2, a.item.center=2, a_lower=.05,
a_upper=10, reference_rater=NULL, max.b.increment=1, numdiff.parm=0.00001,
maxdevchange=0.1, globconv=0.001, maxiter=1000, msteps=4, mstepconv=0.001,
PEM=FALSE, PEM_itermax=maxiter)
## S3 method for class 'rm.facets'
summary(object, file=NULL, ...)
## S3 method for class 'rm.facets'
anova(object,...)
## S3 method for class 'rm.facets'
logLik(object,...)
## S3 method for class 'rm.facets'
IRT.irfprob(object,...)
## S3 method for class 'rm.facets'
IRT.factor.scores(object, type="EAP", ...)
## S3 method for class 'rm.facets'
IRT.likelihood(object,...)
## S3 method for class 'rm.facets'
IRT.posterior(object,...)
## S3 method for class 'rm.facets'
IRT.modelfit(object,...)
## S3 method for class 'IRT.modelfit.rm.facets'
summary(object, ...)
## function for processing data
rm_proc_data( dat, pid, rater, rater_item_int=FALSE, reference_rater=NULL )
Arguments
dat |
Original data frame. Ratings on variables must be in rows, i.e. every row corresponds to a person-rater combination. |
pid |
Person identifier. |
rater |
Rater identifier |
Qmatrix |
An optional Q-matrix. If this matrix is not provided,
then by default the ordinary scoring of categories
(from 0 to the maximum score of |
theta.k |
A grid of theta values for the ability distribution. |
est.b.rater |
Should the rater severities |
est.a.item |
Should the item slopes |
est.a.rater |
Should the rater slopes |
rater_item_int |
Logical indicating whether rater-item-interactions should be modeled. |
est.mean |
Optional logical indicating whether the mean of the trait distribution should be estimated. |
tau.item.fixed |
Matrix with fixed |
a.item.fixed |
Vector with fixed item discriminations |
b.rater.fixed |
Vector with fixed rater intercept parameters |
a.rater.fixed |
Vector with fixed rater discrimination parameters |
b.rater.center |
Centering method for rater intercept parameters. The
value |
a.rater.center |
Centering method for rater discrimination parameters. The
value |
a.item.center |
Centering method for item discrimination parameters. The
value |
a_lower |
Lower bound for |
a_upper |
Upper bound for |
reference_rater |
Identifier for rater as a reference rater for which a fixed rater mean of 0 and a fixed rater slope of 1 is assumed. |
max.b.increment |
Maximum increment of item parameters during estimation |
numdiff.parm |
Numerical differentiation step width |
maxdevchange |
Maximum relative deviance change as a convergence criterion |
globconv |
Maximum parameter change |
maxiter |
Maximum number of iterations |
msteps |
Maximum number of iterations during an M step |
mstepconv |
Convergence criterion in an M step |
PEM |
Logical indicating whether the P-EM acceleration should be applied (Berlinet & Roland, 2012). |
PEM_itermax |
Number of iterations in which the P-EM method should be applied. |
object |
Object of class |
file |
Optional file name in which summary should be written. |
type |
Factor score estimation method. Factor score types
|
... |
Further arguments to be passed |
Details
This function models ratings X_{pri} for person p, rater r, item i and category k (see also Robitzsch & Steinfeld, 2018; Uto & Ueno, 2016; Wu, 2017):

P( X_{pri}=k | \theta_p ) \propto \exp( a_i a_r q_{ik} \theta_p - q_{ik} b_r - \tau_{ik} ) \quad, \quad \theta_p \sim N( 0, \sigma^2 )

By default, the scores in the Q matrix are q_{ik}=k. Item slopes a_i and rater slopes a_r are standardized such that their products equal one, i.e. \prod_i a_i=\prod_r a_r=1.
Value
A list with following entries:
deviance |
Deviance |
ic |
Information criteria and number of parameters |
item |
Data frame with item parameters |
rater |
Data frame with rater parameters |
person |
Data frame with person parameters: EAP and corresponding standard errors |
EAP.rel |
EAP reliability |
mu |
Mean of the trait distribution |
sigma |
Standard deviation of the trait distribution |
theta.k |
Grid of theta values |
pi.k |
Fitted distribution at |
tau.item |
Item parameters |
se.tau.item |
Standard error of item parameters |
a.item |
Item slopes |
se.a.item |
Standard error of item slopes |
delta.item |
Delta item parameter. See
|
b.rater |
Rater severity parameter |
se.b.rater |
Standard error of rater severity parameter |
a.rater |
Rater slope parameter |
se.a.rater |
Standard error of rater slope parameter |
f.yi.qk |
Individual likelihood |
f.qk.yi |
Individual posterior distribution |
probs |
Item probabilities at grid |
n.ik |
Expected counts |
maxK |
Maximum number of categories |
procdata |
Processed data |
iter |
Number of iterations |
ipars.dat2 |
Item parameters for expanded dataset |
... |
Further values |
Note
If the trait standard deviation sigma strongly differs from 1, then a user should investigate the sensitivity of results using different theta integration points theta.k.
References
Berlinet, A. F., & Roland, C. (2012). Acceleration of the EM algorithm: P-EM versus epsilon algorithm. Computational Statistics & Data Analysis, 56(12), 4122-4137.
Linacre, J. M. (1994). Many-Facet Rasch Measurement. Chicago: MESA Press.
Robitzsch, A., & Steinfeld, J. (2018). Item response models for human ratings: Overview, estimation methods, and implementation in R. Psychological Test and Assessment Modeling, 60(1), 101-139.
Uto, M., & Ueno, M. (2016). Item response theory for peer assessment. IEEE Transactions on Learning Technologies, 9(2), 157-170.
Wu, M. (2017). Some IRT-based analyses for interpreting rater effects. Psychological Test and Assessment Modeling, 59(4), 453-470.
See Also
See also the TAM package for the estimation of more complicated facet models.
See rm.sdt for estimating a hierarchical rater model.
Examples
#############################################################################
# EXAMPLE 1: Partial Credit Model and Generalized partial credit model
# 5 items and 1 rater
#############################################################################
data(data.ratings1)
dat <- data.ratings1
# select rater db01
dat <- dat[ paste(dat$rater)=="db01", ]
#**** Model 1: Partial Credit Model
mod1 <- sirt::rm.facets( dat[, paste0( "k",1:5) ], pid=dat$idstud )
#**** Model 2: Generalized Partial Credit Model
mod2 <- sirt::rm.facets( dat[, paste0( "k",1:5) ], pid=dat$idstud, est.a.item=TRUE)
summary(mod1)
summary(mod2)
## Not run:
#############################################################################
# EXAMPLE 2: Facets Model: 5 items, 7 raters
#############################################################################
data(data.ratings1)
dat <- data.ratings1
#**** Model 1: Partial Credit Model: no rater effects
mod1 <- sirt::rm.facets( dat[, paste0( "k",1:5) ], rater=dat$rater,
pid=dat$idstud, est.b.rater=FALSE )
#**** Model 2: Partial Credit Model: intercept rater effects
mod2 <- sirt::rm.facets( dat[, paste0( "k",1:5) ], rater=dat$rater, pid=dat$idstud)
# extract individual likelihood
lmod1 <- IRT.likelihood(mod1)
str(lmod1)
# likelihood value
logLik(mod1)
# extract item response functions
pmod1 <- IRT.irfprob(mod1)
str(pmod1)
# model comparison
anova(mod1,mod2)
# absolute and relative model fit
smod1 <- IRT.modelfit(mod1)
summary(smod1)
smod2 <- IRT.modelfit(mod2)
summary(smod2)
IRT.compareModels( smod1, smod2 )
# extract factor scores (EAP is the default)
IRT.factor.scores(mod2)
# extract WLEs
IRT.factor.scores(mod2, type="WLE")
#**** Model 2a: compare results with TAM package
# Results should be similar to Model 2
library(TAM)
mod2a <- TAM::tam.mml.mfr( resp=dat[, paste0( "k",1:5) ],
facets=dat[, "rater", drop=FALSE],
pid=dat$pid, formulaA=~ item*step + rater )
#**** Model 2b: Partial Credit Model: some fixed parameters
# fix rater parameters for raters 1, 4 and 5
b.rater.fixed <- rep(NA,7)
b.rater.fixed[ c(1,4,5) ] <- c(1,-.8,0) # fixed parameters
# fix item parameters of first and second item
tau.item.fixed <- round( mod2$tau.item, 1 ) # use parameters from mod2
tau.item.fixed[ 3:5, ] <- NA # free item parameters of items 3, 4 and 5
mod2b <- sirt::rm.facets( dat[, paste0( "k",1:5) ], rater=dat$rater,
b.rater.fixed=b.rater.fixed, tau.item.fixed=tau.item.fixed,
est.mean=TRUE, pid=dat$idstud)
summary(mod2b)
#**** Model 3: estimated rater slopes
mod3 <- sirt::rm.facets( dat[, paste0( "k",1:5) ], rater=dat$rater,
est.a.rater=TRUE)
#**** Model 4: estimated item slopes
mod4 <- sirt::rm.facets( dat[, paste0( "k",1:5) ], rater=dat$rater,
pid=dat$idstud, est.a.item=TRUE)
#**** Model 5: estimated rater and item slopes
mod5 <- sirt::rm.facets( dat[, paste0( "k",1:5) ], rater=dat$rater,
pid=dat$idstud, est.a.rater=TRUE, est.a.item=TRUE)
summary(mod1)
summary(mod2)
summary(mod2a)
summary(mod3)
summary(mod4)
summary(mod5)
#**** Model 5a: Some fixed parameters in Model 5
# fix rater b parameters for raters 1, 4 and 5
b.rater.fixed <- rep(NA,7)
b.rater.fixed[ c(1,4,5) ] <- c(1,-.8,0)
# fix rater a parameters for first four raters
a.rater.fixed <- rep(NA,7)
a.rater.fixed[ c(1,2,3,4) ] <- c(1.1,0.9,.85,1)
# fix item b parameters of first item
tau.item.fixed <- matrix( NA, nrow=5, ncol=3 )
tau.item.fixed[ 1, ] <- c(-2,-1.5, 1 )
# fix item a parameters
a.item.fixed <- rep(NA,5)
a.item.fixed[ 1:4 ] <- 1
# estimate model
mod5a <- sirt::rm.facets( dat[, paste0( "k",1:5) ], rater=dat$rater,
pid=dat$idstud, est.a.rater=TRUE, est.a.item=TRUE,
tau.item.fixed=tau.item.fixed, b.rater.fixed=b.rater.fixed,
a.rater.fixed=a.rater.fixed, a.item.fixed=a.item.fixed,
est.mean=TRUE)
summary(mod5a)
#**** Model 6: Estimate rater model with reference rater 'db03'
mod6 <- sirt::rm.facets( dat[, paste0( "k",1:5) ], rater=dat$rater, est.a.item=TRUE,
est.a.rater=TRUE, pid=dat$idstud, reference_rater="db03" )
summary(mod6)
#**** Model 7: Modelling rater-item-interactions
mod7 <- sirt::rm.facets( dat[, paste0( "k",1:5) ], rater=dat$rater, est.a.item=FALSE,
est.a.rater=TRUE, pid=dat$idstud, reference_rater="db03",
rater_item_int=TRUE)
summary(mod7)
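#**** Sensitivity sketch (cf. Note): refit Model 2 with a denser theta grid
mod2w <- sirt::rm.facets( dat[, paste0( "k",1:5) ], rater=dat$rater,
             pid=dat$idstud, theta.k=seq(-9,9,len=61) )
summary(mod2w)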
## End(Not run)
Hierarchical Rater Model Based on Signal Detection Theory (HRM-SDT)
Description
This function estimates a version of the hierarchical rater model (HRM) based on signal detection theory (HRM-SDT; DeCarlo, 2005; DeCarlo, Kim & Johnson, 2011; Robitzsch & Steinfeld, 2018). The model is estimated by means of an EM algorithm adapted from multilevel latent class analysis (Vermunt, 2008).
Usage
rm.sdt(dat, pid, rater, Qmatrix=NULL, theta.k=seq(-9, 9, len=30),
est.a.item=FALSE, est.c.rater="n", est.d.rater="n", est.mean=FALSE, est.sigma=TRUE,
skillspace="normal", tau.item.fixed=NULL, a.item.fixed=NULL,
d.min=0.5, d.max=100, d.start=3, c.start=NULL, tau.start=NULL, sd.start=1,
d.prior=c(3,100), c.prior=c(3,100), tau.prior=c(0,1000), a.prior=c(1,100),
link_item="GPCM", max.increment=1, numdiff.parm=0.00001, maxdevchange=0.1,
globconv=.001, maxiter=1000, msteps=4, mstepconv=0.001, optimizer="nlminb" )
## S3 method for class 'rm.sdt'
summary(object, file=NULL, ...)
## S3 method for class 'rm.sdt'
plot(x, ask=TRUE, ...)
## S3 method for class 'rm.sdt'
anova(object,...)
## S3 method for class 'rm.sdt'
logLik(object,...)
## S3 method for class 'rm.sdt'
IRT.factor.scores(object, type="EAP", ...)
## S3 method for class 'rm.sdt'
IRT.irfprob(object,...)
## S3 method for class 'rm.sdt'
IRT.likelihood(object,...)
## S3 method for class 'rm.sdt'
IRT.posterior(object,...)
## S3 method for class 'rm.sdt'
IRT.modelfit(object,...)
## S3 method for class 'IRT.modelfit.rm.sdt'
summary(object,...)
Arguments
dat |
Original data frame. Ratings on variables must be in rows, i.e. every row corresponds to a person-rater combination. |
pid |
Person identifier. |
rater |
Rater identifier. |
Qmatrix |
An optional Q-matrix. If this matrix is not provided,
then by default the ordinary scoring of categories
(from 0 to the maximum score of |
theta.k |
A grid of theta values for the ability distribution. |
est.a.item |
Should item parameters |
est.c.rater |
Type of estimation for item-rater parameters |
est.d.rater |
Type of estimation of |
est.mean |
Optional logical indicating whether the mean of the trait distribution should be estimated. |
est.sigma |
Optional logical indicating whether the standard deviation of the trait distribution should be estimated. |
skillspace |
Specified |
tau.item.fixed |
Optional matrix with three columns specifying
fixed |
a.item.fixed |
Optional matrix with two columns specifying fixed
|
d.min |
Minimal |
d.max |
Maximal |
d.start |
Starting value(s) of |
c.start |
Starting values of |
tau.start |
Starting values of |
sd.start |
Starting value for trait standard deviation |
d.prior |
Normal prior |
c.prior |
Normal prior for |
tau.prior |
Normal prior for |
a.prior |
Normal prior for |
link_item |
Type of item response function for latent responses.
Can be |
max.increment |
Maximum increment of item parameters during estimation |
numdiff.parm |
Numerical differentiation step width |
maxdevchange |
Maximum relative deviance change as a convergence criterion |
globconv |
Maximum parameter change |
maxiter |
Maximum number of iterations |
msteps |
Maximum number of iterations during an M step |
mstepconv |
Convergence criterion in an M step |
optimizer |
Choice of optimization function in M-step for
item parameters. Options are |
object |
Object of class |
file |
Optional file name in which summary should be written. |
x |
Object of class |
ask |
Optional logical indicating whether a new plot should be asked for. |
type |
Factor score estimation method. Up to now,
only |
... |
Further arguments to be passed |
Details
The specification of the model follows DeCarlo et al. (2011). The second level models the ideal rating (latent response) \eta=0, \ldots, K of person p on item i. The option link_item='GPCM' follows the generalized partial credit model

P( \eta_{pi}=\eta | \theta_p ) \propto \exp( a_{i} q_{i \eta} \theta_p - \tau_{i \eta} )

The option link_item='GRM' employs the graded response model

P( \eta_{pi}=\eta | \theta_p )=\Psi( \tau_{i,\eta+1} - a_i \theta_p ) - \Psi( \tau_{i,\eta} - a_i \theta_p )

At the first level, the ratings X_{pir} for person p on item i and rater r are modeled as a signal detection model

P( X_{pir} \le k | \eta_{pi} )=G( c_{irk} - d_{ir} \eta_{pi} )

where G is the logistic distribution function and the categories are k=1, \ldots, K+1. Note that the item response model can be equivalently written as

P( X_{pir} \ge k | \eta_{pi} )=G( d_{ir} \eta_{pi} - c_{irk} )

The thresholds c_{irk} can be further restricted to c_{irk}=c_{k} (est.c.rater='e'), c_{irk}=c_{ik} (est.c.rater='i') or c_{irk}=c_{ir} (est.c.rater='r'). The same restrictions hold for the rater precision parameters d_{ir}.
Value
A list with following entries:
deviance |
Deviance |
ic |
Information criteria and number of parameters |
item |
Data frame with item parameters. The columns
|
rater |
Data frame with rater parameters.
Transformed |
person |
Data frame with person parameters: EAP and corresponding standard errors |
EAP.rel |
EAP reliability |
mu |
Mean of the trait distribution |
sigma |
Standard deviation of the trait distribution |
tau.item |
Item parameters |
se.tau.item |
Standard error of item parameters |
a.item |
Item slopes |
se.a.item |
Standard error of item slopes |
c.rater |
Rater parameters |
se.c.rater |
Standard error of rater severity parameter |
d.rater |
Rater slope parameter |
se.d.rater |
Standard error of rater slope parameter |
f.yi.qk |
Individual likelihood |
f.qk.yi |
Individual posterior distribution |
probs |
Item probabilities at grid |
prob.item |
Probabilities |
n.ik |
Expected counts |
pi.k |
Estimated trait distribution |
maxK |
Maximum number of categories |
procdata |
Processed data |
iter |
Number of iterations |
... |
Further values |
References
DeCarlo, L. T. (2005). A model of rater behavior in essay grading based on signal detection theory. Journal of Educational Measurement, 42, 53-76.
DeCarlo, L. T. (2010). Studies of a latent-class signal-detection model for constructed response scoring II: Incomplete and hierarchical designs. ETS Research Report ETS RR-10-08. Princeton NJ: ETS.
DeCarlo, L. T., Kim, Y., & Johnson, M. S. (2011). A hierarchical rater model for constructed responses, with a signal detection rater model. Journal of Educational Measurement, 48, 333-356.
Robitzsch, A., & Steinfeld, J. (2018). Item response models for human ratings: Overview, estimation methods, and implementation in R. Psychological Test and Assessment Modeling, 60(1), 101-139.
Vermunt, J. K. (2008). Latent class and finite mixture models for multilevel data sets. Statistical Methods in Medical Research, 17, 33-51.
See Also
The facets rater model can be estimated with rm.facets.
Examples
#############################################################################
# EXAMPLE 1: Hierarchical rater model (HRM-SDT) data.ratings1
#############################################################################
data(data.ratings1)
dat <- data.ratings1
## Not run:
# Model 1: Partial Credit Model: no rater effects
mod1 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
pid=dat$idstud, est.c.rater="n", d.start=100, est.d.rater="n" )
summary(mod1)
# Model 2: Generalized Partial Credit Model: no rater effects
mod2 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
pid=dat$idstud, est.c.rater="n", est.d.rater="n",
est.a.item=TRUE, d.start=100)
summary(mod2)
# Model 3: Equal effects in SDT
mod3 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
pid=dat$idstud, est.c.rater="e", est.d.rater="e")
summary(mod3)
# Model 4: Rater effects in SDT
mod4 <- sirt::rm.sdt( dat[, paste0( "k",1:5) ], rater=dat$rater,
pid=dat$idstud, est.c.rater="r", est.d.rater="r")
summary(mod4)
#############################################################################
# EXAMPLE 2: HRM-SDT data.ratings3
#############################################################################
data(data.ratings3)
dat <- data.ratings3
dat <- dat[ dat$rater < 814, ]
psych::describe(dat)
# Model 1: item- and rater-specific effects
mod1 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4)) ], rater=dat$rater,
pid=dat$idstud, est.c.rater="a", est.d.rater="a" )
summary(mod1)
plot(mod1)
# Model 2: Differing number of categories per variable
mod2 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4,6)) ], rater=dat$rater,
pid=dat$idstud, est.c.rater="a", est.d.rater="a")
summary(mod2)
plot(mod2)
#############################################################################
# EXAMPLE 3: Hierarchical rater model with discrete skill spaces
#############################################################################
data(data.ratings3)
dat <- data.ratings3
dat <- dat[ dat$rater < 814, ]
psych::describe(dat)
# Model 1: Discrete theta skill space with values of 0,1,2 and 3
mod1 <- sirt::rm.sdt( dat[, paste0( "crit",c(2:4)) ], theta.k=0:3, rater=dat$rater,
pid=dat$idstud, est.c.rater="a", est.d.rater="a", skillspace="discrete" )
summary(mod1)
plot(mod1)
# Model 2: Modelling of one item by using a discrete skill space and
# fixed item parameters
# fixed tau and a parameters
tau.item.fixed <- cbind( 1, 1:3, 100*cumsum( c( 0.5, 1.5, 2.5)) )
a.item.fixed <- cbind( 1, 100 )
# fit HRM-SDT
mod2 <- sirt::rm.sdt( dat[, "crit2", drop=FALSE], theta.k=0:3, rater=dat$rater,
tau.item.fixed=tau.item.fixed,a.item.fixed=a.item.fixed, pid=dat$idstud,
est.c.rater="a", est.d.rater="a", skillspace="discrete" )
summary(mod2)
plot(mod2)
## End(Not run)
Simulation of a Multivariate Normal Distribution with Exact Moments
Description
Simulates a dataset from a multivariate or univariate normal distribution that exactly fulfils the specified mean vector and the covariance matrix.
Usage
# multivariate normal distribution
rmvn(N, mu, Sigma, exact=TRUE)
# univariate normal distribution
ruvn(N, mean=0, sd=1, exact=TRUE)
Arguments
N |
Sample size |
mu |
Mean vector |
Sigma |
Covariance matrix |
exact |
Logical indicating whether |
mean |
Numeric value for mean |
sd |
Numeric value for standard deviation |
Value
A data frame or a vector
See Also
mvtnorm::rmvnorm, mgcv::rmvn
Examples
#############################################################################
# EXAMPLE 1: Simulate multivariate normal data
#############################################################################
# define covariance matrix and mean vector
rho <- .8
Sigma <- matrix(rho,3,3)
diag(Sigma) <- 1
mu <- c(0,.5,1)
#* simulate data
set.seed(87)
dat <- sirt::rmvn(N=200, mu=mu, Sigma=Sigma)
#* check means and covariances
stats::cov.wt(dat, method="ML")
## Not run:
#############################################################################
# EXAMPLE 2: Simulate univariate normal data
#############################################################################
#* simulate data
x <- sirt::ruvn(N=20, mean=.5, sd=1.2, exact=TRUE)
# check results
stats::var(x)
sirt:::sirt_var(x)
## End(Not run)
Scaling of Group Means and Standard Deviations
Description
Scales a vector of means and standard deviations containing group values.
Usage
scale_group_means(M, SD, probs=NULL, M_target=0, SD_target=1)
## predict method
predict_scale_group_means(object, M, SD)
Arguments
M |
Vector of means |
SD |
Vector of standard deviations |
probs |
Optional vector containing probabilities |
M_target |
Target value for mean |
SD_target |
Target value for standard deviation |
object |
Fitted object from |
Value
List with entries
M1 |
total mean |
SD1 |
total standard deviation |
M_z |
standardized means |
SD_z |
standardized standard deviations |
M_trafo |
transformed means |
SD_trafo |
transformed standard deviations |
Examples
#############################################################################
# EXAMPLE 1: Toy example
#############################################################################
M <- c(-.03, .18, -.23, -.15, .29)
SD <- c(.97, 1.13, .77, 1.05, 1.17)
sirt::scale_group_means(M=M, SD=SD)
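# A hedged follow-up sketch: apply the fitted transformation to new group
# statistics (assumed usage of the predict method documented above)
res <- sirt::scale_group_means(M=M, SD=SD)
sirt::predict_scale_group_means(object=res, M=c(0, .25), SD=c(1, 1.1))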
Statistical Implicative Analysis (SIA)
Description
This function is a simplified implementation of statistical implicative analysis (Gras & Kuntz, 2008), which aims at deriving implications X_i \rightarrow X_j. This means that solving item i implies solving item j.
Usage
sia.sirt(dat, significance=0.85)
Arguments
dat |
Data frame with dichotomous item responses |
significance |
Minimum implicative probability for inclusion of an arrow in the graph. The probability can be interpreted as a kind of significance level, i.e. higher probabilities indicate more probable implications. |
Details
The test statistic for selecting an implicative relation follows Gras and Kuntz (2008). Transitive arrows (implications) are removed from the graph. If some implications are symmetric, then only the more probable implication is retained.
Value
A list with following entries
adj.matrix |
Adjacency matrix of the graph. Transitive and symmetric implications (arrows) have been removed. |
adj.pot |
Adjacency matrix including all powers, i.e. all direct and
indirect paths from item |
adj.matrix.trans |
Adjacency matrix including transitive arrows. |
desc |
List with descriptive statistics of the graph. |
desc.item |
Descriptive statistics for each item. |
impl.int |
Implication intensity (probability) as the basis for deciding the significance of an arrow |
impl.t |
Corresponding |
impl.significance |
Corresponding |
conf.loev |
Confidence according to Loevinger (see Gras & Kuntz, 2008).
These values are just conditional probabilities |
graph.matr |
Matrix containing all arrows. Can be used for example for the Rgraphviz package. |
graph.edges |
Vector containing all edges of the graph, e.g. for the Rgraphviz package. |
igraph.matr |
Matrix containing all arrows for the igraph package. |
igraph.obj |
An object of the graph for the igraph package. |
Note
An implementation of statistical implicative analysis is also available in the C.H.I.C. (Classification Hierarchique, Implicative et Cohesitive) software.
See https://ardm.eu/partenaires/logiciel-danalyse-de-donnees-c-h-i-c/.
References
Gras, R., & Kuntz, P. (2008). An overview of the statistical implicative analysis (SIA) development. In R. Gras, E. Suzuki, F. Guillet, & F. Spagnolo (Eds.). Statistical Implicative Analysis (pp. 11-40). Springer, Berlin Heidelberg.
See Also
See also the IsingFit package for calculating a graph for dichotomous item responses using the Ising model.
Examples
#############################################################################
# EXAMPLE 1: SIA for data.read
#############################################################################
data(data.read)
dat <- data.read
res <- sirt::sia.sirt(dat, significance=.85 )
#*** plot results with igraph package
library(igraph)
plot( res$igraph.obj ) #, vertex.shape="rectangle", vertex.size=30 )
## Not run:
#*** plot results with qgraph package
miceadds::library_install(qgraph)
qgraph::qgraph( res$adj.matrix )
#*** plot results with Rgraphviz package
# Rgraphviz can only be obtained from Bioconductor
# If it should be downloaded, select TRUE for the following lines
if (FALSE){
source("http://bioconductor.org/biocLite.R")
biocLite("Rgraphviz")
}
# define graph
grmatrix <- res$graph.matr
res.graph <- new("graphNEL", nodes=res$graph.edges, edgemode="directed")
# add edges
RR <- nrow(grmatrix)
for (rr in 1:RR){
res.graph <- Rgraphviz::addEdge(grmatrix[rr,1], grmatrix[rr,2], res.graph, 1)
}
# define cex sizes and shapes
V <- length(res$graph.edges)
size2 <- rep(16,V)
shape2 <- rep("rectangle", V )
names(shape2) <- names(size2) <- res$graph.edges
# plot graph
Rgraphviz::plot( res.graph, nodeAttrs=list("fontsize"=size2, "shape"=shape2) )
## End(Not run)
Simulate from Ramsay's Quotient Model
Description
This function simulates dichotomous item response data according to Ramsay's quotient model (Ramsay, 1989).
Usage
sim.qm.ramsay(theta, b, K)
Arguments
theta |
Vector of length |
b |
Vector of length |
K |
Vector of length |
Details
Ramsay's quotient model (Ramsay, 1989) is defined by the equation

P( X_{pi}=1 | \theta_p )=\frac{ \exp( \theta_p / b_i ) }{ K_i + \exp( \theta_p / b_i ) }
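Note that for abilities \theta_p close to zero, the success probability approaches 1/(K_i+1), so that K_i acts like a guessing parameter (cf. the comment "K parameter (=> guessing)" in Example 1 below).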
Value
An N \times I
data frame with dichotomous item responses.
References
Ramsay, J. O. (1989). A comparison of three simple test theory models. Psychometrika, 54, 487-499.
van der Maas, H. L. J., Molenaar, D., Maris, G., Kievit, R. A., & Borsboom, D. (2011). Cognitive psychology meets psychometric theory: On the relation between process models for decision making and latent variable models for individual differences. Psychological Review, 118, 339-356.
See Also
See rasch.mml2 for estimating Ramsay's quotient model. See sim.raschtype for simulating response data from the generalized logistic item response model.
Examples
#############################################################################
# EXAMPLE 1: Estimate Ramsay Quotient Model with rasch.mml2
#############################################################################
set.seed(657)
# simulate data according to the Ramsay model
N <- 1000 # persons
I <- 11 # items
theta <- exp( stats::rnorm( N ) ) # person ability
b <- exp( seq(-2,2,len=I)) # item difficulty
K <- rep( 3, I ) # K parameter (=> guessing)
# apply simulation function
dat <- sirt::sim.qm.ramsay( theta, b, K )
#***
# analysis
mmliter <- 50 # maximum number of iterations
I <- ncol(dat)
fixed.K <- rep( 3, I )
# Ramsay QM with fixed K parameter (K=3 in fixed.K specification)
mod1 <- sirt::rasch.mml2( dat, mmliter=mmliter, irtmodel="ramsay.qm",
fixed.K=fixed.K )
summary(mod1)
# Ramsay QM with joint estimated K parameters
mod2 <- sirt::rasch.mml2( dat, mmliter=mmliter, irtmodel="ramsay.qm",
est.K=rep(1,I) )
summary(mod2)
## Not run:
# Ramsay QM with itemwise estimated K parameters
mod3 <- sirt::rasch.mml2( dat, mmliter=mmliter, irtmodel="ramsay.qm",
est.K=1:I )
summary(mod3)
# Rasch model
mod4 <- sirt::rasch.mml2( dat )
summary(mod4)
# generalized logistic model
mod5 <- sirt::rasch.mml2( dat, est.alpha=TRUE, mmliter=mmliter)
summary(mod5)
# 2PL model
mod6 <- sirt::rasch.mml2( dat, est.a=rep(1,I) )
summary(mod6)
# Difficulty + Guessing (b+c) Model
mod7 <- sirt::rasch.mml2( dat, est.c=rep(1,I) )
summary(mod7)
# estimate separate guessing (c) parameters
mod8 <- sirt::rasch.mml2( dat, est.c=1:I )
summary(mod8)
#*** estimate Model 1 with user defined function in mirt package
# create user defined function for Ramsay's quotient model
name <- 'ramsayqm'
par <- c("K"=3, "b"=1 )
est <- c(TRUE, TRUE)
P.ramsay <- function(par,Theta){
eps <- .01
K <- par[1]
b <- par[2]
num <- exp( exp( Theta[,1] ) / b )
denom <- K + num
P1 <- num / denom
P1 <- eps + ( 1 - 2*eps ) * P1
cbind(1-P1, P1)
}
# create item response function
ramsayqm <- mirt::createItem(name, par=par, est=est, P=P.ramsay)
# define parameters to be estimated
mod1m.pars <- mirt::mirt(dat, 1, rep( "ramsayqm",I),
customItems=list("ramsayqm"=ramsayqm), pars="values")
mod1m.pars[ mod1m.pars$name=="K", "est" ] <- FALSE
# define Theta design matrix
Theta <- matrix( seq(-3,3,len=10), ncol=1)
# estimate model
mod1m <- mirt::mirt(dat, 1, rep( "ramsayqm",I), customItems=list("ramsayqm"=ramsayqm),
pars=mod1m.pars, verbose=TRUE,
technical=list( customTheta=Theta, NCYCLES=50)
)
print(mod1m)
summary(mod1m)
cmod1m <- sirt::mirt.wrapper.coef( mod1m )$coef
# compare simulated and estimated values
dfr <- cbind( b, cmod1m$b, exp(mod1$item$b ) )
colnames(dfr) <- c("simulated", "mirt", "sirt_rasch.mml2")
round( dfr, 2 )
## simulated mirt sirt_rasch.mml2
## [1,] 0.14 0.11 0.11
## [2,] 0.20 0.17 0.18
## [3,] 0.30 0.27 0.29
## [4,] 0.45 0.42 0.43
## [5,] 0.67 0.65 0.67
## [6,] 1.00 1.00 1.01
## [7,] 1.49 1.53 1.54
## [8,] 2.23 2.21 2.21
## [9,] 3.32 3.00 2.98
##[10,] 4.95 5.22 5.09
##[11,] 7.39 5.62 5.51
## End(Not run)
Simulation of the Rasch Model with Locally Dependent Responses
Description
This function simulates dichotomous item responses where for some itemclusters residual correlations can be defined.
Usage
sim.rasch.dep(theta, b, itemcluster, rho)
Arguments
theta |
Vector of person abilities of length |
b |
Vector of item difficulties of length |
itemcluster |
Vector of integers (including 0) of length |
rho |
Vector of residual correlations. The length of vector must be equal to the number of itemclusters. |
Value
An N \times I
data frame of dichotomous item responses.
Note
The specification of the simulation model follows a marginal interpretation of the latent trait. Local dependencies are only interpreted as nuisance and are not of substantive interest. If local dependencies should be substantively interpreted, a testlet model seems preferable (see mcmc.3pno.testlet).
See Also
To simulate the generalized logistic item response model see sim.raschtype. Ramsay's quotient model can be simulated using sim.qm.ramsay. Marginal item response models for locally dependent item responses can be estimated with rasch.copula2, rasch.pairwise or rasch.pairwise.itemcluster.
Examples
#############################################################################
# EXAMPLE 1: 11 Items: 2 itemclusters with 2 and 3 dependent items, respectively,
#            and 6 independent items
#############################################################################
set.seed(7654)
I <- 11 # number of items
n <- 1500 # number of persons
b <- seq(-2,2, len=I) # item difficulties
theta <- stats::rnorm( n, sd=1 ) # person abilities
# itemcluster
itemcluster <- rep(0,I)
itemcluster[ c(3,5)] <- 1
itemcluster[c(2,4,9)] <- 2
# residual correlations
rho <- c( .7, .5 )
# simulate data
dat <- sirt::sim.rasch.dep( theta, b, itemcluster, rho )
colnames(dat) <- paste("I", seq(1,ncol(dat)), sep="")
# estimate Rasch copula model
mod1 <- sirt::rasch.copula2( dat, itemcluster=itemcluster )
summary(mod1)
# compare result with Rasch model estimation in rasch.copula
# delta must be set to zero
mod2 <- sirt::rasch.copula2( dat, itemcluster=itemcluster, delta=c(0,0),
est.delta=c(0,0) )
summary(mod2)
# estimate Rasch model with rasch.mml2 function
mod3 <- sirt::rasch.mml2( dat )
summary(mod3)
## Not run:
#############################################################################
# EXAMPLE 2: 12 Items: Cluster 1 -> Items 1,...,4;
# Cluster 2 -> Items 6,...,9; Cluster 3 -> Items 10,11,12
#############################################################################
set.seed(7896)
I <- 12 # number of items
n <- 450 # number of persons
b <- seq(-2,2, len=I) # item difficulties
b <- sample(b) # sample item difficulties
theta <- stats::rnorm( n, sd=1 ) # person abilities
# itemcluster
itemcluster <- rep(0,I)
itemcluster[ 1:4 ] <- 1
itemcluster[ 6:9 ] <- 2
itemcluster[ 10:12 ] <- 3
# residual correlations
rho <- c( .55, .25, .45 )
# simulate data
dat <- sirt::sim.rasch.dep( theta, b, itemcluster, rho )
colnames(dat) <- paste("I", seq(1,ncol(dat)), sep="")
# estimate Rasch copula model
mod1 <- sirt::rasch.copula2( dat, itemcluster=itemcluster, numdiff.parm=.001 )
summary(mod1)
# Rasch model estimation
mod2 <- sirt::rasch.copula2( dat, itemcluster=itemcluster,
delta=rep(0,3), est.delta=rep(0,3) )
summary(mod2)
# estimation with pairwise Rasch model
mod3 <- sirt::rasch.pairwise( dat )
summary(mod3)
## End(Not run)
Simulate from Generalized Logistic Item Response Model
Description
This function simulates dichotomous item responses from a generalized logistic item response model (Stukel, 1988). The four-parameter logistic item response model (Loken & Rulison, 2010) is a special case. See rasch.mml2 for more details.
Usage
sim.raschtype(theta, b, alpha1=0, alpha2=0, fixed.a=NULL,
fixed.c=NULL, fixed.d=NULL)
Arguments
theta |
Unidimensional ability vector |
b |
Vector of item difficulties |
alpha1 |
Parameter |
alpha2 |
Parameter |
fixed.a |
Vector of item slopes |
fixed.c |
Vector of lower item asymptotes |
fixed.d |
Vector of upper item asymptotes |
Details
The class of generalized logistic link functions contains the most important link functions using the following specifications (Stukel, 1988):
logistic link function: \alpha_1=0 and \alpha_2=0
probit link function: \alpha_1=0.165 and \alpha_2=0.165
loglog link function: \alpha_1=-0.037 and \alpha_2=0.62
cloglog link function: \alpha_1=0.62 and \alpha_2=-0.037
See pgenlogis for exact transformation formulas of the mentioned link functions.
Value
Data frame with simulated item responses
References
Loken, E., & Rulison, K. L. (2010). Estimation of a four-parameter item response theory model. British Journal of Mathematical and Statistical Psychology, 63, 509-525.
Stukel, T. A. (1988). Generalized logistic models. Journal of the American Statistical Association, 83, 426-431.
Examples
#############################################################################
## EXAMPLE 1: Simulation of data from a Rasch model (alpha_1=alpha_2=0)
#############################################################################
set.seed(9765)
N <- 500 # number of persons
I <- 11 # number of items
b <- seq( -2, 2, length=I )
dat <- sirt::sim.raschtype( stats::rnorm( N ), b )
colnames(dat) <- paste0( "I", 1:I )
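# A brief follow-up sketch: recover the item difficulties via MML estimation
mod <- sirt::rasch.mml2(dat)
cbind( true.b=b, est.b=mod$item$b )   # compare simulated and estimated values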
First Eigenvalues of a Symmetric Matrix
Description
This function computes the first D eigenvalues and eigenvectors of a symmetric positive definite matrix. The eigenvalues are computed by the Rayleigh quotient method (Lange, 2010, p. 120).
Usage
sirt_eigenvalues( X, D, maxit=200, conv=10^(-6) )
Arguments
X |
Symmetric matrix |
D |
Number of eigenvalues to be estimated |
maxit |
Maximum number of iterations |
conv |
Convergence criterion |
Value
A list with following entries:
d |
Vector of eigenvalues |
u |
Matrix with eigenvectors in columns |
References
Lange, K. (2010). Numerical Analysis for Statisticians. New York: Springer.
Examples
Sigma <- diag(1,3)
Sigma[ lower.tri(Sigma) ] <- Sigma[ upper.tri(Sigma) ] <- c(.4,.6,.8 )
sirt::sirt_eigenvalues(X=Sigma, D=2 )
# compare with svd function
svd(Sigma)
Defunct sirt Functions
Description
These functions have been removed or replaced in the sirt package.
Usage
rasch.conquest(...)
rasch.pml2(...)
testlet.yen.q3(...)
yen.q3(...)
Arguments
... |
Arguments to be passed. |
Details
The rasch.conquest function has been replaced by R2conquest.
The rasch.pml2 function has been superseded by rasch.pml3.
The testlet.yen.q3 function has been replaced by Q3.testlet.
The yen.q3 function has been replaced by Q3.
Utility Functions in sirt
Description
Utility functions in sirt.
Usage
# bounds entries in a vector
bounds_parameters( pars, lower=NULL, upper=NULL)
# improper density function which always returns a value of 1
dimproper(x)
# generalized inverse of a symmetric matrix
ginverse_sym(A, eps=1E-8)
# hard thresholding function
hard_thresholding(x, lambda)
# soft thresholding function
soft_thresholding(x, lambda)
# power function x^a, like in Cpp
pow(x, a)
# trace of a matrix
tracemat(A)
#** matrix functions
sirt_matrix2(x, nrow) # matrix() function with byrow=TRUE
sirt_colMeans(x, na.rm=TRUE)
sirt_colSDs(x, na.rm=TRUE)
sirt_colMins(x, na.rm=TRUE)
sirt_colMaxs(x, na.rm=TRUE)
sirt_colMedians(x, na.rm=TRUE)
#* normalize vector to have sum of one
sirt_sum_norm(x, na.rm=TRUE)
#* discrete normal distribution
sirt_dnorm_discrete(x, mean=0, sd=1, ...)
# plyr::rbind.fill implementation in sirt
sirt_rbind_fill(x, y)
# Fisher-z transformation, see psych::fisherz
sirt_fisherz(rho)
# inverse Fisher-z transformation, see psych::fisherz2r
sirt_antifisherz(z)
# smooth approximation of the absolute value function
sirt_abs_smooth(x, deriv=0, eps=1e-4)
# permutations with replacement
sirt_permutations(r,v)
#-> is equivalent to gtools::permutations(n=length(v), r=r, v=v, repeats.allowed=TRUE)
# attach all elements in a list in a specified environment
sirt_attach_list_elements(x, envir)
# switch between stats::optim and stats::nlminb
sirt_optimizer(optimizer, par, fn, grad=NULL, method="L-BFGS-B", hessian=TRUE,
control=list(), ...)
# print objects in a summary
sirt_summary_print_objects(obji, from=NULL, to=NULL, digits=3, rownames_null=TRUE,
grep_string=NULL)
# print package version and R session
sirt_summary_print_package_rsession(pack)
# print package version
sirt_summary_print_package(pack)
# print R session
sirt_summary_print_rsession()
# print call
sirt_summary_print_call(CALL)
# print a data frame x with fixed numbers of digits after the decimal
print_digits(x, digits=NULL)
# discrete inverse function
sirt_rcpp_discrete_inverse(x0, y0, y)
# move variables in a data frame
move_variables_df(x, after_var, move_vars)
Arguments
pars |
Numeric vector |
lower |
Numeric vector |
upper |
Numeric vector |
x |
Numeric vector or a matrix or a list |
eps |
Numerical. Shrinkage parameter of the eigenvalues in ginverse_sym |
a |
Numeric vector |
lambda |
Numeric value |
A |
Matrix |
nrow |
Integer |
na.rm |
Logical |
mean |
Numeric |
sd |
Numeric |
y |
Matrix |
rho |
Numeric |
deriv |
Integer indicating the order of derivative |
z |
Numeric |
r |
Integer |
v |
Vector |
envir |
Environment |
optimizer |
Can be one of the following optimizers: optim or nlminb |
par |
Initial parameter |
fn |
Function |
grad |
Gradient function |
method |
Optimization method |
hessian |
Logical |
control |
Control list for R optimizers |
... |
Further arguments to be passed |
obji |
Data frame |
from |
Integer |
to |
Integer |
digits |
Integer |
rownames_null |
Logical |
grep_string |
String |
pack |
Package name |
CALL |
Call statement |
x0 |
Vector |
y0 |
Vector |
after_var |
String indicating the variable name after which the variables specified
in move_vars should be moved |
move_vars |
Variables which should be moved after the variable after_var |
Examples
#############################################################################
## EXAMPLE 1: Trace of a matrix
#############################################################################
set.seed(86)
A <- matrix( stats::runif(4), 2,2 )
tracemat(A)
sum(diag(A)) #=sirt::tracemat(A)
#############################################################################
## EXAMPLE 2: Power function
#############################################################################
x <- 2.3
a <- 1.7
pow(x=x,a=a)
x^a #=sirt::pow(x,a)
#############################################################################
## EXAMPLE 3: Soft and hard thresholding function (e.g. in LASSO estimation)
#############################################################################
x <- seq(-2, 2, length=100)
y <- sirt::soft_thresholding( x, lambda=.5)
graphics::plot( x, y, type="l")
z <- sirt::hard_thresholding( x, lambda=.5)
graphics::lines( x, z, lty=2, col=2)
#############################################################################
## EXAMPLE 4: Bounds on parameters
#############################################################################
pars <- c(.721, .346)
bounds_parameters( pars=pars, lower=c(-Inf, .5), upper=c(Inf,1) )
#############################################################################
## EXAMPLE 5: Smooth approximation of absolute value function
#############################################################################
x <- seq(-1,1,len=100)
graphics::plot(x, abs(x), lwd=2, col=1, lty=1, type="l", ylim=c(-1,1) )
# smooth approximation
tt <- 2
graphics::lines(x, sirt::sirt_abs_smooth(x), lty=tt, col=tt, lwd=2)
# first derivative
tt <- 3
graphics::lines(x, sirt::sirt_abs_smooth(x, deriv=1), lty=tt, col=tt, lwd=2)
# second derivative
tt <- 4
graphics::lines(x, sirt::sirt_abs_smooth(x, deriv=2), lty=tt, col=tt, lwd=2)
# analytic computation of first and second derivative
stats::deriv( ~ sqrt(x^2 + eps), namevec="x", hessian=TRUE )
## Not run:
#############################################################################
## EXAMPLE 6: Permutations with replacement
#############################################################################
D <- 4
v <- 0:1
sirt::sirt_permutations(r=D, v=v)
gtools::permutations(n=length(v), r=D, v=v, repeats.allowed=TRUE)
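#############################################################################
## EXAMPLE 7: Fisher-z transformation (a hedged added sketch, assuming only
## the functions listed in the Usage section above)
#############################################################################
rho <- .73
z <- sirt::sirt_fisherz(rho)    # equals atanh(rho)
z
sirt::sirt_antifisherz(z)       # recovers rho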
## End(Not run)
Multidimensional Noncompensatory, Compensatory and Partially Compensatory Item Response Model
Description
This function estimates the noncompensatory and compensatory multidimensional item response model (Bolt & Lall, 2003; Reckase, 2009) as well as the partially compensatory item response model (Spray et al., 1990) for dichotomous data.
Usage
smirt(dat, Qmatrix, irtmodel="noncomp", est.b=NULL, est.a=NULL,
est.c=NULL, est.d=NULL, est.mu.i=NULL, b.init=NULL, a.init=NULL,
c.init=NULL, d.init=NULL, mu.i.init=NULL, Sigma.init=NULL,
b.lower=-Inf, b.upper=Inf, a.lower=-Inf, a.upper=Inf,
c.lower=-Inf, c.upper=Inf, d.lower=-Inf, d.upper=Inf,
theta.k=seq(-6,6,len=20), theta.kDES=NULL,
qmcnodes=0, mu.fixed=NULL, variance.fixed=NULL, est.corr=FALSE,
max.increment=1, increment.factor=1, numdiff.parm=0.0001,
maxdevchange=0.1, globconv=0.001, maxiter=1000, msteps=4,
mstepconv=0.001)
## S3 method for class 'smirt'
summary(object,...)
## S3 method for class 'smirt'
anova(object,...)
## S3 method for class 'smirt'
logLik(object,...)
## S3 method for class 'smirt'
IRT.irfprob(object,...)
## S3 method for class 'smirt'
IRT.likelihood(object,...)
## S3 method for class 'smirt'
IRT.posterior(object,...)
## S3 method for class 'smirt'
IRT.modelfit(object,...)
## S3 method for class 'IRT.modelfit.smirt'
summary(object,...)
Arguments
dat |
Data frame with dichotomous item responses |
Qmatrix |
The Q-matrix which specifies the loadings to be estimated |
irtmodel |
The item response model. Options are the noncompensatory model ("noncomp"), the compensatory model ("comp") and the partially compensatory model ("partcomp"). |
est.b |
An integer matrix (if |
est.a |
An integer matrix for |
est.c |
An integer vector for |
est.d |
An integer vector for |
est.mu.i |
An integer vector for |
b.init |
Initial |
a.init |
Initial |
c.init |
Initial |
d.init |
Initial |
mu.i.init |
Initial |
Sigma.init |
Initial covariance matrix |
b.lower |
Lower bound for |
b.upper |
Upper bound for |
a.lower |
Lower bound for |
a.upper |
Upper bound for |
c.lower |
Lower bound for |
c.upper |
Upper bound for |
d.lower |
Lower bound for |
d.upper |
Upper bound for |
theta.k |
Vector of discretized trait distribution. This vector is expanded in all
dimensions by using the |
theta.kDES |
An optional design matrix. This matrix will differ from the ordinary theta grid in case of nonlinear item response models. |
qmcnodes |
Number of integration nodes for quasi Monte Carlo integration (see Pan &
Thompson, 2007; Gonzales et al., 2006). Integration points are obtained by using
the function |
mu.fixed |
Matrix with fixed entries in the mean vector. By default, all means are set to zero. |
variance.fixed |
Matrix (with rows and three columns) with fixed entries in the covariance matrix
(see Examples). The entry |
est.corr |
Should only a correlation matrix instead of a covariance matrix be estimated? |
max.increment |
Maximum increment |
increment.factor |
A value (larger than one) which defines the extent of the decrease of the maximum
increment of item parameters in every iteration. The maximum increment in iteration
|
numdiff.parm |
Numerical differentiation parameter |
maxdevchange |
Convergence criterion for change in relative deviance |
globconv |
Global convergence criterion for parameter change |
maxiter |
Maximum number of iterations |
msteps |
Number of iterations within a M step |
mstepconv |
Convergence criterion within a M step |
object |
Object of class |
... |
Further arguments to be passed |
Details
The noncompensatory item response model (irtmodel="noncomp"; e.g. Bolt & Lall, 2003) is defined as
P(X_{pi}=1 | \bold{\theta}_p )=c_i + (d_i - c_i ) \prod_l invlogit( a_{il} q_{il} \theta_{pl} - b_{il} )
where i, p and l denote items, persons and dimensions, respectively.
The compensatory item response model (irtmodel="comp") is defined by
P(X_{pi}=1 | \bold{\theta}_p )=c_i + (d_i - c_i ) invlogit( \sum_l a_{il} q_{il} \theta_{pl} - b_{i} )
Using a design matrix theta.kDES, the model can be generalized to a model that is linear in the item parameters:
P(X_{pi}=1 | \bold{\theta}_p )=c_i + (d_i - c_i ) invlogit( \sum_l a_{il} q_{il} t_l ( \bold{\theta}_p ) - b_{i} )
with known functions t_l of the trait vector \bold{\theta}_p. The values of the functions t_l are specified in the design matrix theta.kDES.
The partially compensatory item response model (irtmodel="partcomp") is defined by
P(X_{pi}=1 | \bold{\theta}_p )=c_i + (d_i - c_i ) \frac{ \exp \left( \sum_l ( a_{il} q_{il} \theta_{pl} - b_{il} ) \right) }{ \mu_i \prod_l ( 1 + \exp ( a_{il} q_{il} \theta_{pl} - b_{il} ) ) + ( 1- \mu_i) ( 1 + \exp \left( \sum_l ( a_{il} q_{il} \theta_{pl} - b_{il} ) \right) ) }
with item parameters \mu_i indicating the degree of compensation. \mu_i=1 indicates a noncompensatory model while \mu_i=0 indicates a (fully) compensatory model.
The models are estimated by an EM algorithm employing marginal maximum likelihood.
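As a small numeric illustration of the noncompensatory response probability defined above (illustrative parameter values only, not taken from the package code):
a <- c( 1.2, 0.8 )       # item slopes on two dimensions
b <- c( 0.3, -0.5 )      # item difficulties
q <- c( 1, 1 )           # Q-matrix row of the item
theta <- c( 0.5, -0.2 )  # trait vector of one person
ci <- .10 ; di <- .95    # lower and upper asymptotes
ci + ( di - ci ) * prod( stats::plogis( a*q*theta - b ) )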
Value
A list with following entries:
deviance |
Deviance |
ic |
Information criteria |
item |
Data frame with item parameters |
person |
Data frame with person parameters. It includes
the person mean of all item responses ( |
EAP.rel |
EAP reliability |
mean.trait |
Means of trait |
sd.trait |
Standard deviations of trait |
Sigma |
Trait covariance matrix |
cor.trait |
Trait correlation matrix |
b |
Matrix (vector) of |
se.b |
Matrix (vector) of standard errors |
a |
Matrix of |
se.a |
Matrix of standard errors of |
c |
Vector of |
se.c |
Vector of standard errors of |
d |
Vector of |
se.d |
Vector of standard errors of |
mu.i |
Vector of |
se.mu.i |
Vector of standard errors of |
f.yi.qk |
Individual likelihood |
f.qk.yi |
Individual posterior |
probs |
Probabilities of item response functions evaluated at |
n.ik |
Expected counts |
iter |
Number of iterations |
dat2 |
Processed data set |
dat2.resp |
Data set of response indicators |
I |
Number of items |
D |
Number of dimensions |
K |
Maximum item response score |
theta.k |
Used theta integration grid |
pi.k |
Distribution function evaluated at |
irtmodel |
Used IRT model |
Qmatrix |
Used Q-matrix |
References
Bolt, D. M., & Lall, V. F. (2003). Estimation of compensatory and noncompensatory multidimensional item response models using Markov chain Monte Carlo. Applied Psychological Measurement, 27, 395-414.
Gonzalez, J., Tuerlinckx, F., De Boeck, P., & Cools, R. (2006). Numerical integration in logistic-normal models. Computational Statistics & Data Analysis, 51, 1535-1548.
Pan, J., & Thompson, R. (2007). Quasi-Monte Carlo estimation in generalized linear mixed models. Computational Statistics & Data Analysis, 51, 5765-5775.
Reckase, M. (2009). Multidimensional item response theory. New York: Springer. doi:10.1007/978-0-387-89976-3
Spray, J. A., Davey, T. C., Reckase, M. D., Ackerman, T. A., & Carlson, J. E. (1990). Comparison of two logistic multidimensional item response theory models. ACT Research Report No. ACT-RR-ONR-90-8.
See Also
See mirt::mirt with itemtype="partcomp" for estimating noncompensatory item response models using the mirt package. See also mirt::mixedmirt.
Other multidimensional IRT models can also be estimated with rasch.mml2 and rasch.mirtlc.
See itemfit.sx2 (CDM) for item fit statistics.
See also the mirt and TAM packages for estimation of compensatory multidimensional item response models.
Examples
#############################################################################
## EXAMPLE 1: Noncompensatory and compensatory IRT models
#############################################################################
set.seed(997)
# (1) simulate data from a two-dimensional noncompensatory
# item response model
# -> increase number of iterations in all models!
N <- 1000 # number of persons
I <- 10 # number of items
theta0 <- rnorm( N, sd=1 )
theta1 <- theta0 + rnorm(N, sd=.7 )
theta2 <- theta0 + rnorm(N, sd=.7 )
Q <- matrix( 1, nrow=I,ncol=2 )
Q[ 1:(I/2), 2 ] <- 0
Q[ I,1] <- 0
b <- matrix( rnorm( I*2 ), I, 2 )
a <- matrix( 1, I, 2 )
# simulate data
prob <- dat <- matrix(0, nrow=N, ncol=I )
for (ii in 1:I){
prob[,ii] <- ( stats::plogis( theta1 - b[ii,1] ) )^Q[ii,1]
prob[,ii] <- prob[,ii] * ( stats::plogis( theta2 - b[ii,2] ) )^Q[ii,2]
}
dat[ prob > matrix( stats::runif( N*I),N,I) ] <- 1
colnames(dat) <- paste0("I",1:I)
#***
# Model 1: Noncompensatory 1PL model
mod1 <- sirt::smirt(dat, Qmatrix=Q, maxiter=10 ) # change number of iterations
summary(mod1)
## Not run:
#***
# Model 2: Noncompensatory 2PL model
mod2 <- sirt::smirt(dat,Qmatrix=Q, est.a="2PL", maxiter=15 )
summary(mod2)
# Model 2a: avoid convergence problems with increment.factor
mod2a <- sirt::smirt(dat,Qmatrix=Q, est.a="2PL", maxiter=30, increment.factor=1.03)
summary(mod2a)
#***
# Model 3: some fixed c and d parameters different from zero or one
c.init <- rep(0,I)
c.init[ c(3,7)] <- .2
d.init <- rep(1,I)
d.init[c(4,8)] <- .95
mod3 <- sirt::smirt( dat, Qmatrix=Q, c.init=c.init, d.init=d.init )
summary(mod3)
#***
# Model 4: some estimated c and d parameters (in parameter groups)
est.c <- c.init <- rep(0,I)
c.estpars <- c(3,6,7)
c.init[ c.estpars ] <- .2
est.c[c.estpars] <- 1
est.d <- rep(0,I)
d.init <- rep(1,I)
d.estpars <- c(6,9)
d.init[ d.estpars ] <- .95
est.d[ d.estpars ] <- d.estpars # different d parameters
mod4 <- sirt::smirt(dat,Qmatrix=Q, est.c=est.c, c.init=c.init,
est.d=est.d, d.init=d.init )
summary(mod4)
#***
# Model 5: Unidimensional 1PL model
Qmatrix <- matrix( 1, nrow=I, ncol=1 )
mod5 <- sirt::smirt( dat, Qmatrix=Qmatrix )
summary(mod5)
#***
# Model 6: Unidimensional 2PL model
mod6 <- sirt::smirt( dat, Qmatrix=Qmatrix, est.a="2PL" )
summary(mod6)
#***
# Model 7: Compensatory model with between item dimensionality
# Note that the data is simulated under the noncompensatory condition
# Therefore Model 7 should have a worse model fit than Model 1
Q1 <- Q
Q1[ 6:10, 1] <- 0
mod7 <- sirt::smirt(dat,Qmatrix=Q1, irtmodel="comp", maxiter=30)
summary(mod7)
#***
# Model 8: Compensatory model with within item dimensionality
# assuming zero correlation between dimensions
variance.fixed <- as.matrix( cbind( 1,2,0) )
# set the covariance between the first and second dimension to zero
mod8 <- sirt::smirt(dat,Qmatrix=Q, irtmodel="comp", variance.fixed=variance.fixed,
maxiter=30)
summary(mod8)
#***
# Model 8b: 2PL model with starting values for a and b parameters
b.init <- rep(0,10) # set all item difficulties initially to zero
# b.init <- NULL
a.init <- Q # initialize a.init with Q-matrix
# provide starting values for slopes of first three items on Dimension 1
a.init[1:3,1] <- c( .55, .32, 1.3)
mod8b <- sirt::smirt(dat,Qmatrix=Q, irtmodel="comp", variance.fixed=variance.fixed,
b.init=b.init, a.init=a.init, maxiter=20, est.a="2PL" )
summary(mod8b)
#***
# Model 9: Unidimensional model with quadratic item response functions
# define theta
theta.k <- seq( - 6, 6, len=15 )
theta.k <- as.matrix( theta.k, ncol=1 )
# define design matrix
theta.kDES <- cbind( theta.k[,1], theta.k[,1]^2 )
# define Q-matrix
Qmatrix <- matrix( 0, I, 2 )
Qmatrix[,1] <- 1
Qmatrix[ c(3,6,7), 2 ] <- 1
colnames(Qmatrix) <- c("F1", "F1sq" )
# estimate model
mod9 <- sirt::smirt(dat,Qmatrix=Qmatrix, maxiter=50, irtmodel="comp",
theta.k=theta.k, theta.kDES=theta.kDES, est.a="2PL" )
summary(mod9)
#***
# Model 10: Two-dimensional item response model with latent interaction
# between dimensions
theta.k <- seq( - 6, 6, len=15 )
theta.k <- expand.grid( theta.k, theta.k ) # expand theta to 2 dimensions
# define design matrix
theta.kDES <- cbind( theta.k, theta.k[,1]*theta.k[,2] )
# define Q-matrix
Qmatrix <- matrix( 0, I, 3 )
Qmatrix[,1] <- 1
Qmatrix[ 6:10, c(2,3) ] <- 1
colnames(Qmatrix) <- c("F1", "F2", "F1iF2" )
# estimate model
mod10 <- sirt::smirt(dat,Qmatrix=Qmatrix,irtmodel="comp", theta.k=theta.k,
theta.kDES=theta.kDES, est.a="2PL" )
summary(mod10)
#****
# Model 11: Example Quasi Monte Carlo integration
Qmatrix <- matrix( 1, I, 1 )
mod11 <- sirt::smirt( dat, irtmodel="comp", Qmatrix=Qmatrix, qmcnodes=1000 )
summary(mod11)
#############################################################################
## EXAMPLE 2: Dataset Reading data.read
## Multidimensional models for dichotomous data
#############################################################################
data(data.read)
dat <- data.read
I <- ncol(dat) # number of items
#***
# Model 1: 3-dimensional 2PL model
# define Q-matrix
Qmatrix <- matrix(0,nrow=I,ncol=3)
Qmatrix[1:4,1] <- 1
Qmatrix[5:8,2] <- 1
Qmatrix[9:12,3] <- 1
# estimate model
mod1 <- sirt::smirt( dat, Qmatrix=Qmatrix, irtmodel="comp", est.a="2PL",
qmcnodes=1000, maxiter=20)
summary(mod1)
#***
# Model 2: 3-dimensional Rasch model
mod2 <- sirt::smirt( dat, Qmatrix=Qmatrix, irtmodel="comp",
qmcnodes=1000, maxiter=20)
summary(mod2)
#***
# Model 3: 3-dimensional 2PL model with uncorrelated dimensions
# fix entries in variance matrix
variance.fixed <- cbind( c(1,1,2), c(2,3,3), 0 )
# set the following covariances to zero: cov[1,2]=cov[1,3]=cov[2,3]=0
# estimate model
mod3 <- sirt::smirt( dat, Qmatrix=Qmatrix, irtmodel="comp", est.a="2PL",
variance.fixed=variance.fixed, qmcnodes=1000, maxiter=20)
summary(mod3)
#***
# Model 4: Bifactor model with one general factor (g) and
# uncorrelated specific factors
# define a new Q-matrix
Qmatrix1 <- cbind( 1, Qmatrix )
# uncorrelated factors
variance.fixed <- cbind( c(1,1,1,2,2,3), c(2,3,4,3,4,4), 0 )
# The first dimension refers to the general factors while the other
# dimensions refer to the specific factors.
# The specification means that:
# Cov[1,2]=Cov[1,3]=Cov[1,4]=Cov[2,3]=Cov[2,4]=Cov[3,4]=0
# estimate model
mod4 <- sirt::smirt( dat, Qmatrix=Qmatrix1, irtmodel="comp", est.a="2PL",
variance.fixed=variance.fixed, qmcnodes=1000, maxiter=20)
summary(mod4)
#############################################################################
## EXAMPLE 3: Partially compensatory model
#############################################################################
#**** simulate data
set.seed(7656)
I <- 10 # number of items
N <- 2000 # number of subjects
Q <- matrix( 0, 3*I,2) # Q-matrix
Q[1:I,1] <- 1
Q[1:I + I,2] <- 1
Q[1:I + 2*I,1:2] <- 1
b <- matrix( stats::runif( 3*I *2, -2, 2 ), nrow=3*I, 2 )
b <- b*Q
b <- round( b, 2 )
mui <- rep(0,3*I)
mui[ seq(2*I+1, 3*I) ] <- 0.65
# generate data
dat <- matrix( NA, N, 3*I )
colnames(dat) <- paste0("It", 1:(3*I) )
# simulate item responses
library(mvtnorm)
theta <- mvtnorm::rmvnorm(N, mean=c(0,0), sigma=matrix( c( 1.2, .6,.6,1.6),2, 2 ) )
for (ii in 1:(3*I)){
# define probability
tmp1 <- exp( theta[,1] * Q[ii,1] - b[ii,1] + theta[,2] * Q[ii,2] - b[ii,2] )
# non-compensatory model
nco1 <- ( 1 + exp( theta[,1] * Q[ii,1] - b[ii,1] ) ) *
( 1 + exp( theta[,2] * Q[ii,2] - b[ii,2] ) )
co1 <- ( 1 + tmp1 )
p1 <- tmp1 / ( mui[ii] * nco1 + ( 1 - mui[ii] )*co1 )
dat[,ii] <- 1 * ( stats::runif(N) < p1 )
}
#*** Model 1: Joint mu.i parameter for all items
est.mu.i <- rep(0,3*I)
est.mu.i[ seq(2*I+1,3*I)] <- 1
mod1 <- sirt::smirt( dat, Qmatrix=Q, irtmodel="partcomp", est.mu.i=est.mu.i)
summary(mod1)
#*** Model 2: Separate mu.i parameter for all items
est.mu.i[ seq(2*I+1,3*I)] <- 1:I
mod2 <- sirt::smirt( dat, Qmatrix=Q, irtmodel="partcomp", est.mu.i=est.mu.i)
summary(mod2)
## End(Not run)
Stratified Cronbach's Alpha
Description
This function computes the stratified Cronbach's Alpha for composite scales (Cronbach, Schoenemann & McKie, 1965; He, 2010; Meyer, 2010).
Usage
stratified.cronbach.alpha(data, itemstrata=NULL)
Arguments
data |
An |
itemstrata |
A matrix with two columns defining the item stratification.
The first column contains the item names, the second column
the item stratification label (these can be integers).
The default |
References
Cronbach, L. J., Schoenemann, P., & McKie, D. (1965). Alpha coefficient for stratified-parallel tests. Educational and Psychological Measurement, 25, 291-312. doi:10.1177/001316446502500201
He, Q. (2010). Estimating the reliability of composite scores. Ofqual/10/4703. Coventry: The Office of Qualifications and Examinations Regulation.
Meyer, P. (2010). Reliability. Cambridge: Oxford University Press.
Examples
#############################################################################
# EXAMPLE 1: data.read
#############################################################################
data(data.read, package="sirt")
dat <- data.read
I <- ncol(dat)
# apply function without defining item strata
sirt::stratified.cronbach.alpha( data.read )
# define item strata
itemstrata <- cbind( colnames(dat), substring( colnames(dat), 1,1 ) )
sirt::stratified.cronbach.alpha( dat, itemstrata=itemstrata )
## scale I alpha mean.tot var.tot alpha.stratified
## 1 total 12 0.677 8.680 5.668 0.703
## 2 A 4 0.545 2.616 1.381 NA
## 3 B 4 0.381 2.811 1.059 NA
## 4 C 4 0.640 3.253 1.107 NA
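The stratified alpha can be checked by hand. A minimal sketch, assuming the usual formula alpha_s=1 - \sum_k \sigma_k^2 (1-\alpha_k) / \sigma_{tot}^2 and using the per-stratum values from the output above:
var_k <- c( 1.381, 1.059, 1.107 )   # stratum score variances
alpha_k <- c( .545, .381, .640 )    # stratum alphas
var_tot <- 5.668                    # variance of the total score
1 - sum( var_k * ( 1 - alpha_k ) ) / var_tot   # reproduces 0.703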
## Not run:
#**************************
# reliability analysis in psych package
library(psych)
# Cronbach's alpha and item discriminations
psych::alpha(dat)
# McDonald's omega
psych::omega(dat, nfactors=1) # 1 factor
## Alpha: 0.69
## Omega Total 0.69
##=> Note that alpha in this function is the standardized Cronbach's
## alpha, i.e. alpha computed for standardized variables.
psych::omega(dat, nfactors=2) # 2 factors
## Omega Total 0.72
psych::omega(dat, nfactors=3) # 3 factors
## Omega Total 0.74
## End(Not run)
Summary Method for Objects of Class mcmc.sirt
Description
S3 method to summarize objects of class mcmc.sirt. Objects of this class are generated by the following functions: mcmc.2pno, mcmc.2pnoh, mcmc.3pno.testlet, mcmc.2pno.ml
Usage
## S3 method for class 'mcmc.sirt'
summary(object,digits=3, file=NULL, ...)
Arguments
object |
Object of class |
digits |
Number of digits after decimal |
file |
Optional file name to which |
... |
Further arguments to be passed |
See Also
mcmc.2pno, mcmc.2pnoh, mcmc.3pno.testlet, mcmc.2pno.ml
Converting a Fitted TAM Object into a mirt Object
Description
Converts a fitted TAM object into a mirt object. As a by-product, lavaan syntax is generated which can be used with lavaan2mirt for re-estimating the model in the mirt package. Currently, only single-group models are supported, and there must be no background covariates (i.e., no latent regression models).
Usage
tam2mirt(tamobj)
Arguments
tamobj |
Object of class |
Value
A list with following entries
mirt |
Object generated by |
mirt.model |
Generated |
mirt.syntax |
Generated |
mirt.pars |
Generated parameter specifications
in |
lavaan.model |
Used |
dat |
Used dataset. If necessary, only items used in the model are included in the dataset. |
lavaan.syntax.fixed |
Generated |
lavaan.syntax.freed |
Generated |
See Also
See mirt.wrapper for convenience wrapper functions for mirt objects. See lavaan2mirt for converting lavaan syntax to mirt syntax.
Examples
## Not run:
library(TAM)
library(mirt)
#############################################################################
# EXAMPLE 1: Estimations in TAM for data.read dataset
#############################################################################
data(data.read)
dat <- data.read
#**************************************
#*** Model 1: Rasch model
#**************************************
# estimation in TAM package
mod <- TAM::tam.mml( dat )
summary(mod)
# conversion to mirt
res <- sirt::tam2mirt(mod)
# generated lavaan syntax
cat(res$lavaan.syntax.fixed)
cat(res$lavaan.syntax.freed)
# extract object of class mirt
mres <- res$mirt
# print and parameter values
print(mres)
mirt::mod2values(mres)
# model fit
mirt::M2(mres)
# residual statistics
mirt::residuals(mres, type="Q3")
mirt::residuals(mres, type="LD")
# item fit
mirt::itemfit(mres)
# person fit
mirt::personfit(mres)
# compute several types of factor scores (quite slow)
f1 <- mirt::fscores(mres, method='WLE',response.pattern=dat[1:10,])
# method=MAP and EAP also possible
# item plot
mirt::itemplot(mres,"A3") # item A3
mirt::itemplot(mres,4) # fourth item
# some more plots
plot(mres,type="info")
plot(mres,type="score")
plot(mres,type="trace")
# compare estimates with estimated Rasch model in mirt
mres1 <- mirt::mirt(dat,1,"Rasch" )
print(mres1)
mirt.wrapper.coef(mres1)
#**************************************
#*** Model 2: 2PL model
#**************************************
# estimation in TAM
mod <- TAM::tam.mml.2pl( dat )
summary(mod)
# conversion to mirt
res <- sirt::tam2mirt(mod)
mres <- res$mirt
# lavaan syntax
cat(res$lavaan.syntax.fixed)
cat(res$lavaan.syntax.freed)
# parameter estimates
print(mres)
mod2values(mres)
mres@nest # number of estimated parameters
# some plots
plot(mres,type="info")
plot(mres,type="score")
plot(mres,type="trace")
# model fit
mirt::M2(mres)
# residual statistics
mirt::residuals(mres, type="Q3")
mirt::residuals(mres, type="LD")
# item fit
mirt::itemfit(mres)
#**************************************
#*** Model 3: 3-dimensional Rasch model
#**************************************
# define Q-matrix
Q <- matrix( 0, nrow=12, ncol=3 )
Q[ cbind(1:12, rep(1:3,each=4) ) ] <- 1
rownames(Q) <- colnames(dat)
colnames(Q) <- c("A","B","C")
# estimation in TAM
mod <- TAM::tam.mml( resp=dat, Q=Q, control=list(snodes=1000,maxiter=30) )
summary(mod)
# mirt conversion
res <- sirt::tam2mirt(mod)
mres <- res$mirt
# mirt syntax
cat(res$mirt.syntax)
## Dim01=1,2,3,4
## Dim02=5,6,7,8
## Dim03=9,10,11,12
## COV=Dim01*Dim01,Dim02*Dim02,Dim03*Dim03,Dim01*Dim02,Dim01*Dim03,Dim02*Dim03
## MEAN=Dim01,Dim02,Dim03
# lavaan syntax
cat(res$lavaan.syntax.freed)
## Dim01=~ 1*A1+1*A2+1*A3+1*A4
## Dim02=~ 1*B1+1*B2+1*B3+1*B4
## Dim03=~ 1*C1+1*C2+1*C3+1*C4
## A1 | t1_1*t1
## A2 | t1_2*t1
## A3 | t1_3*t1
## A4 | t1_4*t1
## B1 | t1_5*t1
## B2 | t1_6*t1
## B3 | t1_7*t1
## B4 | t1_8*t1
## C1 | t1_9*t1
## C2 | t1_10*t1
## C3 | t1_11*t1
## C4 | t1_12*t1
## Dim01 ~ 0*1
## Dim02 ~ 0*1
## Dim03 ~ 0*1
## Dim01 ~~ Cov_11*Dim01
## Dim02 ~~ Cov_22*Dim02
## Dim03 ~~ Cov_33*Dim03
## Dim01 ~~ Cov_12*Dim02
## Dim01 ~~ Cov_13*Dim03
## Dim02 ~~ Cov_23*Dim03
# model fit
mirt::M2(mres)
# residual statistics
residuals(mres,type="LD")
# item fit
mirt::itemfit(mres)
#**************************************
#*** Model 4: 3-dimensional 2PL model
#**************************************
# estimation in TAM
mod <- TAM::tam.mml.2pl( resp=dat, Q=Q, control=list(snodes=1000,maxiter=30) )
summary(mod)
# mirt conversion
res <- sirt::tam2mirt(mod)
mres <- res$mirt
# generated lavaan syntax
cat(res$lavaan.syntax.fixed)
cat(res$lavaan.syntax.freed)
# write lavaan syntax on disk
sink( "mod4_lav_freed.txt", split=TRUE )
cat(res$lavaan.syntax.freed)
sink()
# some statistics from mirt
print(mres)
summary(mres)
mirt::M2(mres)
mirt::residuals(mres)
mirt::itemfit(mres)
# estimate mirt model by using the generated lavaan syntax with freed parameters
res2 <- sirt::lavaan2mirt( dat, res$lavaan.syntax.freed,
technical=list(NCYCLES=3), verbose=TRUE)
# use only a few cycles for illustration purposes
mirt.wrapper.coef(res2$mirt)
summary(res2$mirt)
print(res2$mirt)
#############################################################################
# EXAMPLE 4: mirt conversions for polytomous dataset data.big5
#############################################################################
data(data.big5)
# select some items
items <- c( grep( "O", colnames(data.big5), value=TRUE )[1:6],
grep( "N", colnames(data.big5), value=TRUE )[1:4] )
# O3 O8 O13 O18 O23 O28 N1 N6 N11 N16
dat <- data.big5[, items ]
library(psych)
psych::describe(dat)
library(TAM)
#******************
#*** Model 1: Partial credit model in TAM
mod1 <- TAM::tam.mml( dat[,1:6] )
summary(mod1)
# convert to mirt object
mmod1 <- sirt::tam2mirt( mod1 )
rmod1 <- mmod1$mirt
# coefficients in mirt
coef(rmod1)
mirt.wrapper.coef(rmod1)
# model fit
mirt::M2(rmod1)
# item fit
mirt::itemfit(rmod1)
# plots
plot(rmod1,type="trace")
plot(rmod1, type="trace", which.items=1:4 )
mirt::itemplot(rmod1,"O3")
#******************
#*** Model 2: Generalized partial credit model in TAM
mod2 <- TAM::tam.mml.2pl( dat[,1:6], irtmodel="GPCM" )
summary(mod2)
# convert to mirt object
mmod2 <- sirt::tam2mirt( mod2 )
rmod2 <- mmod2$mirt
# coefficients in mirt
mirt.wrapper.coef(rmod2)
# model fit
mirt::M2(rmod2)
# item fit
mirt::itemfit(rmod2)
## End(Not run)
Marginal Item Parameters from a Testlet (Bifactor) Model
Description
This function computes marginal item parameters of a general factor if item parameters from a testlet (bifactor) model are provided as an input (see Details).
Usage
testlet.marginalized(tam.fa.obj=NULL,a1=NULL, d1=NULL, testlet=NULL,
a.testlet=NULL, var.testlet=NULL)
Arguments
tam.fa.obj |
Optional object of class |
a1 |
Vector of item discriminations of general factor |
d1 |
Vector of item intercepts of general factor |
testlet |
Integer vector of testlet (bifactor) identifiers (must be integers
between 1 and the number of testlets) |
a.testlet |
Vector of testlet (bifactor) item discriminations |
var.testlet |
Vector of testlet (bifactor) variances |
Details
A testlet (bifactor) model is assumed to be estimated:
P(X_{pit}=1 | \theta_{p}, u_{pt} )=invlogit( a_{i1} \theta_p + a_t u_{pt} - d_{i} )
with Var( u_{pt} )=\sigma_t^2. This multidimensional item response model with locally independent items is equivalent to a unidimensional IRT model with locally dependent items (Ip, 2010). Marginal item parameters a_i^\ast and d_i^\ast are obtained according to the response equation
P(X_{pit}=1 | \theta_{p}^\ast )=invlogit( a_{i}^\ast \theta_p^\ast - d_{i}^\ast )
Calculation details can be found in Ip (2010).
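A hedged numerical sketch: under the commonly used logistic-normal approximation with constant c=16 \sqrt{3} / (15 \pi), marginal parameters are obtained by dividing by \sqrt{ 1 + c^2 a_t^2 \sigma_t^2 }. This reproduces the values of Example 1 below; see Ip (2010) for the exact treatment.
cc <- 16*sqrt(3) / ( 15*pi )
a1 <- 1 ; d1 <- -1.25      # general factor parameters of one item
a.t <- 1 ; var.t <- .8     # testlet slope and testlet variance
fac <- sqrt( 1 + cc^2 * a.t^2 * var.t )
round( c( a1_marg=a1/fac, d1_marg=d1/fac ), 2 )   # 0.89 and -1.11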
Value
A data frame containing all input item parameters, the marginal item intercepts d_i^\ast (d1_marg) and the marginal item slopes a_i^\ast (a1_marg).
References
Ip, E. H. (2010). Empirically indistinguishable multidimensional IRT and locally dependent unidimensional item response models. British Journal of Mathematical and Statistical Psychology, 63, 395-416.
See Also
For estimating a testlet (bifactor) model see TAM::tam.fa.
Examples
#############################################################################
# EXAMPLE 1: Small numeric example for Rasch testlet model
#############################################################################
# Rasch testlet model with 9 items contained into 3 testlets
# the third testlet has essentially no dependence and therefore
# no testlet variance
testlet <- rep( 1:3, each=3 )
a1 <- rep(1, 9 ) # item slopes first dimension
d1 <- rep( c(-1.25,0,1.5), 3 ) # item intercepts
a.testlet <- rep( 1, 9 ) # item slopes testlets
var.testlet <- c( .8, .2, 0 ) # testlet variances
# apply function
res <- sirt::testlet.marginalized( a1=a1, d1=d1, testlet=testlet,
a.testlet=a.testlet, var.testlet=var.testlet )
round( res, 2 )
## item testlet a1 d1 a.testlet var.testlet a1_marg d1_marg
## 1 1 1 1 -1.25 1 0.8 0.89 -1.11
## 2 2 1 1 0.00 1 0.8 0.89 0.00
## 3 3 1 1 1.50 1 0.8 0.89 1.33
## 4 4 2 1 -1.25 1 0.2 0.97 -1.21
## 5 5 2 1 0.00 1 0.2 0.97 0.00
## 6 6 2 1 1.50 1 0.2 0.97 1.45
## 7 7 3 1 -1.25 1 0.0 1.00 -1.25
## 8 8 3 1 0.00 1 0.0 1.00 0.00
## 9 9 3 1 1.50 1 0.0 1.00 1.50
## Not run:
#############################################################################
# EXAMPLE 2: Dataset reading
#############################################################################
library(TAM)
data(data.read)
resp <- data.read
maxiter <- 100
# Model 1: Rasch testlet model with 3 testlets
dims <- substring( colnames(resp),1,1 ) # define dimensions
mod1 <- TAM::tam.fa( resp=resp, irtmodel="bifactor1", dims=dims,
control=list(maxiter=maxiter) )
# marginal item parameters
res1 <- sirt::testlet.marginalized( mod1 )
#***
# Model 2: estimate bifactor model but assume that items 3 and 5 do not load on
# specific factors
dims1 <- dims
dims1[c(3,5)] <- NA
mod2 <- TAM::tam.fa( resp=resp, irtmodel="bifactor2", dims=dims1,
control=list(maxiter=maxiter) )
res2 <- sirt::testlet.marginalized( mod2 )
res2
## End(Not run)
Tetrachoric Correlation Matrix
Description
This function estimates a tetrachoric correlation matrix according to the maximum likelihood method of Olsson (1979; method="Ol"), the Tucker method (Method 2 of Froemel, 1971; method="Tu") or the method of Divgi (1979; method="Di"). In addition, the non-iterative approximation of Bonett and Price (2005; method="Bo") is provided.
Usage
tetrachoric2(dat, method="Ol", delta=0.007, maxit=1000000, cor.smooth=TRUE,
progress=TRUE)
Arguments
dat |
A data frame of dichotomous response |
method |
Computation method for calculating the tetrachoric correlation.
The ML method is |
delta |
The step parameter. It is set by default to |
maxit |
Maximum number of iterations. |
cor.smooth |
Should smoothing of the tetrachoric correlation matrix
be performed to ensure positive definiteness? Choosing
|
progress |
Display progress? Default is |
Value
A list with following entries
tau |
Item thresholds |
rho |
Tetrachoric correlation matrix |
Author(s)
Alexander Robitzsch
The code is adapted from an R script of Cengiz Zopluoglu. See http://sites.education.miami.edu/zopluoglu/software-programs/.
References
Bonett, D. G., & Price, R. M. (2005). Inferential methods for the tetrachoric correlation coefficient. Journal of Educational and Behavioral Statistics, 30(2), 213-225. doi:10.3102/10769986030002213
Divgi, D. R. (1979). Calculation of the tetrachoric correlation coefficient. Psychometrika, 44(2), 169-172. doi:10.1007/BF02293968
Froemel, E. C. (1971). A comparison of computer routines for the calculation of the tetrachoric correlation coefficient. Psychometrika, 36(2), 165-174. doi:10.1007/BF02291396
Olsson, U. (1979). Maximum likelihood estimation of the polychoric correlation coefficient. Psychometrika, 44(4), 443-460. doi:10.1007/BF02296207
See Also
See also the psych::tetrachoric function in the psych package and the irtoys::tet function in the irtoys package. See polychoric2 for estimating polychoric correlations.
Examples
#############################################################################
# EXAMPLE 1: data.read
#############################################################################
data(data.read)
# tetrachoric correlation from psych package
library(psych)
t0 <- psych::tetrachoric( data.read )$rho
# Olsson method (maximum likelihood estimation)
t1 <- sirt::tetrachoric2( data.read )$rho
# Divgi method
t2 <- sirt::tetrachoric2( data.read, method="Di" )$rho
# Tucker method
t3 <- sirt::tetrachoric2( data.read, method="Tu" )$rho
# Bonett method
t4 <- sirt::tetrachoric2( data.read, method="Bo" )$rho
# maximum absolute deviation ML method
max( abs( t0 - t1 ) )
## [1] 0.008224986
# maximum absolute deviation Divgi method
max( abs( t0 - t2 ) )
## [1] 0.1766688
# maximum absolute deviation Tucker method
max( abs( t0 - t3 ) )
## [1] 0.1766292
# maximum absolute deviation Bonett method
max( abs( t0 - t4 ) )
## [1] 0.05695522
Conversion of Trait Scores \theta into True Scores \tau(\theta)
Description
This function computes the true score \tau=\tau(\theta)=\sum_{i=1}^I P_i(\theta) in a unidimensional item response model with I items. In addition, it also transforms conditional standard errors if they are provided.
Usage
truescore.irt(A, B, c=NULL, d=NULL, theta=seq(-3, 3, len=21),
error=NULL, pid=NULL, h=0.001)
Arguments
A |
Matrix or vector of item slopes. See Examples for polytomous responses. |
B |
Matrix or vector of item intercepts. Note that the entries in
|
c |
Optional vector of guessing parameters |
d |
Optional vector of slipping parameters |
theta |
Vector of trait values |
error |
Optional vector of standard errors of trait |
pid |
Optional vector of person identifiers |
h |
Numerical differentiation parameter |
Details
In addition, the expected percent score function \pi(\theta)=\frac{1}{I} \cdot \tau( \theta ) is approximated by a logistic function
\pi ( \theta ) \approx l + ( u - l ) \cdot invlogit ( a \theta + b )
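For the Rasch model the computation is elementary; a minimal sketch with hypothetical item difficulties:
b <- seq( -2, 2, len=10 )   # hypothetical item difficulties
theta <- .5
tau <- sum( stats::plogis( theta - b ) )   # true score tau(theta)
pi_theta <- tau / length(b)                # expected percent score
c( tau, pi_theta )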
Value
A data frame with following columns:
truescore |
True scores |
truescore.error |
Standard errors of true scores |
percscore |
Expected correct scores which is |
percscore.error |
Standard errors of expected correct scores |
lower |
The |
upper |
The |
a |
The |
b |
The |
Examples
#############################################################################
# EXAMPLE 1: Dataset with mixed dichotomous and polytomous responses
#############################################################################
data(data.mixed1)
dat <- data.mixed1
#****
# Model 1: Partial credit model
# estimate model with TAM package
library(TAM)
mod1 <- TAM::tam.mml( dat )
# estimate person parameter estimates
wmod1 <- TAM::tam.wle( mod1 )
wmod1 <- wmod1[ order(wmod1$theta), ]
# extract item parameters
A <- mod1$B[,-1,1]
B <- mod1$AXsi[,-1]
# person parameters and standard errors
theta <- wmod1$theta
error <- wmod1$error
# estimate true score transformation
dfr <- sirt::truescore.irt( A=A, B=B, theta=theta, error=error )
# plot different person parameter estimates and standard errors
par(mfrow=c(2,2))
plot( theta, dfr$truescore, pch=16, cex=.6, xlab=expression(theta), type="l",
ylab=expression(paste( tau, "(",theta, ")" )), main="True Score Transformation" )
plot( theta, dfr$percscore, pch=16, cex=.6, xlab=expression(theta), type="l",
ylab=expression(paste( pi, "(",theta, ")" )), main="Percent Score Transformation" )
points( theta, dfr$lower + (dfr$upper-dfr$lower)*
stats::plogis(dfr$a*theta+dfr$b), col=2, lty=2)
plot( theta, error, pch=16, cex=.6, xlab=expression(theta), type="l",
ylab=expression(paste("SE(",theta, ")" )), main="Standard Error Theta" )
plot( dfr$truescore, dfr$truescore.error, pch=16, cex=.6, xlab=expression(tau),
ylab=expression(paste("SE(",tau, ")" ) ), main="Standard Error True Score Tau",
type="l")
par(mfrow=c(1,1))
## Not run:
#****
# Model 2: Generalized partial credit model
mod2 <- TAM::tam.mml.2pl( dat, irtmodel="GPCM")
# estimate person parameter estimates
wmod2 <- TAM::tam.wle( mod2 )
# extract item parameters
A <- mod2$B[,-1,1]
B <- mod2$AXsi[,-1]
# person parameters and standard errors
theta <- wmod2$theta
error <- wmod2$error
# estimate true score transformation
dfr <- sirt::truescore.irt( A=A, B=B, theta=theta, error=error )
#############################################################################
# EXAMPLE 2: Dataset Reading data.read
#############################################################################
data(data.read)
#****
# Model 1: estimate difficulty + guessing model
mod1 <- sirt::rasch.mml2( data.read, fixed.c=rep(.25,12) )
mod1$person <- mod1$person[ order( mod1$person$EAP), ]
# person parameters and standard errors
theta <- mod1$person$EAP
error <- mod1$person$SE.EAP
A <- rep(1,12)
B <- - mod1$item$b
c <- rep(.25,12)
# estimate true score transformation
dfr <- sirt::truescore.irt( A=A, B=B, theta=theta, error=error,c=c)
plot( theta, dfr$percscore, pch=16, cex=.6, xlab=expression(theta), type="l",
ylab=expression(paste( pi, "(",theta, ")" )), main="Percent Score Transformation" )
points( theta, dfr$lower + (dfr$upper-dfr$lower)*
stats::plogis(dfr$a*theta+dfr$b), col=2, lty=2)
#****
# Model 2: Rasch model
mod2 <- sirt::rasch.mml2( data.read )
# person parameters and standard errors
theta <- mod2$person$EAP
error <- mod2$person$SE.EAP
A <- rep(1,12)
B <- - mod2$item$b
# estimate true score transformation
dfr <- sirt::truescore.irt( A=A, B=B, theta=theta, error=error )
## End(Not run)
Test for Unidimensionality of CSN
Description
This function tests whether item covariances given the sum score are non-positive (CSN; see Junker, 1993), i.e. for items i and j it holds that
Cov( X_i, X_j | X^+ ) \le 0
Note that this function only works for dichotomous data.
Usage
unidim.test.csn(dat, RR=400, prop.perm=0.75, progress=TRUE)
Arguments
dat |
Data frame with dichotomous item responses. All persons with (some) missing responses are removed. |
RR |
Number of permutations used for statistical testing |
prop.perm |
A positive value indicating the amount of permutation in an existing permuted data set |
progress |
An optional logical indicating whether computation progress should be displayed |
Details
For each item pair (i,j) and each sum score group k, a conditional covariance r(i,j|k) is calculated. The test statistic for CSN is
h=\sum_{k=1}^{I-1} \frac{n_k}{n} \max_{i,j} r(i,j|k)
where n_k is the number of persons in score group k. Large values of h are not in agreement with the null hypothesis of non-positive conditional covariances. The distribution of the test statistic h under the null hypothesis is obtained empirically by column-wise permutation of items within all score groups. In the population, this procedure corresponds to conditional covariances of zero. See de Gooijer and Yuan (2011) for more details.
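A schematic sketch of the statistic (one plausible reading of the formula above; the package implementation may differ in details such as the handling of extreme score groups):
csn_stat <- function(dat){
    score <- rowSums(dat)
    n <- nrow(dat)
    h <- 0
    for (k in 1:( ncol(dat)-1 ) ){
        datk <- dat[ score==k, , drop=FALSE ]
        nk <- nrow(datk)
        if ( nk > 1 ){
            ck <- stats::cov(datk)
            diag(ck) <- -Inf    # exclude variances from the maximum
            h <- h + nk/n * max(ck)
        }
    }
    return(h)
}
# csn_stat(dat)   # compare with the stat value returned by unidim.test.csn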
Value
A list with following entries
stat |
Value of the statistic |
stat_perm |
Distribution of statistic under |
p |
The corresponding p value of the statistic |
H0_quantiles |
Quantiles of the statistic under permutation
(the null hypothesis |
References
De Gooijer, J. G., & Yuan, A. (2011). Some exact tests for manifest properties of latent trait models. Computational Statistics and Data Analysis, 55, 34-44.
Junker, B.W. (1993). Conditional association, essential independence, and monotone unidimensional item response models. Annals of Statistics, 21, 1359-1378.
Examples
#############################################################################
# EXAMPLE 1: Dataset data.read
#############################################################################
data(data.read)
dat <- data.read
set.seed(778)
res <- sirt::unidim.test.csn( dat )
## CSN Statistic=0.04737, p=0.02
## Not run:
#############################################################################
# EXAMPLE 2: CSN statistic for two-dimensional simulated data
#############################################################################
set.seed(775)
N <- 2000
I <- 30 # number of items
rho <- .60 # correlation between 2 dimensions
t0 <- stats::rnorm(N)
t1 <- sqrt(rho)*t0 + sqrt(1-rho)*stats::rnorm(N)
t2 <- sqrt(rho)*t0 + sqrt(1-rho)*stats::rnorm(N)
dat1 <- sirt::sim.raschtype(t1, b=seq(-1.5,1.5,length=I/2) )
dat2 <- sirt::sim.raschtype(t2, b=seq(-1.5,1.5,length=I/2) )
dat <- as.matrix(cbind( dat1, dat2) )
res <- sirt::unidim.test.csn( dat )
## CSN Statistic=0.06056, p=0.02
## End(Not run)
Weighted Likelihood Estimation of Person Abilities
Description
This function computes weighted likelihood estimates for dichotomous responses based on the Rasch model (Warm, 1989).
Usage
wle.rasch(dat, dat.resp=NULL, b, itemweights=1 + 0 * b,
theta=rep(0, nrow(dat)), conv=0.001, maxit=200,
wle.adj=0, progress=FALSE)
Arguments
dat |
An |
dat.resp |
Optional data frame with dichotomous response indicators |
b |
Vector of length |
itemweights |
Optional vector of fixed item discriminations |
theta |
Optional vector of initial person parameter estimates |
conv |
Convergence criterion |
maxit |
Maximal number of iterations |
wle.adj |
Constant for WLE adjustment |
progress |
Display progress? |
Value
A list with following entries
theta |
Estimated weighted likelihood estimate |
dat.resp |
Data frame with dichotomous response indicators. A one indicates
an observed response, a zero a missing response. See also |
p.ia |
Matrix with expected item response, i.e.
the probabilities |
wle |
WLE reliability (Adams, 2005) |
References
Adams, R. J. (2005). Reliability as a measurement design effect. Studies in Educational Evaluation, 31, 162-172.
Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54, 427-450.
See Also
For standard errors of weighted likelihood estimates estimated via jackknife see wle.rasch.jackknife.
For a joint estimation of item and person parameters see the joint maximum likelihood estimation method in rasch.jml.
Examples
#############################################################################
# EXAMPLE 1: Dataset Reading
#############################################################################
data(data.read)
# estimate the Rasch model
mod <- sirt::rasch.mml2(data.read)
mod$item
# estimate WLEs
mod.wle <- sirt::wle.rasch( dat=data.read, b=mod$item$b )
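In the Rasch model the WLE is a monotone function of the raw score, which can serve as a quick sanity check (a hedged addition to the example above):
# WLEs against raw scores; the points should fall on a monotone curve
graphics::plot( rowSums(data.read), mod.wle$theta, xlab="raw score",
    ylab="WLE" )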
Standard Error Estimation of WLE by Jackknifing
Description
This function calculates standard errors of WLEs (Warm, 1989) for stratified item designs and item designs with testlets for the Rasch model.
Usage
wle.rasch.jackknife(dat, b, itemweights=1 + 0 * b, pid=NULL,
testlet=NULL, stratum=NULL, size.itempop=NULL)
Arguments
dat |
An |
b |
Vector of item difficulties |
itemweights |
Weights for items, i.e. fixed item discriminations |
pid |
Person identifier |
testlet |
A vector of length |
stratum |
Item stratum |
size.itempop |
Number of items in an item stratum of the finite item population. |
Details
The idea of Jackknife in item response models can be found in Wainer and Wright (1980).
Value
A list with following entries:
wle |
Data frame with some estimated statistics. The column
|
wle.rel |
WLE reliability (Adams, 2005) |
References
Adams, R. J. (2005). Reliability as a measurement design effect. Studies in Educational Evaluation, 31(2-3), 162-172. doi:10.1016/j.stueduc.2005.05.008
Gershunskaya, J., Jiang, J., & Lahiri, P. (2009). Resampling methods in surveys. In D. Pfeffermann and C.R. Rao (Eds.). Handbook of Statistics 29B; Sample Surveys: Inference and Analysis (pp. 121-151). Amsterdam: North Holland. doi:10.1016/S0169-7161(09)00228-4
Wainer, H., & Wright, B. D. (1980). Robust estimation of ability in the Rasch model. Psychometrika, 45(3), 373-391. doi:10.1007/BF02293910
Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54(3), 427-450. doi:10.1007/BF02294627
Examples
#############################################################################
# EXAMPLE 1: Dataset Reading
#############################################################################
data(data.read)
dat <- data.read
# estimation of the Rasch model
res <- sirt::rasch.mml2( dat, parm.conv=.001)
# WLE estimation
wle1 <- sirt::wle.rasch(dat, b=res$item$thresh )
# simple jackknife WLE estimation
wle2 <- sirt::wle.rasch.jackknife(dat, b=res$item$thresh )
## WLE Reliability=0.651
# SE(WLE) for testlets A, B and C
wle3 <- sirt::wle.rasch.jackknife(dat, b=res$item$thresh,
testlet=substring( colnames(dat),1,1) )
## WLE Reliability=0.572
# SE(WLE) for item strata A,B, C
wle4 <- sirt::wle.rasch.jackknife(dat, b=res$item$thresh,
stratum=substring( colnames(dat),1,1) )
## WLE Reliability=0.683
# SE (WLE) for finite item strata
# A (10 items), B (7 items), C (4 items -> no sampling error)
# in every stratum 4 items were sampled
size.itempop <- c(10,7,4)
names(size.itempop) <- c("A","B","C")
wle5 <- sirt::wle.rasch.jackknife(dat, b=res$item$thresh,
stratum=substring( colnames(dat),1,1),
size.itempop=size.itempop )
## Stratum A (Mean) Correction Factor 0.6
## Stratum B (Mean) Correction Factor 0.42857
## Stratum C (Mean) Correction Factor 0
## WLE Reliability=0.876
# compare different estimated standard errors
a2 <- stats::aggregate( wle2$wle$wle.jackse, list( wle2$wle$wle), mean )
colnames(a2) <- c("wle", "se.simple")
a2$se.testlet <- stats::aggregate( wle3$wle$wle.jackse, list( wle3$wle$wle), mean )[,2]
a2$se.strata <- stats::aggregate( wle4$wle$wle.jackse, list( wle4$wle$wle), mean )[,2]
a2$se.finitepop.strata <- stats::aggregate( wle5$wle$wle.jackse,
list( wle5$wle$wle), mean )[,2]
round( a2, 3 )
## > round( a2, 3 )
## wle se.simple se.testlet se.strata se.finitepop.strata
## 1 -5.085 0.440 0.649 0.331 0.138
## 2 -3.114 0.865 1.519 0.632 0.379
## 3 -2.585 0.790 0.849 0.751 0.495
## 4 -2.133 0.715 1.177 0.546 0.319
## 5 -1.721 0.597 0.767 0.527 0.317
## 6 -1.330 0.633 0.623 0.617 0.377
## 7 -0.942 0.631 0.643 0.604 0.365
## 8 -0.541 0.655 0.678 0.617 0.384
## 9 -0.104 0.671 0.646 0.659 0.434
## 10 0.406 0.771 0.706 0.751 0.461
## 11 1.080 1.118 0.893 1.076 0.630
## 12 2.332 0.400 0.631 0.272 0.195
User Defined Item Response Model
Description
Estimates a user defined item response model. Both item response functions and latent trait distributions can be specified by the user (see Details). By default, the EM algorithm is used for estimation. The maximum number of EM iterations can be defined with the argument maxit. The xxirt function also allows Newton-Raphson optimization by specifying a value of maxit_nr larger than zero. Typically, a small initial number of EM iterations should be chosen to obtain reasonable starting values.
Usage
xxirt(dat, Theta=NULL, itemtype=NULL, customItems=NULL, partable=NULL,
customTheta=NULL, group=NULL, weights=NULL, globconv=1e-06, conv=1e-04,
maxit=1000, mstep_iter=4, mstep_reltol=1e-06, maxit_nr=0, optimizer_nr="nlminb",
control_nr=list(trace=1), h=1E-4, use_grad=TRUE, verbose=TRUE,
penalty_fun_item=NULL, np_fun_item=NULL, verbose_index=NULL,
cv_kfold=0, cv_maxit=10)
## S3 method for class 'xxirt'
summary(object, digits=3, file=NULL, ...)
## S3 method for class 'xxirt'
print(x, ...)
## S3 method for class 'xxirt'
anova(object,...)
## S3 method for class 'xxirt'
coef(object,...)
## S3 method for class 'xxirt'
logLik(object,...)
## S3 method for class 'xxirt'
vcov(object,...)
## S3 method for class 'xxirt'
confint(object, parm, level=.95, ... )
## S3 method for class 'xxirt'
IRT.expectedCounts(object,...)
## S3 method for class 'xxirt'
IRT.factor.scores(object, type="EAP", ...)
## S3 method for class 'xxirt'
IRT.irfprob(object,...)
## S3 method for class 'xxirt'
IRT.likelihood(object,...)
## S3 method for class 'xxirt'
IRT.posterior(object,...)
## S3 method for class 'xxirt'
IRT.modelfit(object,...)
## S3 method for class 'IRT.modelfit.xxirt'
summary(object,...)
## S3 method for class 'xxirt'
IRT.se(object,...)
# computes Hessian matrix
xxirt_hessian(object, h=1e-4, use_shortcut=TRUE)
Arguments
dat |
Data frame with item responses |
Theta |
Matrix with |
itemtype |
Vector of item types |
customItems |
List containing types of item response functions created by
|
partable |
Item parameter table which is initially created by
|
customTheta |
User defined |
group |
Optional vector of group indicators |
weights |
Optional vector of person weights |
globconv |
Convergence criterion for relative change in deviance |
conv |
Convergence criterion for absolute change in parameters |
maxit |
Maximum number of iterations in the EM algorithm |
mstep_iter |
Maximum number of iterations in M-step |
mstep_reltol |
Convergence criterion in M-step |
maxit_nr |
Number of Newton-Raphson iterations after EM algorithm |
optimizer_nr |
Type of optimizer for Newton-Raphson optimization.
Alternatives are |
control_nr |
Argument |
h |
Numerical differentiation parameter |
use_grad |
Logical indicating whether the gradient should be supplied
to |
verbose |
Logical indicating whether iteration progress should be displayed |
penalty_fun_item |
Optional penalty function used in regularized
estimation. Used as a function of |
np_fun_item |
Function that counts the number of item parameters in regularized
estimation. Used as a function of |
object |
Object of class |
digits |
Number of digits to be rounded |
file |
Optional file name to which |
parm |
Optional vector of parameters |
level |
Confidence level |
verbose_index |
Logical indicating whether item index should be printed in estimation output |
cv_kfold |
Number of k folds in cross validation. The default is 0 (no cross-validation) |
cv_maxit |
Maximum number of iterations for each cross-validation sample |
x |
Object of class |
type |
Type of person parameter estimate. Currently, only
type="EAP" is supported |
use_shortcut |
Logical indicating whether a shortcut in the computation should be utilized |
... |
Further arguments to be passed |
Details
Item response functions can be specified as functions of unknown parameters \bold{\delta}_i such that
P(X_{i}=x | \bold{\theta})=f_i( x | \bold{\theta} ; \bold{\delta}_i )
The item response model is estimated under the assumption of local stochastic independence of items. Equality constraints of item parameters \bold{\delta}_i among items are allowed. The probability distribution P(\bold{\theta}) is specified as a function of an unknown parameter vector \bold{\gamma}.
A penalty function for item parameters can be specified in penalty_fun_item. The penalty function should be differentiable; a non-differentiable function (e.g., the absolute value function) should be approximated by a differentiable function.
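A minimal sketch of such a penalty (a hypothetical LASSO-type penalty using the smooth approximation \sqrt{x^2+\varepsilon} of the absolute value; the tuning parameter lambda and the choice to penalize all item parameters are illustrative assumptions):
penalty_fun_item <- function(x){
    lambda <- .02   # hypothetical regularization parameter
    eps <- 1e-4
    # differentiable approximation of lambda * sum(abs(x))
    lambda * sum( sqrt( x^2 + eps ) )
}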
Value
List with following entries
partable |
Item parameter table |
par_items |
Vector with estimated item parameters |
par_items_summary |
Data frame with item parameters |
par_items_bounds |
Data frame with summary on bounds of estimated item parameters |
par_Theta |
Vector with estimated parameters of theta distribution |
Theta |
Matrix with |
probs_items |
Item response functions |
probs_Theta |
Theta distribution |
deviance |
Deviance |
loglik |
Log likelihood value |
ic |
Information criteria |
item_list |
List with item functions |
customItems |
Used customized item response functions |
customTheta |
Used customized theta distribution |
cv_loglike |
Cross-validated log-likelihood value (if |
p.xi.aj |
Individual likelihood |
p.aj.xi |
Individual posterior |
ll_case |
Case-wise log-likelihood values |
n.ik |
Array of expected counts |
EAP |
EAP person parameter estimates |
dat |
Used dataset with item responses |
dat_resp |
Dataset with response indicators |
weights |
Vector of person weights |
G |
Number of groups |
group |
Integer vector of group indicators |
group_orig |
Vector of original group_identifiers |
ncat |
Number of categories per item |
converged |
Logical whether model has converged |
iter |
Number of iterations needed |
See Also
See the mirt::createItem and mirt::mirt functions in the mirt package for similar functionality.
Examples
## Not run:
#############################################################################
## EXAMPLE 1: Unidimensional item response functions
#############################################################################
data(data.read)
dat <- data.read
#------ Definition of item response functions
#*** IRF 2PL
P_2PL <- function( par, Theta, ncat){
a <- par[1]
b <- par[2]
TP <- nrow(Theta)
P <- matrix( NA, nrow=TP, ncol=ncat)
P[,1] <- 1
for (cc in 2:ncat){
P[,cc] <- exp( (cc-1) * a * Theta[,1] - b )
}
P <- P / rowSums(P)
return(P)
}
#*** IRF 1PL
P_1PL <- function( par, Theta, ncat){
b <- par[1]
TP <- nrow(Theta)
P <- matrix( NA, nrow=TP, ncol=ncat)
P[,1] <- 1
for (cc in 2:ncat){
P[,cc] <- exp( (cc-1) * Theta[,1] - b )
}
P <- P / rowSums(P)
return(P)
}
#** create item classes of 1PL and 2PL models
par <- c( "a"=1, "b"=0 )
# define some slightly informative prior of 2PL
item_2PL <- sirt::xxirt_createDiscItem( name="2PL", par=par, est=c(TRUE,TRUE),
P=P_2PL, prior=c(a="dlnorm"), prior_par1=c( a=0 ),
prior_par2=c(a=5) )
item_1PL <- sirt::xxirt_createDiscItem( name="1PL", par=par[2], est=c(TRUE),
P=P_1PL )
customItems <- list( item_1PL, item_2PL )
#---- definition theta distribution
#** theta grid
Theta <- matrix( seq(-6,6,length=21), ncol=1 )
#** theta distribution
P_Theta1 <- function( par, Theta, G){
mu <- par[1]
sigma <- max( par[2], .01 )
TP <- nrow(Theta)
pi_Theta <- matrix( 0, nrow=TP, ncol=G)
pi1 <- dnorm( Theta[,1], mean=mu, sd=sigma )
pi1 <- pi1 / sum(pi1)
pi_Theta[,1] <- pi1
return(pi_Theta)
}
#** create distribution class
par_Theta <- c( "mu"=0, "sigma"=1 )
customTheta <- sirt::xxirt_createThetaDistribution( par=par_Theta, est=c(FALSE,TRUE),
P=P_Theta1 )
#****************************************************************************
#******* Model 1: Rasch model
#-- create parameter table
itemtype <- rep( "1PL", 12 )
partable <- sirt::xxirt_createParTable( dat, itemtype=itemtype,
customItems=customItems )
# estimate model
mod1 <- sirt::xxirt( dat=dat, Theta=Theta, partable=partable,
customItems=customItems, customTheta=customTheta)
summary(mod1)
# estimate Rasch model by providing starting values
partable1 <- sirt::xxirt_modifyParTable( partable, parname="b",
value=- stats::qlogis( colMeans(dat) ) )
# estimate model again
mod1b <- sirt::xxirt( dat=dat, Theta=Theta, partable=partable1,
customItems=customItems, customTheta=customTheta )
summary(mod1b)
# extract coefficients, covariance matrix and standard errors
coef(mod1b)
vcov(mod1b)
IRT.se(mod1b)
#** start with EM and finalize with Newton-Raphson algorithm
mod1c <- sirt::xxirt( dat=dat, Theta=Theta, partable=partable,
customItems=customItems, customTheta=customTheta,
maxit=20, maxit_nr=300)
summary(mod1c)
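# a brief sketch using the documented output components: inspect whether
# both fitting strategies converged and how many iterations they needed
mod1$converged; mod1$iter
mod1c$converged; mod1c$iter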
#****************************************************************************
#******* Model 2: 2PL Model with three groups of item discriminations
#-- create parameter table
itemtype <- rep( "2PL", 12 )
partable <- sirt::xxirt_createParTable( dat, itemtype=itemtype, customItems=customItems)
# modify parameter table: set constraints for item groups A, B and C
partable1 <- sirt::xxirt_modifyParTable(partable, item=paste0("A",1:4),
parname="a", parindex=111)
partable1 <- sirt::xxirt_modifyParTable(partable1, item=paste0("B",1:4),
parname="a", parindex=112)
partable1 <- sirt::xxirt_modifyParTable(partable1, item=paste0("C",1:4),
parname="a", parindex=113)
# delete prior distributions
partable1 <- sirt::xxirt_modifyParTable(partable1, parname="a", prior=NA)
#-- fix sigma to 1
customTheta1 <- customTheta
customTheta1$est <- c("mu"=FALSE,"sigma"=FALSE )
# estimate model
mod2 <- sirt::xxirt( dat=dat, Theta=Theta, partable=partable1,
customItems=customItems, customTheta=customTheta1 )
summary(mod2)
#****************************************************************************
#******* Model 3: Cloglog link function
#*** IRF cloglog
P_1N <- function( par, Theta, ncat){
b <- par
TP <- nrow(Theta)
P <- matrix( NA, nrow=TP, ncol=ncat)
P[,2] <- 1 - exp( - exp( Theta[,1] - b ) )
P[,1] <- 1 - P[,2]
return(P)
}
par <- c("b"=0)
item_1N <- sirt::xxirt_createDiscItem( name="1N", par=par, est=c(TRUE),
P=P_1N )
customItems <- list( item_1N )
itemtype <- rep( "1N", I )
partable <- sirt::xxirt_createParTable( dat[,items], itemtype=itemtype,
customItems=customItems )
partable <- sirt::xxirt_modifyParTable( partable=partable, parname="b",
value=- stats::qnorm( colMeans(dat[,items] )) )
#*** estimate model
mod3 <- sirt::xxirt( dat=dat, Theta=Theta, partable=partable, customItems=customItems,
customTheta=customTheta )
summary(mod3)
IRT.compareModels(mod1,mod3)
#****************************************************************************
#******* Model 4: Latent class model
K <- 3 # number of classes
Theta <- diag(K)
#*** Theta distribution
P_Theta1 <- function( par, Theta, G ){
logitprobs <- par[1:(K-1)]
l1 <- exp( c( logitprobs, 0 ) )
probs <- matrix( l1/sum(l1), ncol=1)
return(probs)
}
par_Theta <- stats::qlogis( rep( 1/K, K-1 ) )
names(par_Theta) <- paste0("pi",1:(K-1) )
customTheta <- sirt::xxirt_createThetaDistribution( par=par_Theta,
est=rep(TRUE,K-1), P=P_Theta1)
#*** IRF latent class
P_lc <- function( par, Theta, ncat){
b <- par
TP <- nrow(Theta)
P <- matrix( NA, nrow=TP, ncol=ncat)
P[,1] <- 1
for (cc in 2:ncat){
P[,cc] <- exp( Theta %*% b )
}
P <- P / rowSums(P)
return(P)
}
par <- seq( -1.5, 1.5, length=K )
names(par) <- paste0("b",1:K)
item_lc <- sirt::xxirt_createDiscItem( name="LC", par=par,
est=rep(TRUE,K), P=P_lc )
customItems <- list( item_lc )
# create parameter table
itemtype <- rep( "LC", 12 )
partable <- sirt::xxirt_createParTable( dat, itemtype=itemtype, customItems=customItems)
partable
#*** estimate model
mod4 <- sirt::xxirt( dat=dat, Theta=Theta, partable=partable, customItems=customItems,
customTheta=customTheta)
summary(mod4)
# class probabilities
mod4$probs_Theta
# item response functions
imod4 <- IRT.irfprob( mod4 )
round( imod4[,2,], 3 )
#****************************************************************************
#******* Model 5: Ordered latent class model
K <- 3 # number of classes
Theta <- diag(K)
Theta <- apply( Theta, 1, cumsum )
#*** Theta distribution
P_Theta1 <- function( par, Theta, G ){
logitprobs <- par[1:(K-1)]
l1 <- exp( c( logitprobs, 0 ) )
probs <- matrix( l1/sum(l1), ncol=1)
return(probs)
}
par_Theta <- stats::qlogis( rep( 1/K, K-1 ) )
names(par_Theta) <- paste0("pi",1:(K-1) )
customTheta <- sirt::xxirt_createThetaDistribution( par=par_Theta,
est=rep(TRUE,K-1), P=P_Theta1 )
#*** IRF ordered latent class
P_olc <- function( par, Theta, ncat){
b <- par
TP <- nrow(Theta)
P <- matrix( NA, nrow=TP, ncol=ncat)
P[,1] <- 1
for (cc in 2:ncat){
P[,cc] <- exp( Theta %*% b )
}
P <- P / rowSums(P)
return(P)
}
par <- c( -1, rep( 0.5, K-1 ) )
names(par) <- paste0("b",1:K)
item_olc <- sirt::xxirt_createDiscItem( name="OLC", par=par, est=rep(TRUE,K),
P=P_olc, lower=c( -Inf, 0, 0 ) )
customItems <- list( item_olc )
itemtype <- rep( "OLC", 12 )
partable <- sirt::xxirt_createParTable( dat, itemtype=itemtype, customItems=customItems)
partable
#*** estimate model
mod5 <- sirt::xxirt( dat=dat, Theta=Theta, partable=partable, customItems=customItems,
customTheta=customTheta )
summary(mod5)
# estimated item response functions
imod5 <- IRT.irfprob( mod5 )
round( imod5[,2,], 3 )
#############################################################################
## EXAMPLE 2: Multiple group models with xxirt
#############################################################################
data(data.math)
dat <- data.math$data
items <- grep( "M[A-Z]", colnames(dat), value=TRUE )
I <- length(items)
Theta <- matrix( seq(-8,8,len=31), ncol=1 )
#****************************************************************************
#******* Model 1: Rasch model, single group
#*** Theta distribution
P_Theta1 <- function( par, Theta, G ){
mu <- par[1]
sigma <- max( par[2], .01 )
p1 <- stats::dnorm( Theta[,1], mean=mu, sd=sigma)
p1 <- p1 / sum(p1)
probs <- matrix( p1, ncol=1)
return(probs)
}
par_Theta <- c(0,1)
names(par_Theta) <- c("mu","sigma")
customTheta <- sirt::xxirt_createThetaDistribution( par=par_Theta,
est=c(FALSE,TRUE), P=P_Theta1 )
customTheta
#*** IRF 1PL logit
P_1PL <- function( par, Theta, ncat){
b <- par
TP <- nrow(Theta)
P <- matrix( NA, nrow=TP, ncol=ncat)
P[,2] <- stats::plogis( Theta[,1] - b )
P[,1] <- 1 - P[,2]
return(P)
}
par <- c("b"=0)
item_1PL <- sirt::xxirt_createDiscItem( name="1PL", par=par, est=c(TRUE), P=P_1PL)
customItems <- list( item_1PL )
itemtype <- rep( "1PL", I )
partable <- sirt::xxirt_createParTable( dat[,items], itemtype=itemtype,
customItems=customItems )
partable <- sirt::xxirt_modifyParTable( partable=partable, parname="b",
value=- stats::qlogis( colMeans(dat[,items] )) )
#*** estimate model
mod1 <- sirt::xxirt( dat=dat[,items], Theta=Theta, partable=partable,
customItems=customItems, customTheta=customTheta )
summary(mod1)
#****************************************************************************
#******* Model 2: Rasch model, multiple groups
#*** Theta distribution
P_Theta2 <- function( par, Theta, G ){
mu1 <- par[1]
mu2 <- par[2]
sigma1 <- max( par[3], .01 )
sigma2 <- max( par[4], .01 )
TP <- nrow(Theta)
probs <- matrix( NA, nrow=TP, ncol=G)
p1 <- stats::dnorm( Theta[,1], mean=mu1, sd=sigma1)
probs[,1] <- p1 / sum(p1)
p1 <- stats::dnorm( Theta[,1], mean=mu2, sd=sigma2)
probs[,2] <- p1 / sum(p1)
return(probs)
}
par_Theta <- c(0,0,1,1)
names(par_Theta) <- c("mu1","mu2","sigma1","sigma2")
customTheta2 <- sirt::xxirt_createThetaDistribution( par=par_Theta,
est=c(FALSE,TRUE,TRUE,TRUE), P=P_Theta2 )
print(customTheta2)
#*** estimate model
mod2 <- sirt::xxirt( dat=dat[,items], group=dat$female, Theta=Theta, partable=partable,
customItems=customItems, customTheta=customTheta2, maxit=40)
summary(mod2)
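# estimated theta distribution parameters of the two groups (mu1 was fixed at 0)
mod2$par_Theta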
IRT.compareModels(mod1, mod2)
#*** compare results with TAM package
library(TAM)
mod2b <- TAM::tam.mml( resp=dat[,items], group=dat$female )
summary(mod2b)
IRT.compareModels(mod1, mod2, mod2b)
#############################################################################
## EXAMPLE 3: Regularized 2PL model
#############################################################################
data(data.read, package="sirt")
dat <- data.read
#------ Definition of item response functions
#*** IRF 2PL
P_2PL <- function( par, Theta, ncat){
a <- par[1]
b <- par[2]
TP <- nrow(Theta)
P <- matrix( NA, nrow=TP, ncol=ncat)
P[,1] <- 1
for (cc in 2:ncat){
P[,cc] <- exp( (cc-1) * a * Theta[,1] - b )
}
P <- P / rowSums(P)
return(P)
}
#** create item class of the 2PL model
par <- c( "a"=1, "b"=0 )
# define a slightly informative prior for the 2PL discrimination
item_2PL <- sirt::xxirt_createDiscItem( name="2PL", par=par, est=c(TRUE,TRUE),
P=P_2PL, prior=c(a="dlnorm"), prior_par1=c( a=0 ),
prior_par2=c(a=5) )
customItems <- list( item_2PL )
#---- definition theta distribution
#** theta grid
Theta <- matrix( seq(-6,6,length=21), ncol=1 )
#** theta distribution
P_Theta1 <- function( par, Theta, G){
mu <- par[1]
sigma <- max( par[2], .01 )
TP <- nrow(Theta)
pi_Theta <- matrix( 0, nrow=TP, ncol=G)
pi1 <- stats::dnorm( Theta[,1], mean=mu, sd=sigma )
pi1 <- pi1 / sum(pi1)
pi_Theta[,1] <- pi1
return(pi_Theta)
}
#** create distribution class
par_Theta <- c( "mu"=0, "sigma"=1 )
customTheta <- sirt::xxirt_createThetaDistribution( par=par_Theta, est=c(FALSE,FALSE),
P=P_Theta1 )
#****************************************************************************
#******* Model 1: 2PL model
itemtype <- rep( "2PL", 12 )
partable <- sirt::xxirt_createParTable( dat, itemtype=itemtype,
customItems=customItems )
mod1 <- sirt::xxirt( dat=dat, Theta=Theta, partable=partable,
customItems=customItems, customTheta=customTheta)
summary(mod1)
#****************************************************************************
#******* Model 2: Regularized 2PL model with regularization on item loadings
# define regularized estimation of item loadings
parindex <- partable[ partable$parname=="a","parindex"]
#** the penalty N*lambda*sum_i (a_i-1)^2 is subtracted from the log-likelihood
N <- nrow(dat)
lambda <- .02
penalty_fun_item <- function(x)
{
val <- N*lambda*sum( ( x[parindex]-1)^2)
return(val)
}
# estimate standard deviation
customTheta1 <- sirt::xxirt_createThetaDistribution( par=par_Theta, est=c(FALSE,TRUE),
P=P_Theta1 )
mod2 <- sirt::xxirt( dat=dat, Theta=Theta, partable=partable,
customItems=customItems, customTheta=customTheta1,
penalty_fun_item=penalty_fun_item)
summary(mod2)
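# a hedged follow-up sketch: compare item loadings between the unpenalized
# and the penalized model; selecting the "a" parameters by name is an
# assumption about the parameter labels, so inspect names(coef(mod1)) first
c1 <- coef(mod1); c2 <- coef(mod2)
round( cbind( unpenalized=c1[ grep("^a", names(c1)) ],
              penalized=c2[ grep("^a", names(c2)) ] ), 3 )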
#############################################################################
## EXAMPLE 4: 2PL mixture model
#############################################################################
#*** simulate data
set.seed(123)
N <- 4000 # number of persons
I <- 15 # number of items
prop <- .25 # mixture proportion for second class
# discriminations and difficulties in first class
a1 <- rep(1,I)
b1 <- seq(-2,2,len=I)
# distribution in second class
mu2 <- 1
sigma2 <- 1.2
# compute parameters with constraint N(0,1) in second class
# a*(sigma*theta+mu-b)=a*sigma*(theta-(b-mu)/sigma)
#=> a2=a*sigma and b2=(b-mu)/sigma
a2 <- a1
a2[c(2,4,6,8)] <- 0.2 # some items with different discriminations
a2 <- a2*sigma2
b2 <- b1
b2[1:5] <- 1 # first 5 items with different difficulties
b2 <- (b2-mu2)/sigma2
dat1 <- sirt::sim.raschtype(theta=stats::rnorm(N*(1-prop)), b=b1, fixed.a=a1)
dat2 <- sirt::sim.raschtype(theta=stats::rnorm(N*prop), b=b2, fixed.a=a2)
dat <- rbind(dat1, dat2)
#**** model specification
#*** define theta distribution
TP <- 21
theta <- seq(-6,6,length=TP)
# stack theta vectors below each other => 2 latent classes
Theta <- matrix( c(theta, theta ), ncol=1 )
# distribution of theta (i.e., N(0,1))
w_theta <- stats::dnorm(theta)
w_theta <- w_theta / sum(w_theta)
P_Theta1 <- function( par, Theta, G){
p2_logis <- par[1]
p2 <- stats::plogis( p2_logis )
p1 <- 1-p2
pi_Theta <- c( p1*w_theta, p2*w_theta)
pi_Theta <- matrix(pi_Theta, ncol=1)
return(pi_Theta)
}
par_Theta <- c( p2_logis=stats::qlogis(.25))
customTheta <- sirt::xxirt_createThetaDistribution( par=par_Theta, est=c(TRUE),
P=P_Theta1)
# IRF for 2-class mixture 2PL model
par <- c(a1=1, a2=1, b1=0, b2=.5)
P_2PLmix <- function( par, Theta, ncat)
{
a1 <- par[1]
a2 <- par[2]
b1 <- par[3]
b2 <- par[4]
TP <- nrow(Theta)/2
P <- matrix( NA, nrow=2*TP, ncol=ncat)
P1 <- stats::plogis( a1*(Theta[1:TP,1]-b1) )
P2 <- stats::plogis( a2*(Theta[(TP+1):(2*TP),1]-b2) )
P[,2] <- c(P1, P2)
P[,1] <- 1-P[,2]
return(P)
}
# create the item class of the mixture 2PL model
item_2PLmix <- sirt::xxirt_createDiscItem( name="2PLmix", par=par,
est=c(TRUE,TRUE,TRUE,TRUE), P=P_2PLmix )
customItems <- list( item_2PLmix )
#****************************************************************************
#******* Model 1: 2PL mixture model
itemtype <- rep( "2PLmix", I )
partable <- sirt::xxirt_createParTable( dat, itemtype=itemtype,
customItems=customItems )
mod1 <- sirt::xxirt( dat=dat, Theta=Theta, partable=partable,
customItems=customItems, customTheta=customTheta)
summary(mod1)
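# a short check against the simulated values: the estimated theta parameter
# is on the logit scale ('p2_logis'), so the inverse logit should recover a
# mixture proportion close to the simulated prop=.25
stats::plogis( mod1$par_Theta )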
## End(Not run)
Create Item Response Functions and Item Parameter Table
Description
Create item response functions and item parameter table
Usage
xxirt_createDiscItem( name, par, est, P, lower=-Inf, upper=Inf,
prior=NULL, prior_par1=NULL, prior_par2=NULL)
xxirt_createParTable(dat, itemtype, customItems=NULL)
xxirt_modifyParTable( partable, parname, item=NULL, value=NULL,
est=NULL, parlabel=NULL, parindex=NULL, lower=NULL,
upper=NULL, prior=NULL, prior_par1=NULL, prior_par2=NULL )
Arguments
name |
Type of item response function |
par |
Named vector of starting values of item parameters |
est |
Logical vector indicating which parameters should be estimated |
P |
Item response function |
lower |
Lower bounds |
upper |
Upper bounds |
prior |
Prior distribution |
prior_par1 |
First parameter of prior distribution |
prior_par2 |
Second parameter of prior distribution |
dat |
Data frame with item responses |
itemtype |
Vector of item types |
customItems |
List with item objects created by xxirt_createDiscItem |
partable |
Item parameter table |
parname |
Parameter name |
item |
Item name(s) whose parameters are modified |
value |
Value of item parameter |
parindex |
Parameter index |
parlabel |
Item parameter label |
See Also
See mirt::createItem for similar functionality.
Examples
#############################################################################
## EXAMPLE 1: Definition of item response functions
#############################################################################
data(data.read)
dat <- data.read
#------ Definition of item response functions
#*** IRF 2PL
P_2PL <- function( par, Theta, ncat){
a <- par[1]
b <- par[2]
TP <- nrow(Theta)
P <- matrix( NA, nrow=TP, ncol=ncat)
P[,1] <- 1
for (cc in 2:ncat){
P[,cc] <- exp( (cc-1) * a * Theta[,1] - b )
}
P <- P / rowSums(P)
return(P)
}
#*** IRF 1PL
P_1PL <- function( par, Theta, ncat){
b <- par[1]
TP <- nrow(Theta)
par0 <- c(1,b)
P <- P_2PL( par=par0, Theta=Theta, ncat=ncat)
return(P)
}
#** create item classes of 1PL and 2PL models
par <- c( "a"=1, "b"=0 )
# define a slightly informative prior for the 2PL discrimination
item_2PL <- sirt::xxirt_createDiscItem( name="2PL", par=par, est=c(TRUE,TRUE),
P=P_2PL, prior=c( a="dlnorm"), prior_par1=c(a=0),
prior_par2=c(a=5) )
item_1PL <- sirt::xxirt_createDiscItem( name="1PL", par=par[2], est=c(TRUE),
P=P_1PL )
# list of item classes in customItems
customItems <- list( item_1PL, item_2PL )
#-- create parameter table
itemtype <- rep( "1PL", 12 )
partable <- sirt::xxirt_createParTable(dat, itemtype=itemtype, customItems=customItems)
# provide starting values
partable1 <- sirt::xxirt_modifyParTable( partable, parname="b",
value=- stats::qlogis( colMeans(dat) ) )
# equality constraint of parameters and definition of lower bounds
partable1 <- sirt::xxirt_modifyParTable( partable1, item=c("A1","A2"),
parname="b", parindex=110, lower=-1, value=0)
print(partable1)
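# the modified parameter table can then be passed to sirt::xxirt(); a sketch
# (assuming a theta grid Theta and a distribution object customTheta as in
# the examples of the xxirt function):
# mod <- sirt::xxirt( dat=dat, Theta=Theta, partable=partable1,
#          customItems=customItems, customTheta=customTheta )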
Creates a User Defined Theta Distribution
Description
Creates a user defined theta distribution.
Usage
xxirt_createThetaDistribution(par, est, P, prior=NULL, prior_par1=NULL,
prior_par2=NULL, lower=NULL, upper=NULL)
Arguments
par |
Parameter vector with starting values |
est |
Vector of logicals indicating which parameters should be estimated |
P |
Distribution function for theta |
prior |
Prior distribution |
prior_par1 |
First parameter of prior distribution |
prior_par2 |
Second parameter of prior distribution |
lower |
Lower bounds for parameters |
upper |
Upper bounds for parameters |
See Also
See sirt::xxirt for estimating item response models with a user-defined theta distribution.
Examples
#############################################################################
## EXAMPLE 1: Definition of theta distribution
#############################################################################
#** theta grid
Theta <- matrix( seq(-10,10,length=31), ncol=1 )
#** theta distribution
P_Theta1 <- function( par, Theta, G){
mu <- par[1]
sigma <- max( par[2], .01 )
TP <- nrow(Theta)
pi_Theta <- matrix( 0, nrow=TP, ncol=G)
pi1 <- stats::dnorm( Theta[,1], mean=mu, sd=sigma )
pi1 <- pi1 / sum(pi1)
pi_Theta[,1] <- pi1
return(pi_Theta)
}
#** create distribution class
par_Theta <- c( "mu"=0, "sigma"=1 )
customTheta <- sirt::xxirt_createThetaDistribution( par=par_Theta,
est=c(FALSE,TRUE), P=P_Theta1 )
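# the created object is passed to sirt::xxirt() via the customTheta argument;
# a sketch (assuming dat, partable and customItems as in the xxirt examples):
# mod <- sirt::xxirt( dat=dat, Theta=Theta, partable=partable,
#          customItems=customItems, customTheta=customTheta )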