Type: | Package |
Title: | Functions for Learning Bayesian Inference |
Version: | 2.15.1 |
Date: | 2018-03-18 |
Author: | Jim Albert |
Maintainer: | Jim Albert <albert@bgsu.edu> |
LazyData: | yes |
Description: | A collection of functions helpful in learning the basic tenets of Bayesian statistical inference. It contains functions for summarizing basic one and two parameter posterior distributions and predictive distributions. It contains MCMC algorithms for summarizing posterior distributions defined by the user. It also contains functions for regression models, hierarchical models, Bayesian tests, and illustrations of Gibbs sampling. |
License: | GPL-2 | GPL-3 [expanded from: GPL (≥ 2)] |
Packaged: | 2018-03-18 17:46:55 UTC; jamesalbert |
NeedsCompilation: | no |
Repository: | CRAN |
Date/Publication: | 2018-03-18 20:41:13 UTC |
School achievement data
Description
Achievement data for a group of Austrian school children
Usage
achievement
Format
A data frame with 109 observations on the following 7 variables.
- Gen
gender of child where 0 is male and 1 is female
- Age
age in months
- IQ
IQ score
- math1
test score on mathematics computation
- math2
test score on mathematics problem solving
- read1
test score on reading speed
- read2
test score on reading comprehension
Source
Abraham, B., and Ledolter, J. (2006), Introduction to Regression Modeling, Duxbury.
Team records in the 1964 National League baseball season
Description
Head to head records for all teams in the 1964 National League baseball season. Teams are coded as Cincinnati (1), Chicago (2), Houston (3), Los Angeles (4), Milwaukee (5), New York (6), Philadelphia (7), Pittsburgh (8), San Francisco (9), and St. Louis (10).
Usage
baseball.1964
Format
A data frame with 45 observations on the following 4 variables.
- Team.1
Number of team 1
- Team.2
Number of team 2
- Wins.Team1
Number of games won by team 1
- Wins.Team2
Number of games won by team 2
Source
www.baseball-reference.com website.
Observation sensitivity analysis in beta-binomial model
Description
Computes probability intervals for the log precision parameter K in a beta-binomial model for all "leave one out" models using sampling importance resampling
Usage
bayes.influence(theta,data)
Arguments
theta |
matrix of simulated draws from the posterior of (logit eta, log K) |
data |
matrix with columns of counts and sample sizes |
Value
summary |
vector of 5th, 50th, 95th percentiles of log K for complete sample posterior |
summary.obs |
matrix where the ith row contains the 5th, 50th, 95th percentiles of log K for posterior when the ith observation is removed |
Author(s)
Jim Albert
Examples
data(cancermortality)
start=array(c(-7,6),c(1,2))
fit=laplace(betabinexch,start,cancermortality)
tpar=list(m=fit$mode,var=2*fit$var,df=4)
theta=sir(betabinexch,tpar,1000,cancermortality)
intervals=bayes.influence(theta,cancermortality)
Bayesian regression model selection using G priors
Description
Using Zellner's G priors, computes the log marginal density for all possible regression models
Usage
bayes.model.selection(y, X, c, constant=TRUE)
Arguments
y |
vector of response values |
X |
matrix of covariates |
c |
parameter of the G prior |
constant |
logical variable indicating if a constant term is in the matrix X |
Value
mod.prob |
data frame specifying the model, the value of the log marginal density and the value of the posterior model probability |
converge |
logical vector indicating if the laplace algorithm converged for each model |
Author(s)
Jim Albert
Examples
data(birdextinct)
logtime=log(birdextinct$time)
X=cbind(1,birdextinct$nesting,birdextinct$size,birdextinct$status)
bayes.model.selection(logtime,X,100)
Simulates from a probit binary response regression model using data augmentation and Gibbs sampling
Description
Gives a simulated sample from the joint posterior distribution of the regression vector for a binary response regression model with a probit link and an informative normal(beta, P) prior. Also computes the log marginal likelihood when a subjective prior is used.
Usage
bayes.probit(y,X,m,prior=list(beta=0,P=0))
Arguments
y |
vector of binary responses |
X |
covariate matrix |
m |
number of simulations desired |
prior |
list with components beta, the prior mean, and P, the prior precision matrix |
Value
beta |
matrix of simulated draws of regression vector beta where each row corresponds to one draw |
log.marg |
simulation estimate of the log marginal likelihood of the model |
Author(s)
Jim Albert
Examples
response=c(0,1,0,0,0,1,1,1,1,1)
covariate=c(1,2,3,4,5,6,7,8,9,10)
X=cbind(1,covariate)
prior=list(beta=c(0,0),P=diag(c(.5,10)))
m=1000
s=bayes.probit(response,X,m,prior)
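The data augmentation idea described above can be sketched in base R. This is a minimal illustration of the usual latent-variable scheme (draw truncated normal latent responses, then draw the regression vector from its normal conditional), not the package source. The function name probit_gibbs_sketch is hypothetical, a zero prior precision is assumed to mean a flat prior, and the log marginal likelihood is not computed.

```r
# Hypothetical sketch of probit data augmentation; not the LearnBayes source.
probit_gibbs_sketch <- function(y, X, m, prior = list(beta = 0, P = 0)) {
  p <- ncol(X)
  b0 <- rep(prior$beta, length.out = p)                        # prior mean
  P0 <- if (is.matrix(prior$P)) prior$P else diag(prior$P, p)  # prior precision
  V <- solve(crossprod(X) + P0)   # conditional covariance of beta given z
  R <- chol(V)                    # V = t(R) %*% R
  beta <- rep(0, p)
  draws <- matrix(NA_real_, m, p)
  eps <- .Machine$double.eps
  for (i in seq_len(m)) {
    # Step 1: latent z_i ~ N(x_i'beta, 1), truncated to z_i > 0 when y_i = 1
    # and to z_i <= 0 when y_i = 0, drawn by inverting the normal CDF.
    mu <- drop(X %*% beta)
    cut <- pnorm(-mu)                      # P(z_i <= 0)
    u <- runif(length(y))
    u <- ifelse(y == 1, cut + u * (1 - cut), u * cut)
    u <- pmin(pmax(u, eps), 1 - eps)       # keep qnorm() finite
    z <- mu + qnorm(u)
    # Step 2: beta | z ~ N(V (X'z + P0 b0), V)
    mean_beta <- drop(V %*% (crossprod(X, z) + P0 %*% b0))
    beta <- mean_beta + drop(crossprod(R, rnorm(p)))
    draws[i, ] <- beta
  }
  draws   # each row is one simulated draw of the regression vector
}

set.seed(1)
response <- c(0, 1, 0, 0, 0, 1, 1, 1, 1, 1)
X <- cbind(1, 1:10)
fit <- probit_gibbs_sketch(response, X, 2000)
colMeans(fit)   # posterior means of intercept and slope
```

Successive draws are correlated, so in practice one would inspect trace plots before summarizing the output.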
Computation of posterior residual outlying probabilities for a linear regression model
Description
Computes the posterior probabilities that Bayesian residuals exceed a cutoff value for a linear regression model with a noninformative prior
Usage
bayesresiduals(lmfit,post,k)
Arguments
lmfit |
output of the regression function lm |
post |
list with components beta, matrix of simulated draws of regression parameter, and sigma, vector of simulated draws of sampling standard deviation |
k |
cut-off value that defines an outlier |
Value
vector of posterior outlying probabilities
Author(s)
Jim Albert
Examples
chirps=c(20,16.0,19.8,18.4,17.1,15.5,14.7,17.1,15.4,16.2,15,17.2,16,17,14.1)
temp=c(88.6,71.6,93.3,84.3,80.6,75.2,69.7,82,69.4,83.3,78.6,82.6,80.6,83.5,76.3)
X=cbind(1,chirps)
lmfit=lm(temp~X)
m=1000
post=blinreg(temp,X,m)
k=2
bayesresiduals(lmfit,post,k)
Bermuda grass experiment data
Description
Yields of bermuda grass for a factorial design of nutrients nitrogen, phosphorus, and potassium.
Usage
bermuda.grass
Format
A data frame with 64 observations on the following 4 variables.
- y
yield of bermuda grass in tons per acre
- Nit
level of nitrogen
- Phos
level of phosphorus
- Pot
level of potassium
Source
McCullagh, P., and Nelder, J. (1989), Generalized Linear Models, Chapman and Hall.
Selection of Beta Prior Given Knowledge of Two Quantiles
Description
Finds the shape parameters of a beta density that matches knowledge of two quantiles of the distribution.
Usage
beta.select(quantile1, quantile2)
Arguments
quantile1 |
list with components p, the value of the first probability, and x, the value of the first quantile |
quantile2 |
list with components p, the value of the second probability, and x, the value of the second quantile |
Value
vector of shape parameters of the matching beta distribution
Author(s)
Jim Albert
Examples
# person believes the median of the prior is 0.25
# and the 90th percentile of the prior is 0.45
quantile1=list(p=.5,x=0.25)
quantile2=list(p=.9,x=0.45)
beta.select(quantile1,quantile2)
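One way to see what matching two quantiles involves is a small numerical search in base R. The sketch below is an assumption about how such a match could be found, not the algorithm used by beta.select. The helper name match_beta_quantiles and the least-squares criterion are hypothetical.

```r
# Hypothetical helper, not the LearnBayes source: choose beta shape parameters
# whose quantiles match two stated (probability, value) pairs.
match_beta_quantiles <- function(quantile1, quantile2) {
  # Squared distance between the beta quantiles and the stated quantiles.
  # Searching on the log scale keeps both shape parameters positive.
  loss <- function(logpar) {
    a <- exp(logpar[1])
    b <- exp(logpar[2])
    (qbeta(quantile1$p, a, b) - quantile1$x)^2 +
      (qbeta(quantile2$p, a, b) - quantile2$x)^2
  }
  fit <- optim(c(0, 0), loss, control = list(maxit = 2000))
  exp(fit$par)
}

shape <- match_beta_quantiles(list(p = .5, x = 0.25), list(p = .9, x = 0.45))
# Check the match by recomputing the two quantiles from the fitted shapes
qbeta(c(.5, .9), shape[1], shape[2])
```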
Log posterior of logit mean and log precision for Binomial/beta exchangeable model
Description
Computes the log posterior density of logit mean and log precision for a Binomial/beta exchangeable model
Usage
betabinexch(theta,data)
Arguments
theta |
vector of parameter values of logit eta and log K |
data |
a matrix with columns y (counts) and n (sample sizes) |
Value
value of the log posterior
Author(s)
Jim Albert
Examples
n=c(20,20,20,20,20)
y=c(1,4,3,6,10)
data=cbind(y,n)
theta=c(-1,0)
betabinexch(theta,data)
Log posterior of mean and precision for Binomial/beta exchangeable model
Description
Computes the log posterior density of mean and precision for a Binomial/beta exchangeable model
Usage
betabinexch0(theta,data)
Arguments
theta |
vector of parameter values of eta and K |
data |
a matrix with columns y (counts) and n (sample sizes) |
Value
value of the log posterior
Author(s)
Jim Albert
Examples
n=c(20,20,20,20,20)
y=c(1,4,3,6,10)
data=cbind(y,n)
theta=c(.1,10)
betabinexch0(theta,data)
Logarithm of integral of Bayes factor for testing homogeneity of proportions
Description
Computes the logarithm of the integral of the Bayes factor for testing homogeneity of a set of proportions
Usage
bfexch(theta,datapar)
Arguments
theta |
value of the logit of the prior mean hyperparameter |
datapar |
list with components data, matrix with columns y (counts) and n (sample sizes), and K, prior precision hyperparameter |
Value
value of the logarithm of the integral
Author(s)
Jim Albert
Examples
y=c(1,3,2,4,6,4,3)
n=c(10,10,10,10,10,10,10)
data=cbind(y,n)
K=20
datapar=list(data=data,K=K)
theta=1
bfexch(theta,datapar)
Bayes factor against independence assuming alternatives close to independence
Description
Computes a Bayes factor against independence for a two-way contingency table assuming a "close to independence" alternative model
Usage
bfindep(y,K,m)
Arguments
y |
matrix of counts |
K |
Dirichlet precision hyperparameter |
m |
number of simulations |
Value
bf |
value of the Bayes factor against hypothesis of independence |
nse |
estimate of the simulation standard error of the computed Bayes factor |
Author(s)
Jim Albert
Examples
y=matrix(c(10,4,6,3,6,10),c(2,3))
K=20
m=1000
bfindep(y,K,m)
Computes the posterior for binomial sampling and a mixture of betas prior
Description
Computes the parameters and mixing probabilities for a binomial sampling problem where the prior is a discrete mixture of beta densities.
Usage
binomial.beta.mix(probs,betapar,data)
Arguments
probs |
vector of probabilities of the beta components of the prior |
betapar |
matrix where each row contains the shape parameters for a beta component of the prior |
data |
vector of number of successes and number of failures |
Value
probs |
vector of probabilities of the beta components of the posterior |
betapar |
matrix where each row contains the shape parameters for a beta component of the posterior |
Author(s)
Jim Albert
Examples
probs=c(.5, .5)
beta.par1=c(15,5)
beta.par2=c(10,10)
betapar=rbind(beta.par1,beta.par2)
data=c(20,15)
binomial.beta.mix(probs,betapar,data)
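The update behind this computation is the standard conjugate result: each beta(a, b) component becomes beta(a + s, b + f) after s successes and f failures, and each mixing probability is reweighted by the marginal likelihood of the data under that component. A minimal base R sketch follows. The function name beta_mix_update is hypothetical and this is not the package source.

```r
# Hypothetical sketch of the conjugate mixture update; not the package source.
beta_mix_update <- function(probs, betapar, data) {
  s <- data[1]   # number of successes
  f <- data[2]   # number of failures
  # Add s to the first shape column and f to the second shape column
  post_par <- sweep(betapar, 2, c(s, f), "+")
  # Log marginal likelihood of each component, up to the binomial
  # coefficient that is common to all components
  log_m <- lbeta(post_par[, 1], post_par[, 2]) - lbeta(betapar[, 1], betapar[, 2])
  w <- probs * exp(log_m - max(log_m))   # subtract the max for stability
  list(probs = w / sum(w), betapar = post_par)
}

probs <- c(.5, .5)
betapar <- rbind(c(15, 5), c(10, 10))
post <- beta_mix_update(probs, betapar, c(20, 15))
post$betapar   # rows are (35, 20) and (30, 25)
post$probs
```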
Bird measurements from British islands
Description
Measurements on breeding pairs of landbird species were collected from 16 islands around Britain over several decades.
Usage
birdextinct
Format
A data frame with 62 observations on the following 5 variables.
- species
name of bird species
- time
average time of extinction on the islands
- nesting
average number of nesting pairs
- size
size of the species, 1 or 0 if large or small
- status
status of the species, 1 or 0 if resident or migrant
Source
Pimm, S., Jones, H., and Diamond, J. (1988), On the risk of extinction, American Naturalist, 132, 757-785.
Birthweight regression study
Description
Dobson describes a study where one is interested in predicting a baby's birthweight based on the gestational age and the baby's gender.
Usage
birthweight
Format
A data frame with 24 observations on the following 3 variables.
- age
gestational age in weeks
- gender
gender of the baby where 0 (1) is male (female)
- weight
birthweight of baby in grams
Source
Dobson, A. (2001), An Introduction to Generalized Linear Models, New York: Chapman and Hall.
Simulation from Bayesian linear regression model
Description
Gives a simulated sample from the joint posterior distribution of the regression vector and the error standard deviation for a linear regression model with a noninformative or g prior.
Usage
blinreg(y,X,m,prior=NULL)
Arguments
y |
vector of responses |
X |
design matrix |
m |
number of simulations desired |
prior |
list with components c0 and beta0 of Zellner's g prior |
Value
beta |
matrix of simulated draws of beta where each row corresponds to one draw |
sigma |
vector of simulated draws of the error standard deviation |
Author(s)
Jim Albert
Examples
chirps=c(20,16.0,19.8,18.4,17.1,15.5,14.7,17.1,15.4,16.2,15,17.2,16,17,14.1)
temp=c(88.6,71.6,93.3,84.3,80.6,75.2,69.7,82,69.4,83.3,78.6,82.6,80.6,83.5,76.3)
X=cbind(1,chirps)
m=1000
s=blinreg(temp,X,m)
Simulates values of expected response for linear regression model
Description
Simulates draws of the posterior distribution of an expected response for a linear regression model with a noninformative prior
Usage
blinregexpected(X1,theta.sample)
Arguments
X1 |
matrix where each row corresponds to a covariate set |
theta.sample |
list with components beta, matrix of simulated draws of regression vector, and sigma, vector of simulated draws of sampling error standard deviation |
Value
matrix where a column corresponds to the simulated draws of the expected response for a given covariate set
Author(s)
Jim Albert
Examples
chirps=c(20,16.0,19.8,18.4,17.1,15.5,14.7,17.1,15.4,16.2,15,17.2,16,17,14.1)
temp=c(88.6,71.6,93.3,84.3,80.6,75.2,69.7,82,69.4,83.3,78.6,82.6,80.6,83.5,76.3)
X=cbind(1,chirps)
m=1000
theta.sample=blinreg(temp,X,m)
covset1=c(1,15)
covset2=c(1,20)
X1=rbind(covset1,covset2)
blinregexpected(X1,theta.sample)
Simulates values of predicted response for linear regression model
Description
Simulates draws of the predictive distribution of a future response for a linear regression model with a noninformative prior
Usage
blinregpred(X1,theta.sample)
Arguments
X1 |
matrix where each row corresponds to a covariate set |
theta.sample |
list with components beta, matrix of simulated draws of regression vector, and sigma, vector of simulated draws of sampling error standard deviation |
Value
matrix where a column corresponds to the simulated draws of the predicted response for a given covariate set
Author(s)
Jim Albert
Examples
chirps=c(20,16.0,19.8,18.4,17.1,15.5,14.7,17.1,15.4,16.2,15,17.2,16,17,14.1)
temp=c(88.6,71.6,93.3,84.3,80.6,75.2,69.7,82,69.4,83.3,78.6,82.6,80.6,83.5,76.3)
X=cbind(1,chirps)
m=1000
theta.sample=blinreg(temp,X,m)
covset1=c(1,15)
covset2=c(1,20)
X1=rbind(covset1,covset2)
blinregpred(X1,theta.sample)
Simulates fitted probabilities for a probit regression model
Description
Gives a simulated sample for fitted probabilities for a binary response regression model with a probit link and noninformative prior.
Usage
bprobit.probs(X1,fit)
Arguments
X1 |
matrix where each row corresponds to a covariate set |
fit |
simulated matrix of draws of the regression vector |
Value
matrix of simulated draws of the fitted probabilities, where a column corresponds to a particular covariate set
Author(s)
Jim Albert
Examples
response=c(0,1,0,0,0,1,1,1,1,1)
covariate=c(1,2,3,4,5,6,7,8,9,10)
X=cbind(1,covariate)
m=1000
fit=bayes.probit(response,X,m)
x1=c(1,3)
x2=c(1,8)
X1=rbind(x1,x2)
fittedprobs=bprobit.probs(X1,fit$beta)
Log posterior of a Bradley Terry random effects model
Description
Computes the log posterior density of the talent parameters and the log standard deviation for a Bradley Terry model with normal random effects
Usage
bradley.terry.post(theta,data)
Arguments
theta |
vector of talent parameters and log standard deviation |
data |
data matrix with columns team1, team2, wins by team1, and wins by team2 |
Value
value of the log posterior
Author(s)
Jim Albert
Examples
data(baseball.1964)
team.strengths=rep(0,10)
log.sigma=0
bradley.terry.post(c(team.strengths,log.sigma),baseball.1964)
Survival experience of women with breast cancer under treatment
Description
Collett (1994) describes a study to evaluate the effectiveness of a histochemical marker in predicting the survival experience of women with breast cancer.
Usage
breastcancer
Format
A data frame with 45 observations on the following 3 variables.
- time
survival time in months
- status
censoring indicator where 1 (0) indicates a complete (censored) survival time
- stain
indicates by a 0 (1) if tumor was negatively (positively) stained
Source
Collett, D. (1994), Modelling Survival Data in Medical Research, London: Chapman and Hall.
Calculus grades dataset
Description
Grades and other variables collected for a sample of calculus students.
Usage
calculus.grades
Format
A data frame with 100 observations on the following 3 variables.
- grade
indicates if student received an A or B in class
- prev.grade
indicates if student received an A in prerequisite math class
- act
score on the ACT math test
Source
Collected by a colleague of the author at his university.
Cancer mortality data
Description
Number of cancer deaths and number at risk for 20 cities in Missouri.
Usage
cancermortality
Format
A data frame with 20 observations on the following 2 variables.
- y
number of cancer deaths
- n
number at risk
Source
Tsutakawa, R., Shoop, G., and Marienfeld, C. (1985), Empirical Bayes Estimation of Cancer Mortality Rates, Statistics in Medicine, 4, 201-212.
Setup for Career Trajectory Application
Description
Sets up the data matrices for use with WinBUGS in the career trajectory application.
Usage
careertraj.setup(data)
Arguments
data |
data matrix for ballplayers with variables Player, Year, Age, G, AB, R, H, X2B, X3B, HR, RBI, BB, SO |
Value
player.names |
vector of player names |
y |
matrix of home runs for players where a row corresponds to the home runs for a player during all the years of his career |
n |
matrix of AB-SO for all players |
x |
matrix of ages for all players for all years of their careers |
T |
vector of number of seasons for all players |
N |
number of players |
Author(s)
Jim Albert
Examples
data(sluggerdata)
careertraj.setup(sluggerdata)
Log posterior of median and log scale parameters for Cauchy sampling
Description
Computes the log posterior density of (M,log S) when a sample is taken from a Cauchy density with location M and scale S and a uniform prior distribution is taken on (M, log S)
Usage
cauchyerrorpost(theta,data)
Arguments
theta |
vector of parameter values of M and log S |
data |
vector containing sample of observations |
Value
value of the log posterior
Author(s)
Jim Albert
Examples
data=c(108, 51, 7, 43, 52, 54, 53, 49, 21, 48)
theta=c(40,1)
cauchyerrorpost(theta,data)
Chemotherapy treatment effects on ovarian cancer
Description
Edmonson et al. (1979) studied the effect of different chemotherapy treatments following surgical treatment of ovarian cancer.
Usage
chemotherapy
Format
A data frame with 26 observations on the following 5 variables.
- patient
patient number
- time
survival time in days following treatment
- status
indicates if time is censored (0) or actually observed (1)
- treat
control group (0) or treatment group (1)
- age
age of the patient
Source
Edmonson, J., Fleming, T., Decker, D., Malkasian, G., Jorgensen, E., Jefferies, J., Webb, M., and Kvols, L. (1979), Different chemotherapeutic sensitivities and host factors affecting prognosis in advanced ovarian carcinoma versus minimal residual disease, Cancer Treatment Reports, 63, 241-247.
Bayes factor against independence using uniform priors
Description
Computes a Bayes factor against independence for a two-way contingency table assuming uniform prior distributions
Usage
ctable(y,a)
Arguments
y |
matrix of counts |
a |
matrix of prior hyperparameters |
Value
value of the Bayes factor against independence
Author(s)
Jim Albert
Examples
y=matrix(c(10,4,6,3,6,10),c(2,3))
a=matrix(rep(1,6),c(2,3))
ctable(y,a)
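For the special case where every prior hyperparameter equals 1, the Bayes factor has a closed form that can be sketched in base R. The sketch assumes a uniform Dirichlet prior on the cell probabilities under dependence and uniform Dirichlet priors on the row and column margins under independence. The function names are hypothetical, and general hyperparameter matrices a are not handled here.

```r
# Hypothetical sketch for the all-ones prior; not the package source.
# Log of the Dirichlet normalizing constant prod(gamma(a)) / gamma(sum(a))
log_dirichlet_const <- function(a) sum(lgamma(a)) - lgamma(sum(a))

bf_uniform_sketch <- function(y) {
  I <- nrow(y)
  J <- ncol(y)
  # Log marginal likelihood under dependence (multinomial coefficient omitted)
  log_dep <- log_dirichlet_const(c(y) + 1) - log_dirichlet_const(rep(1, I * J))
  # Log marginal likelihood under independence (same coefficient omitted)
  log_ind <- log_dirichlet_const(rowSums(y) + 1) - log_dirichlet_const(rep(1, I)) +
    log_dirichlet_const(colSums(y) + 1) - log_dirichlet_const(rep(1, J))
  exp(log_dep - log_ind)   # Bayes factor against independence
}

y <- matrix(c(10, 4, 6, 3, 6, 10), c(2, 3))
bf_uniform_sketch(y)
```

Values above 1 favor dependence and values below 1 favor independence. A perfectly balanced table such as matrix(10, 2, 2) gives a value below 1.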
Darwin's data on plants
Description
Fifteen differences of the heights of cross- and self-fertilized plants quoted by Fisher (1960)
Usage
darwin
Format
A data frame with 15 observations on the following 1 variable.
- difference
difference of heights of two types of plants
Source
Fisher, R. (1960), Statistical Methods for Research Workers, Edinburgh: Oliver and Boyd.
Highest probability interval for a discrete distribution
Description
Computes a highest probability interval for a discrete probability distribution
Usage
discint(dist, prob)
Arguments
dist |
probability distribution written as a matrix where the first column contains the values and the second column the probabilities |
prob |
probability content of interest |
Value
prob |
exact probability content of interval |
set |
set of values of the probability interval |
Author(s)
Jim Albert
Examples
x=0:10
probs=dbinom(x,size=10,prob=.3)
dist=cbind(x,probs)
pcontent=.8
discint(dist,pcontent)
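A highest probability set can be built greedily: take values in order of decreasing probability until the accumulated probability reaches the requested content. The base R sketch below illustrates that idea under this assumption. The function name hp_set_sketch is hypothetical and the package function may be implemented differently.

```r
# Hypothetical sketch of a greedy highest-probability set; not the package source.
hp_set_sketch <- function(dist, prob) {
  ord <- order(dist[, 2], decreasing = TRUE)   # most probable values first
  cum <- cumsum(dist[ord, 2])                  # accumulated probability
  k <- which(cum >= prob)[1]                   # first point reaching the target
  list(prob = cum[k], set = sort(dist[ord[1:k], 1]))
}

x <- 0:10
dist <- cbind(x, probs = dbinom(x, size = 10, prob = .3))
hp <- hp_set_sketch(dist, .8)
hp$set    # values 1, 2, 3, 4
hp$prob   # about 0.8215, the exact content of the set
```

Because the distribution is discrete, the exact content usually exceeds the requested content, which is why both are reported.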
Posterior distribution with discrete priors
Description
Computes the posterior distribution for an arbitrary one parameter distribution for a discrete prior distribution.
Usage
discrete.bayes(df,prior,y,...)
Arguments
df |
name of the function defining the sampling density |
prior |
vector defining the prior density; names of the vector define the parameter values and entries of the vector define the prior probabilities |
y |
vector of data values |
... |
any further fixed parameter values used in the sampling density function |
Value
prob |
vector of posterior probabilities |
pred |
scalar with prior predictive probability |
Author(s)
Jim Albert
Examples
prior=c(.25,.25,.25,.25)
names(prior)=c(.2,.25,.3,.35)
y=5
n=10
discrete.bayes(dbinom,prior,y,size=n)
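With a discrete prior, Bayes' rule reduces to multiplying prior probabilities by likelihood values and renormalizing. The following base R lines work through the binomial example above by hand, as an illustration of the calculation rather than the package source. The variable names theta, like, pred, and post are chosen here for illustration.

```r
# Hand computation of a discrete-prior posterior; illustrative only.
prior <- c(.25, .25, .25, .25)
names(prior) <- c(.2, .25, .3, .35)
theta <- as.numeric(names(prior))           # parameter values carried in the names
y <- 5
n <- 10
like <- dbinom(y, size = n, prob = theta)   # likelihood at each parameter value
pred <- sum(prior * like)                   # prior predictive probability of y
post <- prior * like / pred                 # posterior probabilities
post
```

Since the observed proportion 0.5 exceeds every candidate value, the posterior probabilities increase with the parameter value.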
Posterior distribution of two parameters with discrete priors
Description
Computes the posterior distribution for an arbitrary two parameter distribution for a discrete prior distribution.
Usage
discrete.bayes.2(df,prior,y=NULL,...)
Arguments
df |
name of the function defining the sampling density of two parameters |
prior |
matrix defining the prior density; the row names and column names of the matrix define respectively the values of parameter 1 and values of parameter 2 and the entries of the matrix give the prior probabilities |
y |
matrix of data values, where each row corresponds to a single observation |
... |
any further fixed parameter values used in the sampling density function |
Value
prob |
matrix of posterior probabilities |
pred |
scalar with prior predictive probability |
Author(s)
Jim Albert
Examples
p1 = seq(0.1, 0.9, length = 9)
p2 = p1
prior = matrix(1/81, 9, 9)
dimnames(prior)[[1]] = p1
dimnames(prior)[[2]] = p2
discrete.bayes.2(twoproplike,prior)
The probability density function for the multivariate normal (Gaussian) probability distribution
Description
Computes the density of a multivariate normal distribution
Usage
dmnorm(x, mean = rep(0, d), varcov, log = FALSE)
Arguments
x |
vector of length d or matrix with d columns, giving the coordinates of points where the density is to be evaluated |
mean |
numeric vector giving the location parameter of the distribution |
varcov |
a positive definite matrix representing the scale matrix of the distribution |
log |
a logical value; if TRUE, the logarithm of the density is to be computed |
Value
vector of density values
Author(s)
Jim Albert
Examples
mu <- c(1,12,2)
Sigma <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
x <- c(2,14,0)
f <- dmnorm(x, mu, Sigma)
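For a single point, the log density can be computed in base R from a Cholesky factor of the covariance matrix. The sketch below is an illustration of the multivariate normal formula, not the package source. The function name mvn_logdens_sketch is hypothetical and only a vector x is handled.

```r
# Hypothetical sketch of the multivariate normal log density; not the package source.
mvn_logdens_sketch <- function(x, mean, varcov) {
  d <- length(mean)
  R <- chol(varcov)                               # varcov = t(R) %*% R
  z <- backsolve(R, x - mean, transpose = TRUE)   # solves t(R) z = x - mean
  # sum(log(diag(R))) equals half the log determinant of varcov,
  # and sum(z^2) equals the quadratic form (x - mean)' varcov^{-1} (x - mean)
  -0.5 * d * log(2 * pi) - sum(log(diag(R))) - 0.5 * sum(z^2)
}

mu <- c(1, 12, 2)
Sigma <- matrix(c(1, 2, 0, 2, 5, 0.5, 0, 0.5, 3), 3, 3)
x <- c(2, 14, 0)
logf <- mvn_logdens_sketch(x, mu, Sigma)
exp(logf)   # density on the original scale
```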
Probability density function for multivariate t
Description
Computes the density of a multivariate t distribution
Usage
dmt(x, mean = rep(0, d), S, df = Inf, log=FALSE)
Arguments
x |
vector of length d or matrix with d columns, giving the coordinates of points where the density is to be evaluated |
mean |
numeric vector giving the location parameter of the distribution |
S |
a positive definite matrix representing the scale matrix of the distribution |
df |
degrees of freedom |
log |
a logical value; if TRUE, the logarithm of the density is to be computed |
Value
vector of density values
Author(s)
Jim Albert
Examples
mu <- c(1,12,2)
Sigma <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
df <- 4
x <- c(2,14,0)
f <- dmt(x, mu, Sigma, df)
Donner survival study
Description
Data contains the age, gender and survival status for 45 members of the Donner Party who experienced difficulties in crossing the Sierra Nevada mountains in California.
Usage
donner
Format
A data frame with 45 observations on the following 3 variables.
- age
age of person
- male
gender that is 1 (0) if person is male (female)
- survival
survival status, 1 or 0 if person survived or died
Source
Grayson, D. (1990), Donner party deaths: a demographic assessment, Journal of Anthropological Research, 46, 223-242.
Florida election data
Description
For each of the Florida counties in the 2000 presidential election, the number of votes for George Bush, Al Gore, and Pat Buchanan is recorded. Also the number of votes for the minority candidate Ross Perot in the 1996 presidential election is recorded.
Usage
election
Format
A data frame with 67 observations on the following 5 variables.
- county
name of Florida county
- perot
number of votes for Ross Perot in 1996 election
- gore
number of votes for Al Gore in 2000 election
- bush
number of votes for George Bush in 2000 election
- buchanan
number of votes for Pat Buchanan in 2000 election
Poll data from 2008 U.S. Presidential Election
Description
Results of recent state polls in the 2008 United States Presidential Election between Barack Obama and John McCain.
Usage
election.2008
Format
A data frame with 51 observations on the following 4 variables.
- State
name of the state
- M.pct
percentage of poll survey for McCain
- O.pct
percentage of poll survey for Obama
- EV
number of electoral votes
Source
Data collected by author in November 2008 from www.cnn.com website.
Game outcomes and point spreads for American football
Description
Game outcomes and point spreads for 672 professional American football games.
Usage
footballscores
Format
A data frame with 672 observations on the following 8 variables.
- year
year of game
- home
indicates if favorite is the home team
- favorite
score of favorite team
- underdog
score of underdog team
- spread
point spread
- favorite.name
name of favorite team
- underdog.name
name of underdog team
- week
week number of the season
Source
Gelman, A., Carlin, J., Stern, H., and Rubin, D. (2003), Bayesian Data Analysis, Chapman and Hall.
Metropolis within Gibbs sampling algorithm of a posterior distribution
Description
Implements a Metropolis-within-Gibbs sampling algorithm for an arbitrary real-valued posterior density defined by the user
Usage
gibbs(logpost,start,m,scale,...)
Arguments
logpost |
function defining the log posterior density |
start |
array with a single row that gives the starting value of the parameter vector |
m |
the number of iterations of the chain |
scale |
vector of scale parameters for the random walk Metropolis steps |
... |
data that is used in the function logpost |
Value
par |
a matrix of simulated values where each row corresponds to a value of the vector parameter |
accept |
vector of acceptance rates of the Metropolis steps of the algorithm |
Author(s)
Jim Albert
Examples
data=c(6,2,3,10)
start=array(c(1,1),c(1,2))
m=1000
scale=c(2,2)
s=gibbs(logctablepost,start,m,scale,data)
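The structure of a Metropolis-within-Gibbs sampler can be sketched in base R: each cycle visits the coordinates in turn and applies a one-dimensional random walk Metropolis step to each. The sketch below assumes normal random walk increments with standard deviations given by scale. The function name mwg_sketch and the normal-model target used in the usage lines are hypothetical, and this is not the package source.

```r
# Hypothetical sketch of Metropolis within Gibbs; not the package source.
mwg_sketch <- function(logpost, start, m, scale, ...) {
  p <- length(start)
  current <- as.numeric(start)
  lp_cur <- logpost(current, ...)
  par <- matrix(NA_real_, m, p)
  n_acc <- numeric(p)
  for (i in seq_len(m)) {
    for (j in seq_len(p)) {
      cand <- current
      cand[j] <- current[j] + scale[j] * rnorm(1)   # random walk in coordinate j
      lp_cand <- logpost(cand, ...)
      # Accept with probability min(1, posterior ratio)
      if (is.finite(lp_cand) && log(runif(1)) < lp_cand - lp_cur) {
        current <- cand
        lp_cur <- lp_cand
        n_acc[j] <- n_acc[j] + 1
      }
    }
    par[i, ] <- current
  }
  list(par = par, accept = n_acc / m)   # one acceptance rate per coordinate
}

# Illustrative target: normal sampling with a flat prior on (mean, log sd)
set.seed(1)
obs <- c(10, 12, 14, 13, 12, 15)
normal_logpost <- function(theta, data) {
  sum(dnorm(data, theta[1], exp(theta[2]), log = TRUE))
}
fit <- mwg_sketch(normal_logpost, c(mean(obs), log(sd(obs))), 5000, c(1, 0.5), obs)
fit$accept
```

The scale values trade off step size against acceptance rate, so they typically need some tuning for each problem.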
Log posterior of normal parameters when data is in grouped form
Description
Computes the log posterior density of (M,log S) for normal sampling where the data is observed in grouped form
Usage
groupeddatapost(theta,data)
Arguments
theta |
vector of parameter values M and log S |
data |
list with components int.lo, a vector of left endpoints, int.hi, a vector of right endpoints, and f, a vector of bin frequencies |
Value
value of the log posterior
Author(s)
Jim Albert
Examples
int.lo=c(-Inf,10,15,20,25)
int.hi=c(10,15,20,25,Inf)
f=c(2,5,8,4,2)
data=list(int.lo=int.lo,int.hi=int.hi,f=f)
theta=c(20,1)
groupeddatapost(theta,data)
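With grouped data, each bin contributes its frequency times the log of the normal probability of that bin. A minimal base R sketch of this log posterior follows. It assumes a flat prior on (M, log S), so the log posterior equals the log likelihood up to a constant. The function name grouped_logpost_sketch is hypothetical and this is not the package source.

```r
# Hypothetical sketch of the grouped-data log posterior; not the package source.
# Assumes a flat prior on (M, log S).
grouped_logpost_sketch <- function(theta, data) {
  M <- theta[1]
  S <- exp(theta[2])
  # Probability that a normal(M, S) observation falls in each bin
  bin_prob <- pnorm(data$int.hi, M, S) - pnorm(data$int.lo, M, S)
  sum(data$f * log(bin_prob))
}

data <- list(int.lo = c(-Inf, 10, 15, 20, 25),
             int.hi = c(10, 15, 20, 25, Inf),
             f = c(2, 5, 8, 4, 2))
lp_far <- grouped_logpost_sketch(c(20, 1), data)
lp_near <- grouped_logpost_sketch(c(17.5, log(5.5)), data)
c(lp_far, lp_near)   # the second point fits the bin counts better
```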
Heart transplant mortality data
Description
The number of deaths within 30 days of heart transplant surgery for 94 U.S. hospitals that performed at least 10 heart transplant surgeries. Also the exposure, the expected number of deaths, is recorded for each hospital.
Usage
hearttransplants
Format
A data frame with 94 observations on the following 2 variables.
- e
expected number of deaths (the exposure)
- y
observed number of deaths within 30 days of heart transplant surgery
Source
Christiansen, C. and Morris, C. (1995), Fitting and checking a two-level Poisson model: modeling patient mortality rates in heart transplant patients, in Berry, D. and Stangl, D., eds, Bayesian Biostatistics, Marcel Dekker.
Gibbs sampling for a hierarchical regression model
Description
Implements Gibbs sampling for estimating a two-way table of means under a hierarchical regression model.
Usage
hiergibbs(data,m)
Arguments
data |
data matrix with columns observed sample means, sample sizes, and values of two covariates |
m |
number of cycles of Gibbs sampling |
Value
beta |
matrix of simulated values of regression vector |
mu |
matrix of simulated values of cell means |
var |
vector of simulated values of second-stage prior variance |
Author(s)
Jim Albert
Examples
data(iowagpa)
m=1000
s=hiergibbs(iowagpa,m)
Density function of a histogram distribution
Description
Computes the density of a probability distribution defined on a set of equal-width intervals
Usage
histprior(p,midpts,prob)
Arguments
p |
vector of values for which density is to be computed |
midpts |
vector of midpoints of the intervals |
prob |
vector of probabilities of the intervals |
Value
vector of values of the probability density
Author(s)
Jim Albert
Examples
midpts=c(.1,.3,.5,.7,.9)
prob=c(.2,.2,.4,.1,.1)
p=seq(.01,.99,by=.01)
plot(p,histprior(p,midpts,prob),type="l")
Logarithm of Howard's dependent prior for two proportions
Description
Computes the logarithm of a dependent prior on two proportions proposed by Howard in a Statistical Science paper in 1998.
Usage
howardprior(xy,par)
Arguments
xy |
vector of proportions p1 and p2 |
par |
vector containing parameter values alpha, beta, gamma, delta, sigma |
Value
value of the log prior density
Author(s)
Jim Albert
Examples
param=c(1,1,1,1,2)
p=c(.1,.5)
howardprior(p,param)
Importance sampling using a t proposal density
Description
Implements importance sampling to compute the posterior mean of a function using a multivariate t proposal density
Usage
impsampling(logf,tpar,h,n,data)
Arguments
logf |
function that defines the logarithm of the density of interest |
tpar |
list of parameters of t proposal density including the mean m, scale matrix var, and degrees of freedom df |
h |
function that defines h(theta) |
n |
number of simulated draws from proposal density |
data |
data and/or parameters used in the function logf |
Value
est |
estimate of the posterior mean |
se |
simulation standard error of estimate |
theta |
matrix of simulated draws from proposal density |
wt |
vector of importance sampling weights |
Author(s)
Jim Albert
Examples
data(cancermortality)
start=c(-7,6)
fit=laplace(betabinexch,start,cancermortality)
tpar=list(m=fit$mode,var=2*fit$var,df=4)
myfunc=function(theta) return(theta[2])
theta=impsampling(betabinexch,tpar,myfunc,1000,cancermortality)
Independence Metropolis chain of a posterior distribution
Description
Simulates iterates of an independence Metropolis chain with a normal proposal density for an arbitrary real-valued posterior density defined by the user
Usage
indepmetrop(logpost,proposal,start,m,...)
Arguments
logpost |
function defining the log posterior density |
proposal |
a list containing mu, an estimated mean and var, an estimated variance-covariance matrix, of the normal proposal density |
start |
vector containing the starting value of the parameter |
m |
the number of iterations of the chain |
... |
data that is used in the function logpost |
Value
par |
a matrix of simulated values where each row corresponds to a value of the vector parameter |
accept |
the acceptance rate of the algorithm |
Author(s)
Jim Albert
Examples
data=c(6,2,3,10)
proposal=list(mu=array(c(2.3,-.1),c(2,1)),var=diag(c(1,1)))
start=array(c(0,0),c(1,2))
m=1000
fit=indepmetrop(logctablepost,proposal,start,m,data)
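An independence Metropolis chain draws every candidate from the same fixed proposal and accepts it according to the ratio of importance weights (posterior density over proposal density). The base R sketch below illustrates this with a multivariate normal proposal. The function name indep_metropolis_sketch and the standard bivariate normal target in the usage lines are hypothetical, and this is not the package source.

```r
# Hypothetical sketch of an independence Metropolis chain; not the package source.
indep_metropolis_sketch <- function(logpost, proposal, start, m, ...) {
  mu <- as.numeric(proposal$mu)
  p <- length(mu)
  R <- chol(proposal$var)                 # var = t(R) %*% R
  # Log kernel of the normal proposal density (constants cancel in the ratio)
  logprop <- function(th) {
    z <- backsolve(R, th - mu, transpose = TRUE)
    -0.5 * sum(z^2)
  }
  current <- as.numeric(start)
  lw_cur <- logpost(current, ...) - logprop(current)   # log importance weight
  par <- matrix(NA_real_, m, p)
  n_acc <- 0
  for (i in seq_len(m)) {
    cand <- mu + drop(crossprod(R, rnorm(p)))          # draw from N(mu, var)
    lw_cand <- logpost(cand, ...) - logprop(cand)
    if (is.finite(lw_cand) && log(runif(1)) < lw_cand - lw_cur) {
      current <- cand
      lw_cur <- lw_cand
      n_acc <- n_acc + 1
    }
    par[i, ] <- current
  }
  list(par = par, accept = n_acc / m)
}

# Illustrative target: standard bivariate normal, with an overdispersed proposal
set.seed(1)
target <- function(th) -0.5 * sum(th^2)
proposal <- list(mu = c(0, 0), var = diag(c(2, 2)))
fit <- indep_metropolis_sketch(target, proposal, c(0, 0), 2000)
fit$accept
```

The chain works best when the proposal resembles the posterior but has somewhat heavier spread, which keeps the importance weights bounded.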
Admissions data for a university
Description
Students at a major university are categorized with respect to their high school rank and their ACT score. For each combination of high school rank and ACT score, one records the mean grade point average (GPA).
Usage
iowagpa
Format
A data frame with 40 observations on the following 4 variables.
- gpa
mean grade point average
- n
sample size
- HSR
high school rank
- ACT
ACT score
Source
Albert, J. (1994), A Bayesian approach to estimation of GPA's of University of Iowa freshmen under order restrictions, Journal of Educational Statistics, 19, 1-22.
Hitting data for Derek Jeter
Description
Batting data for the baseball player Derek Jeter for all 154 games in the 2004 season.
Usage
jeter2004
Format
A data frame with 154 observations on the following 10 variables.
- Game
the game number
- AB
the number of at-bats
- R
the number of runs scored
- H
the number of hits
- X2B
the number of doubles
- X3B
the number of triples
- HR
the number of home runs
- RBI
the number of runs batted in
- BB
the number of walks
- SO
the number of strikeouts
Source
Collected from game log data from www.retrosheet.org.
Summarization of a posterior density by the Laplace method
Description
For a general posterior density, computes the posterior mode, the associated variance-covariance matrix, and an estimate of the logarithm of the normalizing constant.
Usage
laplace(logpost,mode,...)
Arguments
logpost |
function that defines the logarithm of the posterior density |
mode |
vector that is a guess at the posterior mode |
... |
vector or list of parameters associated with the function logpost |
Value
mode |
current estimate of the posterior mode |
var |
current estimate of the associated variance-covariance matrix |
int |
estimate of the logarithm of the normalizing constant |
converge |
indication (TRUE or FALSE) of whether the algorithm converged |
Author(s)
Jim Albert
Examples
logpost=function(theta,data)
{
s=5
sum(-log(1+(data-theta)^2/s^2))
}
data=c(10,12,14,13,12,15)
start=10
laplace(logpost,start,data)
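The three quantities returned above can be sketched with `optim` in base R. This is an illustration under stated assumptions, not the package source: `laplace_sketch` is a hypothetical name, and the optimizer settings are one choice among many.

```r
# Hypothetical base-R sketch of the Laplace approximation.
laplace_sketch <- function(logpost, mode, ...) {
  neg <- function(th) -logpost(th, ...)     # minimize the negative log posterior
  fit <- optim(mode, neg, method = "BFGS", hessian = TRUE)
  v <- solve(fit$hessian)                   # inverse Hessian approximates the covariance
  d <- length(fit$par)
  # log normalizing constant: log f(mode) + (d/2) log(2 pi) + (1/2) log det(var)
  int <- -fit$value + 0.5 * d * log(2 * pi) +
    0.5 * as.numeric(determinant(v)$modulus)
  list(mode = fit$par, var = v, int = int, converge = fit$convergence == 0)
}

# Toy check (assumption): a N(3, 2^2) log density is already normalized,
# so the estimated log normalizing constant should be close to 0.
fit <- laplace_sketch(function(th) dnorm(th, 3, 2, log = TRUE), 0)
```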
Logarithm of bivariate normal density
Description
Computes the logarithm of a bivariate normal density
Usage
lbinorm(xy,par)
Arguments
xy |
vector of values of two variables x and y |
par |
list with components m, a vector of means, and v, a variance-covariance matrix |
Value
value of the kernel of the log density
Author(s)
Jim Albert
Examples
mean=c(0,0)
varcov=diag(c(1,1))
value=c(1,1)
param=list(m=mean,v=varcov)
lbinorm(value,param)
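For reference, the full log density of a bivariate normal can be written directly in base R. `lbinorm_sketch` is a hypothetical name, and since the entry above describes the returned value as a kernel, the package function may differ from this sketch by an additive constant.

```r
# Hypothetical sketch: full log density of a bivariate normal N(m, v).
lbinorm_sketch <- function(xy, par) {
  dev <- as.vector(xy) - as.vector(par$m)
  quad <- sum(dev * solve(par$v, dev))      # (xy - m)' v^{-1} (xy - m)
  -log(2 * pi) - 0.5 * log(det(par$v)) - 0.5 * quad
}

val <- lbinorm_sketch(c(1, 1), list(m = c(0, 0), v = diag(2)))
# With identity covariance this equals the sum of two standard normal log densities
check <- sum(dnorm(c(1, 1), log = TRUE))
```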
Log posterior of difference and sum of logits in a 2x2 table
Description
Computes the log posterior density for the difference and sum of logits in a 2x2 contingency table for independent binomial samples and uniform prior placed on the logits
Usage
logctablepost(theta,data)
Arguments
theta |
vector of parameter values ("difference of logits" and "sum of logits") |
data |
vector containing number of successes and failures for first sample, and then second sample |
Value
value of the log posterior
Author(s)
Jim Albert
Examples
s1=6; f1=2; s2=3; f2=10
data=c(s1,f1,s2,f2)
theta=c(2,4)
logctablepost(theta,data)
Log posterior for a binary response model with a logistic link and a uniform prior
Description
Computes the log posterior density of (beta0, beta1) when yi are independent binomial(ni, pi) and logit(pi)=beta0+beta1*xi and a uniform prior is placed on (beta0, beta1)
Usage
logisticpost(beta,data)
Arguments
beta |
vector of parameter values beta0 and beta1 |
data |
matrix of columns of covariate values x, sample sizes n, and number of successes y |
Value
value of the log posterior
Author(s)
Jim Albert
Examples
x = c(-0.86,-0.3,-0.05,0.73)
n = c(5,5,5,5)
y = c(0,1,3,5)
data = cbind(x, n, y)
beta=c(2,10)
logisticpost(beta,data)
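With a uniform prior, the log posterior is the binomial log likelihood up to an additive constant. A minimal sketch, assuming the column order x, n, y given above; `logistic_post_sketch` is a hypothetical name, and the package function may drop the binomial coefficient.

```r
# Hypothetical sketch of the log posterior for the logistic model
# with a uniform prior on (beta0, beta1).
logistic_post_sketch <- function(beta, data) {
  x <- data[, 1]; n <- data[, 2]; y <- data[, 3]
  eta <- beta[1] + beta[2] * x              # linear predictor on the logit scale
  # plogis() is the inverse logit
  sum(dbinom(y, size = n, prob = plogis(eta), log = TRUE))
}

x <- c(-0.86, -0.3, -0.05, 0.73)
n <- c(5, 5, 5, 5)
y <- c(0, 1, 3, 5)
val0 <- logistic_post_sketch(c(0, 0), cbind(x, n, y))   # every p_i equals 0.5
val1 <- logistic_post_sketch(c(2, 10), cbind(x, n, y))
```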
Log posterior with Poisson sampling and gamma prior
Description
Computes the logarithm of the posterior density of a Poisson log mean with a gamma prior
Usage
logpoissgamma(theta,datapar)
Arguments
theta |
vector of values of the log mean parameter |
datapar |
list with components data, vector of observations, and par, vector of parameters of the gamma prior |
Value
vector of values of the log posterior for all values in theta
Author(s)
Jim Albert
Examples
data=c(2,4,3,6,1,0,4,3,10,2)
par=c(1,1)
datapar=list(data=data,par=par)
theta=c(-1,0,1,2)
logpoissgamma(theta,datapar)
Log posterior with Poisson sampling and normal prior
Description
Computes the logarithm of the posterior density of a Poisson log mean with a normal prior
Usage
logpoissnormal(theta,datapar)
Arguments
theta |
vector of values of the log mean parameter |
datapar |
list with components data, vector of observations, and par, vector of parameters of the normal prior |
Value
vector of values of the log posterior for all values in theta
Author(s)
Jim Albert
Examples
data=c(2,4,3,6,1,0,4,3,10,2)
par=c(0,1)
datapar=list(data=data,par=par)
theta=c(-1,0,1,2)
logpoissnormal(theta,datapar)
Marathon running times
Description
Running times in minutes for twenty male runners between the ages 20 and 29 who ran the New York Marathon.
Usage
marathontimes
Format
A data frame with 20 observations on the following 1 variable.
- time
running time
Source
www.nycmarathon.org website.
Bayesian test of one-sided hypothesis about a normal mean
Description
Computes a Bayesian test of the hypothesis that a normal mean is less than or equal to a specified value
Usage
mnormt.onesided(m0,normpar,data)
Arguments
m0 |
value of the normal mean to be tested |
normpar |
vector of mean and standard deviation of the normal prior distribution |
data |
vector of sample mean, sample size, and known value of the population standard deviation |
Value
BF |
Bayes factor in support of the null hypothesis |
prior.odds |
prior odds of the null hypothesis |
post.odds |
posterior odds of the null hypothesis |
postH |
posterior probability of the null hypothesis |
Author(s)
Jim Albert
Examples
y=c(182,172,173,176,176,180,173,174,179,175)
pop.s=3
data=c(mean(y),length(y),pop.s)
m0=175
normpar=c(170,1000)
mnormt.onesided(m0,normpar,data)
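The quantities returned above follow from the conjugate normal update. A minimal sketch, assuming the standard known-variance posterior; `onesided_sketch` is a hypothetical name, not the package source.

```r
# Hypothetical sketch of the one-sided test H0: mean <= m0.
onesided_sketch <- function(m0, normpar, data) {
  prior_mean <- normpar[1]; prior_sd <- normpar[2]
  ybar <- data[1]; n <- data[2]; sigma <- data[3]
  # conjugate update: precisions add, means are precision-weighted
  post_prec <- 1 / prior_sd^2 + n / sigma^2
  post_mean <- (prior_mean / prior_sd^2 + n * ybar / sigma^2) / post_prec
  post_sd <- sqrt(1 / post_prec)
  prior_H <- pnorm(m0, prior_mean, prior_sd)
  post_H <- pnorm(m0, post_mean, post_sd)
  prior_odds <- prior_H / (1 - prior_H)
  post_odds <- post_H / (1 - post_H)
  list(BF = post_odds / prior_odds, prior.odds = prior_odds,
       post.odds = post_odds, postH = post_H)
}

y <- c(182, 172, 173, 176, 176, 180, 173, 174, 179, 175)
res <- onesided_sketch(175, c(170, 1000), c(mean(y), length(y), 3))
```

Because the prior standard deviation is very large here, the posterior is driven almost entirely by the sample mean of 176.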
Bayesian test of a two-sided hypothesis about a normal mean
Description
Bayesian test that a normal mean is equal to a specified value using a normal prior
Usage
mnormt.twosided(m0, prob, t, data)
Arguments
m0 |
value of the mean to be tested |
prob |
prior probability of the hypothesis |
t |
vector of values of the prior standard deviation under the alternative hypothesis |
data |
vector containing the sample mean, the sample size, and the known value of the population standard deviation |
Value
bf |
vector of values of the Bayes factor in support of the null hypothesis |
post |
vector of posterior probabilities of the null hypothesis |
Author(s)
Jim Albert
Examples
m0=170
prob=.5
tau=c(.5,1,2,4,8)
samplesize=10
samplemean=176
popsd=3
data=c(samplemean,samplesize,popsd)
mnormt.twosided(m0,prob,tau,data)
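The Bayes factor for a point null can be sketched as a ratio of marginal densities of the sample mean. This assumes the alternative places a normal prior centered at m0 with standard deviation t; `twosided_sketch` is a hypothetical name, not the package source.

```r
# Hypothetical sketch of the two-sided test H0: mean = m0.
twosided_sketch <- function(m0, prob, t, data) {
  ybar <- data[1]; n <- data[2]; sigma <- data[3]
  m_null <- dnorm(ybar, m0, sigma / sqrt(n))              # marginal under H0
  m_alt <- dnorm(ybar, m0, sqrt(sigma^2 / n + t^2))       # marginal under H1, one per t
  bf <- m_null / m_alt
  post <- prob * bf / (prob * bf + 1 - prob)
  list(bf = bf, post = post)
}

res <- twosided_sketch(170, 0.5, c(0.5, 1, 2, 4, 8), c(176, 10, 3))
# When t = 0 the two hypotheses coincide, so the Bayes factor is 1
res0 <- twosided_sketch(170, 0.5, 0, c(176, 10, 3))
```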
Contour plot of a bivariate density function
Description
For a general two parameter density, draws a contour graph where the contour lines are drawn at 10 percent, 1 percent, and .1 percent of the height at the mode.
Usage
mycontour(logf,limits,data,...)
Arguments
logf |
function that defines the logarithm of the density |
limits |
limits (xlo, xhi, ylo, yhi) where the graph is to be drawn |
data |
vector or list of parameters associated with the function logf |
... |
further arguments to pass to contour |
Value
A contour graph of the density is drawn
Author(s)
Jim Albert
Examples
m=array(c(0,0),c(2,1))
v=array(c(1,.6,.6,1),c(2,2))
normpar=list(m=m,v=v)
mycontour(lbinorm,c(-4,4,-4,4),normpar)
Computes the posterior for normal sampling and a mixture of normals prior
Description
Computes the parameters and mixing probabilities for a normal sampling problem, variance known, where the prior is a discrete mixture of normal densities.
Usage
normal.normal.mix(probs,normalpar,data)
Arguments
probs |
vector of probabilities of the normal components of the prior |
normalpar |
matrix where each row contains the mean and variance parameters for a normal component of the prior |
data |
vector of observation and sampling variance |
Value
probs |
vector of probabilities of the normal components of the posterior |
normalpar |
matrix where each row contains the mean and variance parameters for a normal component of the posterior |
Author(s)
Jim Albert
Examples
probs=c(.5, .5)
normal.par1=c(0,1)
normal.par2=c(2,.5)
normalpar=rbind(normal.par1,normal.par2)
y=1; sigma2=.5
data=c(y,sigma2)
normal.normal.mix(probs,normalpar,data)
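Each component is updated by the usual conjugate formulas, and the mixing probabilities are reweighted by each component's marginal density of the observation. A minimal sketch under those assumptions; `normal_mix_sketch` is a hypothetical name, not the package source.

```r
# Hypothetical sketch of the posterior under a mixture-of-normals prior.
normal_mix_sketch <- function(probs, normalpar, data) {
  y <- data[1]; sigma2 <- data[2]
  prior_mean <- normalpar[, 1]; prior_var <- normalpar[, 2]
  post_var <- 1 / (1 / prior_var + 1 / sigma2)
  post_mean <- (prior_mean / prior_var + y / sigma2) * post_var
  # marginal density of y under each component: N(prior_mean, prior_var + sigma2)
  marg <- dnorm(y, prior_mean, sqrt(prior_var + sigma2))
  post_probs <- probs * marg / sum(probs * marg)
  list(probs = post_probs, normalpar = cbind(post_mean, post_var))
}

res <- normal_mix_sketch(c(0.5, 0.5), rbind(c(0, 1), c(2, 0.5)), c(1, 0.5))
```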
Selection of Normal Prior Given Knowledge of Two Quantiles
Description
Finds the mean and standard deviation of a normal density that matches knowledge of two quantiles of the distribution.
Usage
normal.select(quantile1, quantile2)
Arguments
quantile1 |
list with components p, the value of the first probability, and x, the value of the first quantile |
quantile2 |
list with components p, the value of the second probability, and x, the value of the second quantile |
Value
mean |
mean of the matching normal distribution |
sigma |
standard deviation of the matching normal distribution |
Author(s)
Jim Albert
Examples
# person believes the 15th percentile of the prior is 100
# and the 70th percentile of the prior is 150
quantile1=list(p=.15,x=100)
quantile2=list(p=.7,x=150)
normal.select(quantile1,quantile2)
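Matching two quantiles amounts to solving two linear equations in the mean and standard deviation. A minimal sketch; `normal_select_sketch` is a hypothetical name, not the package source.

```r
# Hypothetical sketch: solve x1 = mu + sigma * z1 and x2 = mu + sigma * z2,
# where z1 and z2 are the standard normal quantiles of p1 and p2.
normal_select_sketch <- function(quantile1, quantile2) {
  z1 <- qnorm(quantile1$p)
  z2 <- qnorm(quantile2$p)
  sigma <- (quantile2$x - quantile1$x) / (z2 - z1)
  mu <- quantile1$x - sigma * z1
  list(mean = mu, sigma = sigma)
}

fit <- normal_select_sketch(list(p = 0.15, x = 100), list(p = 0.7, x = 150))
# The fitted normal should reproduce both stated quantiles
p_check <- pnorm(c(100, 150), fit$mean, fit$sigma)
```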
Log posterior density for mean and variance for normal sampling
Description
Computes the log of the posterior density of a mean M and a variance S2 when a sample is taken from a normal density and a standard noninformative prior is used.
Usage
normchi2post(theta,data)
Arguments
theta |
vector of parameter values M and S2 |
data |
vector containing the sample observations |
Value
value of the log posterior
Author(s)
Jim Albert
Examples
parameter=c(25,5)
data=c(20, 32, 21, 43, 33, 21, 32)
normchi2post(parameter,data)
Log posterior of mean and log standard deviation for Normal/Normal exchangeable model
Description
Computes the log posterior density of mean and log standard deviation for a Normal/Normal exchangeable model where (mean, log sd) is given a uniform prior.
Usage
normnormexch(theta,data)
Arguments
theta |
vector of parameter values of mu and log tau |
data |
a matrix with columns y (observations) and v (sampling variances) |
Value
value of the log posterior
Author(s)
Jim Albert
Examples
s.var <- c(0.05, 0.05, 0.05, 0.05, 0.05)
y.means <- c(1, 4, 3, 6,10)
data=cbind(y.means, s.var)
theta=c(-1, 0)
normnormexch(theta,data)
Posterior predictive simulation from Bayesian normal sampling model
Description
Given simulated draws from the posterior from a normal sampling model, outputs simulated draws from the posterior predictive distribution of a statistic of interest.
Usage
normpostpred(parameters,sample.size,f=min)
Arguments
parameters |
list of simulated draws from the posterior where mu contains the normal mean and sigma2 contains the normal variance |
sample.size |
size of the future sample |
f |
function defining the statistic |
Value
simulated sample of the posterior predictive distribution of the statistic
Author(s)
Jim Albert
Examples
# finds posterior predictive distribution of the min statistic of a future sample of size 15
data(darwin)
s=normpostsim(darwin$difference)
sample.size=15
sim.stats=normpostpred(s,sample.size,min)
Simulation from Bayesian normal sampling model
Description
Gives a simulated sample from the joint posterior distribution of the mean and variance for a normal sampling model with a noninformative or informative prior. The prior assumes mu and sigma2 are independent, with mu assigned a normal prior with mean mu0 and variance tau2, and sigma2 assigned an inverse gamma prior with parameters a and b.
Usage
normpostsim(data,prior=NULL,m=1000)
Arguments
data |
vector of observations |
prior |
list with components mu, a vector with the prior mean and variance, and sigma2, a vector of the inverse gamma parameters |
m |
number of simulations desired |
Value
mu |
vector of simulated draws of normal mean |
sigma2 |
vector of simulated draws of normal variance |
Author(s)
Jim Albert
Examples
data(darwin)
s=normpostsim(darwin$difference)
Gibbs sampling for a hierarchical regression model
Description
Implements Gibbs sampling for estimating a two-way table of means under an order restriction.
Usage
ordergibbs(data,m)
Arguments
data |
data matrix with first two columns observed sample means and sample sizes |
m |
number of cycles of Gibbs sampling |
Value
matrix of simulated draws of the normal means where each row represents one simulated draw
Author(s)
Jim Albert
Examples
data(iowagpa)
m=1000
s=ordergibbs(iowagpa,m)
Predictive distribution for a binomial sample with a beta prior
Description
Computes predictive distribution for number of successes of future binomial experiment with a beta prior distribution for the proportion.
Usage
pbetap(ab, n, s)
Arguments
ab |
vector of parameters of the beta prior |
n |
size of future binomial sample |
s |
vector of number of successes for future binomial experiment |
Value
vector of predictive probabilities for the values in the vector s
Author(s)
Jim Albert
Examples
ab=c(3,12)
n=10
s=0:10
pbetap(ab,n,s)
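The predictive distribution here is the beta-binomial. A minimal sketch computed on the log scale for numerical stability; `pbetap_sketch` is a hypothetical name, not the package source.

```r
# Hypothetical sketch of the beta-binomial predictive probabilities:
# P(s) = choose(n, s) * B(a + s, b + n - s) / B(a, b)
pbetap_sketch <- function(ab, n, s) {
  exp(lchoose(n, s) + lbeta(ab[1] + s, ab[2] + n - s) - lbeta(ab[1], ab[2]))
}

pred <- pbetap_sketch(c(3, 12), 10, 0:10)
```

Evaluated over all of 0:n, the probabilities sum to one.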
Bayesian test of a proportion
Description
Bayesian test that a proportion is equal to a specified value using a beta prior
Usage
pbetat(p0,prob,ab,data)
Arguments
p0 |
value of the proportion to be tested |
prob |
prior probability of the hypothesis |
ab |
vector of parameter values of the beta prior under the alternative hypothesis |
data |
vector containing the number of successes and number of failures |
Value
bf |
the Bayes factor in support of the null hypothesis |
post |
the posterior probability of the null hypothesis |
Author(s)
Jim Albert
Examples
p0=.5
prob=.5
ab=c(10,10)
data=c(5,15)
pbetat(p0,prob,ab,data)
Posterior distribution for a proportion with discrete priors
Description
Computes the posterior distribution for a proportion for a discrete prior distribution.
Usage
pdisc(p, prior, data)
Arguments
p |
vector of proportion values |
prior |
vector of prior probabilities |
data |
vector consisting of number of successes and number of failures |
Value
vector of posterior probabilities
Author(s)
Jim Albert
Examples
p=c(.2,.25,.3,.35)
prior=c(.25,.25,.25,.25)
data=c(5,10)
pdisc(p,prior,data)
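The posterior is the prior times the binomial likelihood, renormalized. A minimal sketch, assuming all values in p lie strictly between 0 and 1; `pdisc_sketch` is a hypothetical name, not the package source.

```r
# Hypothetical sketch of a discrete-prior posterior for a proportion.
pdisc_sketch <- function(p, prior, data) {
  s <- data[1]; f <- data[2]
  loglike <- s * log(p) + f * log(1 - p)
  # subtract the maximum before exponentiating to avoid underflow
  w <- prior * exp(loglike - max(loglike))
  w / sum(w)
}

post <- pdisc_sketch(c(0.2, 0.25, 0.3, 0.35), rep(0.25, 4), c(5, 10))
```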
Predictive distribution for a binomial sample with a discrete prior
Description
Computes predictive distribution for number of successes of future binomial experiment with a discrete distribution for the proportion.
Usage
pdiscp(p, probs, n, s)
Arguments
p |
vector of proportion values |
probs |
vector of probabilities |
n |
size of future binomial sample |
s |
vector of number of successes for future binomial experiment |
Value
vector of predictive probabilities for the values in the vector s
Author(s)
Jim Albert
Examples
p=c(.1,.2,.3,.4,.5,.6,.7,.8,.9)
prob=c(0.05,0.10,0.10,0.15,0.20,0.15,0.10,0.10,0.05)
n=10
s=0:10
pdiscp(p,prob,n,s)
Log posterior of Poisson/gamma exchangeable model
Description
Computes the log posterior density of log alpha and log mu for a Poisson/gamma exchangeable model
Usage
poissgamexch(theta,datapar)
Arguments
theta |
vector of parameter values of log alpha and log mu |
datapar |
list with components data, a matrix with columns e and y, and z0, prior hyperparameter |
Value
value of the log posterior
Author(s)
Jim Albert
Examples
e=c(532,584,672,722,904)
y=c(0,0,2,1,1)
data=cbind(e,y)
theta=c(-4,0)
z0=.5
datapar=list(data=data,z0=z0)
poissgamexch(theta,datapar)
Computes the posterior for Poisson sampling and a mixture of gammas prior
Description
Computes the parameters and mixing probabilities for a Poisson sampling problem where the prior is a discrete mixture of gamma densities.
Usage
poisson.gamma.mix(probs,gammapar,data)
Arguments
probs |
vector of probabilities of the gamma components of the prior |
gammapar |
matrix where each row contains the shape and rate parameters for a gamma component of the prior |
data |
list with components y, vector of counts, and t, vector of time intervals |
Value
probs |
vector of probabilities of the gamma components of the posterior |
gammapar |
matrix where each row contains the shape and rate parameters for a gamma component of the posterior |
Author(s)
Jim Albert
Examples
probs=c(.5, .5)
gamma.par1=c(1,1)
gamma.par2=c(10,2)
gammapar=rbind(gamma.par1,gamma.par2)
y=c(1,3,2,4,10); t=c(1,1,1,1,1)
data=list(y=y,t=t)
poisson.gamma.mix(probs,gammapar,data)
Plot of predictive distribution for binomial sampling with a beta prior
Description
For a proportion problem with a beta prior, plots the prior predictive distribution of the number of successes in n trials and displays the observed number of successes.
Usage
predplot(prior,n,yobs)
Arguments
prior |
vector of parameters for beta prior |
n |
sample size |
yobs |
observed number of successes |
Author(s)
Jim Albert
Examples
prior=c(3,10) # proportion has a beta(3, 10) prior
n=20 # sample size
yobs=10 # observed number of successes
predplot(prior,n,yobs)
Construct discrete uniform prior for two parameters
Description
Constructs a discrete uniform prior distribution for two parameters
Usage
prior.two.parameters(parameter1, parameter2)
Arguments
parameter1 |
vector of values of first parameter |
parameter2 |
vector of values of second parameter |
Value
matrix of uniform probabilities where the rows and columns are labelled with the parameter values
Author(s)
Jim Albert
Examples
prior.two.parameters(c(1,2,3,4),c(2,4,7))
Bird measurements from British islands
Description
Measurements on breeding of the common puffin in different habitats at Great Island, Newfoundland.
Usage
puffin
Format
A data frame with 38 observations on the following 5 variables.
- Nest
nesting frequency (burrows per 9 square meters)
- Grass
grass cover (percentage)
- Soil
mean soil depth (in centimeters)
- Angle
angle of slope (in degrees)
- Distance
distance from cliff edge (in meters)
Source
Peck, R., Devore, J., and Olsen, C. (2005), Introduction to Statistics And Data Analysis, Thomson Learning.
Random draws from a Dirichlet distribution
Description
Simulates a sample from a Dirichlet distribution
Usage
rdirichlet(n,par)
Arguments
n |
number of simulations required |
par |
vector of parameters of the Dirichlet distribution |
Value
matrix of simulated draws where each row corresponds to a single draw
Author(s)
Jim Albert
Examples
par=c(2,5,4,10)
n=10
rdirichlet(n,par)
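One standard way to simulate Dirichlet draws is to normalize independent gamma variates. A minimal sketch of that construction; `rdirichlet_sketch` is a hypothetical name, and the package may do this differently.

```r
# Hypothetical sketch: Dirichlet draws from normalized gamma variates.
rdirichlet_sketch <- function(n, par) {
  k <- length(par)
  # column j holds n draws from Gamma(shape = par[j], rate = 1)
  g <- matrix(rgamma(n * k, shape = rep(par, each = n)), nrow = n, ncol = k)
  g / rowSums(g)                            # each row now sums to 1
}

set.seed(1)
draws <- rdirichlet_sketch(10, c(2, 5, 4, 10))
```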
Computes the log posterior of a normal regression model with a g prior
Description
Computes the log posterior of (beta, log sigma) for a normal regression model with a g prior with parameters beta0 and c0.
Usage
reg.gprior.post(theta, dataprior)
Arguments
theta |
vector of components of beta and log sigma |
dataprior |
list with components data and prior; data is a list with components y and X, prior is a list with components b0 and c0 |
Value
value of the log posterior
Author(s)
Jim Albert
Examples
data(puffin)
data=list(y=puffin$Nest, X=cbind(1,puffin$Distance))
prior=list(b0=c(0,0), c0=10)
reg.gprior.post(c(20,-.5,1),list(data=data,prior=prior))
Collapses a matrix by summing over rows
Description
Collapses a matrix by summing over a specific number of rows
Usage
regroup(data,g)
Arguments
data |
a matrix |
g |
a positive integer between 1 and the number of rows of data |
Value
reduced matrix found by summing over rows
Author(s)
Jim Albert
Examples
data=matrix(c(1:20),nrow=4,ncol=5)
g=2
regroup(data,g)
Rejection sampling using a t proposal density
Description
Implements a rejection sampling algorithm for a probability density using a multivariate t proposal density
Usage
rejectsampling(logf,tpar,dmax,n,data)
Arguments
logf |
function that defines the logarithm of the density of interest |
tpar |
list of parameters of t proposal density including the mean m, scale matrix var, and degrees of freedom df |
dmax |
logarithm of the rejection sampling constant |
n |
number of simulated draws from proposal density |
data |
data and/or parameters used in the function logf |
Value
matrix of simulated draws from density of interest
Author(s)
Jim Albert
Examples
data(cancermortality)
start=c(-7,6)
fit=laplace(betabinexch,start,cancermortality)
tpar=list(m=fit$mode,var=2*fit$var,df=4)
theta=rejectsampling(betabinexch,tpar,-569.2813,1000,cancermortality)
Random number generation for inverse gamma distribution
Description
Simulates from an inverse gamma (a, b) distribution with density proportional to $y^{-a-1} \exp(-b/y)$
Usage
rigamma(n, a, b)
Arguments
n |
number of random numbers to be generated |
a |
inverse gamma shape parameter |
b |
inverse gamma rate parameter |
Value
vector of n simulated draws
Author(s)
Jim Albert
Examples
a=10
b=5
n=20
rigamma(n,a,b)
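If X has a gamma distribution with shape a and rate b, then 1/X has the density stated above. A one-line sketch of that relationship; `rigamma_sketch` is a hypothetical name, not the package source.

```r
# Hypothetical sketch: inverse gamma draws as reciprocals of gamma draws.
rigamma_sketch <- function(n, a, b) 1 / rgamma(n, shape = a, rate = b)

set.seed(1)
draws <- rigamma_sketch(20, 10, 5)
```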
Random number generation for multivariate normal
Description
Simulates from a multivariate normal distribution
Usage
rmnorm(n = 1, mean = rep(0, d), varcov)
Arguments
n |
number of random numbers to be generated |
mean |
numeric vector giving the mean of the distribution |
varcov |
a positive definite matrix representing the variance-covariance matrix of the distribution |
Value
matrix of n rows of random vectors
Author(s)
Jim Albert
Examples
mu <- c(1,12,2)
Sigma <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
x <- rmnorm(10, mu, Sigma)
Random number generation for multivariate t
Description
Simulates from a multivariate t distribution
Usage
rmt(n = 1, mean = rep(0, d), S, df = Inf)
Arguments
n |
number of random numbers to be generated |
mean |
numeric vector giving the location parameter of the distribution |
S |
a positive definite matrix representing the scale matrix of the distribution |
df |
degrees of freedom |
Value
matrix of n rows of random vectors
Author(s)
Jim Albert
Examples
mu <- c(1,12,2)
Sigma <- matrix(c(1,2,0,2,5,0.5,0,0.5,3), 3, 3)
df <- 4
x <- rmt(10, mu, Sigma, df)
Gibbs sampling for a robust regression model
Description
Implements Gibbs sampling for a robust t sampling model with location mu, scale sigma, and degrees of freedom v
Usage
robustt(y,v,m)
Arguments
y |
vector of data values |
v |
degrees of freedom for t model |
m |
the number of cycles of the Gibbs sampler |
Value
mu |
vector of simulated values of mu |
s2 |
vector of simulated values of sigma2 |
lam |
matrix of simulated draws of lambda, where each row corresponds to a single draw |
Author(s)
Jim Albert
Examples
data=c(-67,-48,6,8,14,16,23,24,28,29,41,49,67,60,75)
fit=robustt(data,4,1000)
Simulates from a truncated probability distribution
Description
Simulates a sample from a truncated distribution where the functions for the cdf and inverse cdf are available.
Usage
rtruncated(n,lo,hi,pf,qf,...)
Arguments
n |
size of simulated sample |
lo |
low truncation point |
hi |
high truncation point |
pf |
function containing cdf of untruncated distribution |
qf |
function containing inverse cdf of untruncated distribution |
... |
parameters used in the functions pf and qf |
Value
vector of simulated draws from distribution
Author(s)
Jim Albert
Examples
# want a sample of 10 from normal(2, 1) distribution truncated below by 3
n=10
lo=3
hi=Inf
rtruncated(n,lo,hi,pnorm,qnorm,mean=2,sd=1)
# want a sample of 20 from beta(2, 5) distribution truncated to (.3, .8)
n=20
lo=0.3
hi=0.8
rtruncated(n,lo,hi,pbeta,qbeta,2,5)
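Truncated sampling with a cdf and inverse cdf is usually done by the inversion method: draw a uniform between the cdf values at the truncation points, then map it back through the inverse cdf. A minimal sketch; `rtruncated_sketch` is a hypothetical name, not the package source.

```r
# Hypothetical sketch of truncated sampling by inversion.
rtruncated_sketch <- function(n, lo, hi, pf, qf, ...) {
  u <- runif(n, min = pf(lo, ...), max = pf(hi, ...))
  qf(u, ...)
}

set.seed(1)
# beta(2, 5) truncated to (0.3, 0.8)
b <- rtruncated_sketch(20, 0.3, 0.8, pbeta, qbeta, 2, 5)
# normal(2, 1) truncated below at 3
z <- rtruncated_sketch(10, 3, Inf, pnorm, qnorm, mean = 2, sd = 1)
```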
Random walk Metropolis algorithm of a posterior distribution
Description
Simulates iterates of a random walk Metropolis chain for an arbitrary real-valued posterior density defined by the user
Usage
rwmetrop(logpost,proposal,start,m,...)
Arguments
logpost |
function defining the log posterior density |
proposal |
a list containing var, an estimated variance-covariance matrix, and scale, the Metropolis scale factor |
start |
vector containing the starting value of the parameter |
m |
the number of iterations of the chain |
... |
data that is used in the function logpost |
Value
par |
a matrix of simulated values where each row corresponds to a value of the vector parameter |
accept |
the acceptance rate of the algorithm |
Author(s)
Jim Albert
Examples
data=c(6,2,3,10)
varcov=diag(c(1,1))
proposal=list(var=varcov,scale=2)
start=array(c(1,1),c(1,2))
m=1000
s=rwmetrop(logctablepost,proposal,start,m,data)
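The random walk chain can be sketched in base R. This is an illustration under stated assumptions, not the package source: `rw_metrop_sketch` and `log_target` are hypothetical names, and the increments are taken to be normal with covariance scale^2 * var.

```r
# Hypothetical base-R sketch of a random walk Metropolis chain.
rw_metrop_sketch <- function(logpost, proposal, start, m, ...) {
  d <- length(start)
  R <- chol(proposal$var)                   # var = t(R) %*% R
  par <- matrix(NA_real_, m, d)
  cur <- as.vector(start)
  cur_lp <- logpost(cur, ...)
  n_accept <- 0
  for (i in seq_len(m)) {
    # candidate = current value plus a scaled normal increment
    cand <- cur + proposal$scale * as.vector(t(R) %*% rnorm(d))
    cand_lp <- logpost(cand, ...)
    # symmetric proposal, so the acceptance ratio uses the target only
    if (log(runif(1)) < cand_lp - cur_lp) {
      cur <- cand
      cur_lp <- cand_lp
      n_accept <- n_accept + 1
    }
    par[i, ] <- cur
  }
  list(par = par, accept = n_accept / m)
}

# Toy target (assumption): independent standard normals
set.seed(1)
log_target <- function(th) sum(dnorm(th, log = TRUE))
s <- rw_metrop_sketch(log_target, list(var = diag(2), scale = 2), c(1, 1), 500)
```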
Batting data for Mike Schmidt
Description
Batting statistics for the baseball player Mike Schmidt during all the seasons of his career.
Usage
schmidt
Format
A data frame with 18 observations on the following 14 variables.
- Year
year of the season
- Age
Schmidt's age that season
- G
games played
- AB
at-bats
- R
runs scored
- H
number of hits
- X2B
number of doubles
- X3B
number of triples
- HR
number of home runs
- RBI
number of runs batted in
- SB
number of stolen bases
- CS
number of times caught stealing
- BB
number of walks
- SO
number of strikeouts
Source
Sean Lahman's baseball database from www.baseball1.com.
Simulated draws from a bivariate density function on a grid
Description
For a general two parameter density defined on a grid, simulates a random sample.
Usage
simcontour(logf,limits,data,m)
Arguments
logf |
function that defines the logarithm of the density |
limits |
limits (xlo, xhi, ylo, yhi) that cover the joint probability density |
data |
vector or list of parameters associated with the function logf |
m |
size of simulated sample |
Value
x |
vector of simulated draws of the first parameter |
y |
vector of simulated draws of the second parameter |
Author(s)
Jim Albert
Examples
m=array(c(0,0),c(2,1))
v=array(c(1,.6,.6,1),c(2,2))
normpar=list(m=m,v=v)
s=simcontour(lbinorm,c(-4,4,-4,4),normpar,1000)
plot(s$x,s$y)
Sampling importance resampling
Description
Implements sampling importance resampling for a multivariate t proposal density.
Usage
sir(logf,tpar,n,data)
Arguments
logf |
function defining logarithm of density of interest |
tpar |
list of parameters of multivariate t proposal density including the mean m, the scale matrix var, and the degrees of freedom df |
n |
number of simulated draws from the posterior |
data |
data and parameters used in the function logf |
Value
matrix of simulated draws from the posterior where each row corresponds to a single draw
Author(s)
Jim Albert
Examples
data(cancermortality)
start=c(-7,6)
fit=laplace(betabinexch,start,cancermortality)
tpar=list(m=fit$mode,var=2*fit$var,df=4)
theta=sir(betabinexch,tpar,1000,cancermortality)
Hitting statistics for ten great baseball players
Description
Career hitting statistics for ten great baseball players
Usage
sluggerdata
Format
A data frame with 199 observations on the following 13 variables.
- Player
name of the ballplayer
- Year
season played
- Age
age of the player during the season
- G
games played
- AB
number of at-bats
- R
number of runs scored
- H
number of hits
- X2B
number of doubles
- X3B
number of triples
- HR
number of home runs
- RBI
runs batted in
- BB
number of bases on balls
- SO
number of strikeouts
Source
Sean Lahman's baseball database from www.baseball1.com.
Goals scored by professional soccer team
Description
Number of goals scored by a single professional soccer team during the 2006 Major League Soccer season
Usage
soccergoals
Format
A data frame with 35 observations on the following 1 variable.
- goals
number of goals scored
Source
Collected by author from the www.espn.com website.
Data from Stanford Heart Transplantation Program
Description
Heart transplant data for 82 patients from the Stanford Heart Transplantation Program
Usage
stanfordheart
Format
A data frame with 82 observations on the following 4 variables.
- survtime
survival time in months
- transplant
variable that is 1 or 0 if patient had transplant or not
- timetotransplant
time a transplant patient waits for operation
- state
variable that is 1 or 0 if time is censored or not
Source
Turnbull, B., Brown, B. and Hu, M. (1974), Survivorship analysis of heart transplant data, Journal of the American Statistical Association, 69, 74-80.
Baseball strikeout data
Description
For all professional baseball players in the 2004 season, the dataset gives the number of strikeouts and at-bats when runners are in scoring position and when runners are not in scoring position.
Usage
strikeout
Format
A data frame with 438 observations on the following 4 variables.
- r
number of strikeouts of player when runners are not in scoring position
- n
number of at-bats of player when runners are not in scoring position
- s
number of strikeouts of player when runners are in scoring position
- m
number of at-bats of player when runners are in scoring position
Source
Collected from www.espn.com website.
Student dataset
Description
Answers to a sheet of questions given to a large number of students in introductory statistics classes
Usage
studentdata
Format
A data frame with 657 observations on the following 11 variables.
- Student
student number
- Height
height in inches
- Gender
gender
- Shoes
number of pairs of shoes owned
- Number
number chosen between 1 and 10
- Dvds
number of movie DVDs owned
- ToSleep
time the person went to sleep the previous night (hours past midnight)
- WakeUp
time the person woke up the next morning
- Haircut
cost of last haircut including tip
- Job
number of hours working on a job per week
- Drink
usual drink at suppertime among milk, water, and pop
Source
Collected by the author during the Fall 2006 semester.
Log posterior of a Pareto model for survival data
Description
Computes the log posterior density of (log tau, log lambda, log p) for a Pareto model for survival data
Usage
transplantpost(theta,data)
Arguments
theta |
vector of parameter values of log tau, log lambda, and log p |
data |
data matrix with columns survival time, transplant indicator, time to transplant, and censoring indicator |
Value
value of the log posterior
Author(s)
Jim Albert
Examples
data(stanfordheart)
theta=c(0,3,-1)
transplantpost(theta,stanfordheart)
Plot of prior, likelihood and posterior for a proportion
Description
For a proportion problem with a beta prior, plots the prior, likelihood and posterior on one graph.
Usage
triplot(prior,data,where="topright")
Arguments
prior |
vector of parameters for beta prior |
data |
vector consisting of number of successes and number of failures |
where |
the location of the legend for the plot |
Author(s)
Jim Albert
Examples
prior=c(3,10) # proportion has a beta(3, 10) prior
data=c(10,6) # observe 10 successes and 6 failures
triplot(prior,data)
Log posterior of a Weibull proportional odds model for survival data
Description
Computes the log posterior density of (log sigma, mu, beta) for a Weibull proportional odds regression model
Usage
weibullregpost(theta,data)
Arguments
theta |
vector of parameter values log sigma, mu, and beta |
data |
data matrix with columns survival time, censoring variable, and covariate matrix |
Value
value of the log posterior
Author(s)
Jim Albert
Examples
data(chemotherapy)
# build the data matrix without attaching the data frame to the search path
d=with(chemotherapy,cbind(time,status,treat-1,age))
theta=c(-.6,11,.6,0)
weibullregpost(theta,d)