Title: | MCMC Generalised Linear Mixed Models |
---|---|
Description: | Fits Multivariate Generalised Linear Mixed Models (and related models) using Markov chain Monte Carlo techniques (Hadfield 2010 J. Stat. Soft.). |
Authors: | Jarrod Hadfield |
Maintainer: | Jarrod Hadfield <[email protected]> |
License: | GPL (>= 2) |
Version: | 2.36 |
Built: | 2024-12-20 14:46:09 UTC |
Source: | https://github.com/jarrodhadfield/mcmcglmm |
MCMCglmm is a package for fitting Generalised Linear Mixed Models using Markov chain Monte Carlo techniques (Hadfield 2009). Most commonly used distributions like the normal and the Poisson are supported together with some useful but less popular ones like the zero-inflated Poisson and the multinomial. Missing values and left, right and interval censoring are accommodated for all traits. The package also supports multi-trait models where the multiple responses can follow different types of distribution. The package allows various residual and random-effect variance structures to be specified including heterogeneous variances, unstructured covariance matrices and random regression (e.g. random slope models). Three special types of variance structure that can be specified are those associated with pedigrees (animal models), phylogenies (the comparative method) and measurement error (meta-analysis).
The package makes heavy use of results in Sorensen & Gianola (2002) and Davis (2006) which taken together result in what is hopefully a fast and efficient routine. Most small to medium sized problems should take seconds to a few minutes, but large problems (> 20,000 records) are possible. My interest is in evolutionary biology so there are also several functions for applying Rice's (2004) tensor analysis to real data and functions for visualising and comparing matrices.
Please read the tutorial vignette("Tutorial", "MCMCglmm")
or the course notes
vignette("CourseNotes", "MCMCglmm")
Jarrod Hadfield [email protected]
Hadfield, J.D. (2009) MCMC methods for Multi-response Generalised Linear Mixed Models: The MCMCglmm R Package, submitted
Sorensen, D. & Gianola, D. (2002) Likelihood, Bayesian and MCMC Methods in Quantitative Genetics, Springer
Davis, T.A. (2006) Direct Methods for Sparse Linear Systems, SIAM
Rice (2004) Evolutionary Theory: Mathematical and Conceptual Foundations, Sinauer
Incidence matrix of levels within a factor
at.level(x, level)
at.level(x, level)
x |
factor |
level |
factor level |
incidence matrix for level in x
Jarrod Hadfield [email protected]
fac<-gl(3,10,30, labels=letters[1:3]) x<-rnorm(30) model.matrix(~at.level(fac,"b"):x)
fac<-gl(3,10,30, labels=letters[1:3]) x<-rnorm(30) model.matrix(~at.level(fac,"b"):x)
Incidence Matrix of Combined Levels within a Factor
at.set(x, level)
at.set(x, level)
x |
factor |
level |
set of factor levels |
incidence matrix for the set level
in x
Jarrod Hadfield [email protected]
fac<-gl(3,10,30, labels=letters[1:3]) x<-rnorm(30) model.matrix(~at.set(fac,2:3):x)
fac<-gl(3,10,30, labels=letters[1:3]) x<-rnorm(30) model.matrix(~at.set(fac,2:3):x)
Blue Tit (Cyanistes caeruleus) Data for a Quantitative Genetic Experiment
data(BTdata)
data(BTdata)
a data frame with 828 rows and 7 columns, with variables tarsus length (tarsus
) and colour (back
) measured on 828 individuals (animal
). The mother of each is also recorded (dam
) together with the foster nest (fosternest
) in which the chicks were reared. The date on which the first egg in each nest hatched (hatchdate
) is recorded together with the sex (sex
) of the individuals.
Hadfield, J.D. et. al. 2007 Journal of Evolutionary Biology 20 549-557
Blue Tit (Cyanistes caeruleus) Pedigree
data(BTped)
data(BTped)
a data frame with 1040 rows and 3 columns, with individual identifier (animal
) mother identifier (dam
) and father identifier (sire
). The first 212 rows are the parents of the 828 offspring from 106 full-sibling families. Parents are assumed to be unrelated to each other and have NA's in the dam
and sire
column.
Hadfield, J.D. et. al. 2007 Journal of Evolutionary Biology 20 549-557
Forms the expected covariance structure of link-scale observations for GLMMs fitted with MCMCglmm
buildV(object, marginal=object$Random$formula, diag=TRUE, it=NULL, posterior="mean", ...)
buildV(object, marginal=object$Random$formula, diag=TRUE, it=NULL, posterior="mean", ...)
object |
an object of class |
marginal |
formula defining random effects to be maginalised |
diag |
logical; if |
it |
integer; optional, MCMC iteration on which covariance structure should be based |
posterior |
character; if |
... |
Further arguments to be passed |
If diag=TRUE
an n by n covariance matrix. If diag=FALSE
and posterior!="all"
an 1 by n matrix of variances. If posterior=="all"
an nit by n matrix of variances (where nit is the number of saved MCMC iterations).
Jarrod Hadfield [email protected]
Forms an mn x mn commutation matrix which transforms into
, where
is an m x n matrix
commutation(m, n)
commutation(m, n)
m |
integer; number of rows of A |
n |
integer; number of columns of A |
Commutation Matrix
Jarrod Hadfield [email protected]
Magnus, J. R. & Neudecker, H. (1979) Annals of Statistics 7 (2) 381-394
commutation(2,2)
commutation(2,2)
Density of a (conditional) multivariate normal variate
dcmvnorm(x, mean = 0, V = 1, keep=1, cond=(1:length(x))[-keep], log=FALSE)
dcmvnorm(x, mean = 0, V = 1, keep=1, cond=(1:length(x))[-keep], log=FALSE)
x |
vector of observations |
mean |
vector of means |
V |
covariance matrix |
keep |
vector of integers: observations for which density is required |
cond |
vector of integers: observations to condition on |
log |
if TRUE, density p is given as log(p) |
numeric
Jarrod Hadfield [email protected]
V1<-cbind(c(1,0.5), c(0.5,1)) dcmvnorm(c(0,2), c(0,0), V=V1, keep=1, cond=2) # density of x[1]=0 conditional on x[2]=2 given # x ~ MVN(c(0,0), V1) dcmvnorm(c(0,2), c(0,0), V=V1, keep=1, cond=NULL) # density of x[1]=0 marginal to x[2] dnorm(0,0,1) # same as univariate density V2<-diag(2) dcmvnorm(c(0,2), c(0,0), V=V2, keep=1, cond=2) # density of x[1]=0 conditional on x[2]=2 given # x ~ MVN(c(0,0), V2) dnorm(0,0,1) # same as univariate density because V2 is diagonal
V1<-cbind(c(1,0.5), c(0.5,1)) dcmvnorm(c(0,2), c(0,0), V=V1, keep=1, cond=2) # density of x[1]=0 conditional on x[2]=2 given # x ~ MVN(c(0,0), V1) dcmvnorm(c(0,2), c(0,0), V=V1, keep=1, cond=NULL) # density of x[1]=0 marginal to x[2] dnorm(0,0,1) # same as univariate density V2<-diag(2) dcmvnorm(c(0,2), c(0,0), V=V2, keep=1, cond=2) # density of x[1]=0 conditional on x[2]=2 given # x ~ MVN(c(0,0), V2) dnorm(0,0,1) # same as univariate density because V2 is diagonal
Calculates Ovaskainen's (2008) d-divergence between 2 zero-mean multivariate normal distributions.
Ddivergence(CA=NULL, CB=NULL, n=10000)
Ddivergence(CA=NULL, CB=NULL, n=10000)
CA |
Matrix A |
CB |
Matrix B |
n |
number of Monte Carlo samples for approximating the integral |
d-divergence
In versions of MCMCglmm <2.26 Ovaskainen's (2008) d-divergence was incorrectly calculated.
Jarrod Hadfield [email protected]
Ovaskainen, O. et. al. (2008) Proc. Roy. Soc - B (275) 1635 593-750
CA<-rIW(diag(2),10, n=1) CB<-rIW(diag(2),10, n=1) Ddivergence(CA, CB)
CA<-rIW(diag(2),10, n=1) CB<-rIW(diag(2),10, n=1) Ddivergence(CA, CB)
Unevaluated expressions for (mixed) partial derivatives of fitness with respect to linear predictors for survival and fecundity.
Dexpressions
Dexpressions
PW.d0W |
Fitness (W) function for the Poisson-Weibull (PW) model. |
PW.d1Wds |
First Partial derivative of fitness (d1W) with respect to survival (d1s) linear predictor for the Poisson-Weibull (PW) model. |
PW.d1Wdf |
First Partial derivative of fitness (d1W) with respect to fecundity (d1f) linear redictor for the Poisson-Weibull (PW) model. |
PW.d3Wd2sd1f |
Mixed third partial derivative of fitness (d3W) with 2nd derivative of survival linear predictor (d2s) and first derivative of fecundity linear predictor (d1f) from the Poisson-Weibull (PW) model. |
PW.d3Wdsd2f |
and so on ... |
PW.d2Wd2f |
|
PW.d2Wd2s |
|
PW.d3Wd3s |
|
PW.d3Wd3f |
Jarrod Hadfield [email protected]
Forms tensor of (mixed) partial derivatives
Dtensor(expr, name=NULL, mu = NULL, m=1, evaluate = TRUE)
Dtensor(expr, name=NULL, mu = NULL, m=1, evaluate = TRUE)
expr |
'expression' |
name |
character vector, giving the variable names with respect to which derivatives will be computed. If NULL all variables in the expression will be used |
mu |
optional: numeric vector, at which the derivatives are evaluated |
m |
order of derivative |
evaluate |
logical; if |
Dtensor |
(list) of unevaluated expression(s) if |
Jarrod Hadfield [email protected]
Rice, S.H. (2004) Evolutionary Theory: Mathematical and Conceptual Foundations. Sinauer (MA) USA.
f<-expression(beta_1 + time * beta_2 + u) Dtensor(f,eval=FALSE)
f<-expression(beta_1 + time * beta_2 + u) Dtensor(f,eval=FALSE)
Evaluates a list of (mixed) partial derivatives
evalDtensor(x, mu, m=1)
evalDtensor(x, mu, m=1)
x |
unevaluated (list) of expression(s) |
mu |
values at which the derivatives are evaluated: names need to match terms in x |
m |
order of derivative |
tensor
Jarrod Hadfield [email protected]
f<-expression(beta_1 + time*beta_2+u) Df<-Dtensor(f, eval=FALSE, m=2) evalDtensor(Df, mu=data.frame(beta_1=0.5, beta_2=1, time=3, u=2.3)) Dtensor(f, mu=c(1,3,1,2.3), m=2)
f<-expression(beta_1 + time*beta_2+u) Df<-Dtensor(f, eval=FALSE, m=2) evalDtensor(Df, mu=data.frame(beta_1=0.5, beta_2=1, time=3, u=2.3)) Dtensor(f, mu=c(1,3,1,2.3), m=2)
Prior Covariance Matrix for Fixed Effects.
gelman.prior(formula, data, scale=1, intercept=scale, singular.ok=FALSE)
gelman.prior(formula, data, scale=1, intercept=scale, singular.ok=FALSE)
formula |
|
data |
|
intercept |
prior standard deviation for the intercept |
scale |
prior standard deviation for regression parameters |
singular.ok |
logical: if |
Gelman et al. (2008) suggest that the input variables of a categorical regression are standardised and that the associated regression parameters are assumed independent in the prior. Gelman et al. (2008) recommend a scaled t-distribution with a single degree of freedom (scaled Cauchy) and a scale of 10 for the intercept and 2.5 for the regression parameters. If the degree of freedom is infinity (i.e. a normal distribution) then a prior covariance matrix B$V
can be defined for the regression parameters without input standardisation that corresponds to a diagonal prior for the regression parameters had the inputs been standardised. The diagonal elements of
are set to
scale^2
except the first which is set to intercept^2
. With logistic regression gives a prior that is approximately flat on the probability scale, where
is the total variance due to the random effects. For probit regression it is
.
prior covariance matrix
Jarrod Hadfield [email protected]
Gelman, A. et al. (2008) The Annals of Appled Statistics 2 4 1360-1383
dat<-data.frame(y=c(0,0,1,1), x=gl(2,2)) # data with complete separation ##################### # probit regression # ##################### prior1<-list( B=list(mu=c(0,0), V=gelman.prior(~x, data=dat, scale=sqrt(1+1))), R=list(V=1,fix=1)) m1<-MCMCglmm(y~x, prior=prior1, data=dat, family="ordinal", verbose=FALSE) c2<-1 p1<-pnorm(m1$Sol[,1]/sqrt(1+c2)) # marginal probability when x=1 ####################### # logistic regression # ####################### prior2<-list(B=list(mu=c(0,0), V=gelman.prior(~x, data=dat, scale=sqrt(pi^2/3+1))), R=list(V=1,fix=1)) m2<-MCMCglmm(y~x, prior=prior2, data=dat, family="categorical", verbose=FALSE) c2 <- (16 * sqrt(3)/(15 * pi))^2 p2<-plogis(m2$Sol[,1]/sqrt(1+c2)) # marginal probability when x=1 plot(mcmc.list(p1,p2))
dat<-data.frame(y=c(0,0,1,1), x=gl(2,2)) # data with complete separation ##################### # probit regression # ##################### prior1<-list( B=list(mu=c(0,0), V=gelman.prior(~x, data=dat, scale=sqrt(1+1))), R=list(V=1,fix=1)) m1<-MCMCglmm(y~x, prior=prior1, data=dat, family="ordinal", verbose=FALSE) c2<-1 p1<-pnorm(m1$Sol[,1]/sqrt(1+c2)) # marginal probability when x=1 ####################### # logistic regression # ####################### prior2<-list(B=list(mu=c(0,0), V=gelman.prior(~x, data=dat, scale=sqrt(pi^2/3+1))), R=list(V=1,fix=1)) m2<-MCMCglmm(y~x, prior=prior2, data=dat, family="categorical", verbose=FALSE) c2 <- (16 * sqrt(3)/(15 * pi))^2 p2<-plogis(m2$Sol[,1]/sqrt(1+c2)) # marginal probability when x=1 plot(mcmc.list(p1,p2))
Henderson (1976) and Meuwissen and Luo (1992) algorithm for inverting relatedness matrices, and Hadfield and Nakagawa (2010) algorithm for inverting phylogenetic covariance matrices.
inverseA(pedigree=NULL, nodes="ALL", scale=TRUE, reduced=FALSE, tol = .Machine$double.eps^0.5)
inverseA(pedigree=NULL, nodes="ALL", scale=TRUE, reduced=FALSE, tol = .Machine$double.eps^0.5)
pedigree |
ordered pedigree with 3 columns: id, dam and sire, or a
|
nodes |
|
scale |
logical: should a phylogeny (needs to be ultrametric) be scaled to unit length (distance from root to tip)? |
reduced |
logical: should childless nodes be dropped from the inverse and the pedigree/phylogeny representation be reduced? |
tol |
numeric: differences in branch length smaller than this are ignored when assessing whether a tree is ultrametric. |
Ainv |
inverse as |
inbreeding |
inbreeding coefficients/branch lengths |
pedigree |
pedigree/pedigree representation of phylogeny |
Jarrod Hadfield [email protected]
Henderson, C.R. (1976) Biometrics 32 (1) 69:83
Quaas, R. L. and Pollak, E. J. (1980) Journal of Animal Science 51:1277-1287.
Meuwissen, T.H.E and Luo, Z. (1992) Genetic Selection Evolution 24 (4) 305:313
Hadfield, J.D. and Nakagawa, S. (2010) Journal of Evolutionary Biology 23 494-508
data(bird.families) Ainv<-inverseA(bird.families)
data(bird.families) Ainv<-inverseA(bird.families)
Forms a tensor of (mixed) central moments of a multivariate normal distribution
knorm(V, k)
knorm(V, k)
V |
(co)variance matrix |
k |
kth central moment, must be even |
tensor
Jarrod Hadfield [email protected]
Schott, J.R.(2003) Journal of Multivariate Analysis 87 (1) 177-190
V<-diag(2) knorm(V,2) knorm(V,4)
V<-diag(2) knorm(V,2) knorm(V,4)
Forms an mk x mk Kronecker Product Permutation Matrix
KPPM(m, k)
KPPM(m, k)
m |
integer |
k |
integer |
Matrix
Jarrod Hadfield [email protected]
Schott, J.R.(2003) Journal of Multivariate Analysis 87 (1) 177-190
KPPM(2,3)
KPPM(2,3)
Calculates statistics of Krzanowski's comparison of subspaces.
krzanowski.test(CA, CB, vecsA, vecsB, corr = FALSE, ...)
krzanowski.test(CA, CB, vecsA, vecsB, corr = FALSE, ...)
CA |
Matrix A |
CB |
Matrix B |
vecsA |
Vector of integers indexing the eigenvectors determining the subspace of A |
vecsB |
Vector of integers indexing the eigenvectors determining the subspace of B |
corr |
logical; if |
... |
further arguments to be passed |
sumofS |
metric for overall similarity with 0 indicting no similarity and
a value of |
angles |
angle in degrees between each best matched pair of vectors |
bisector |
vector that lies between each best matched pair of vectors |
Jarrod Hadfield [email protected]
Krzanowski, W.J. (2000) Principles of Multivariate Analysis. OUP
CA<-rIW(diag(5),10, n=1) CB<-rIW(diag(5),10, n=1) krzanowski.test(CA, CB, vecsA=1:2, vecsB=1:2) krzanowski.test(CA, CA, vecsA=1:2, vecsB=1:2)
CA<-rIW(diag(5),10, n=1) CB<-rIW(diag(5),10, n=1) krzanowski.test(CA, CB, vecsA=1:2, vecsB=1:2) krzanowski.test(CA, CA, vecsA=1:2, vecsB=1:2)
Returns the central moments of a uniform distribution
kunif(min, max, k)
kunif(min, max, k)
min , max
|
lower and upper limits of the distribution. Must be finite. |
k |
k central moment, must be even |
kth central moment
Jarrod Hadfield [email protected]
kunif(-1,1,4) y<-runif(1000,-1,1) mean((y-mean(y))^4)
kunif(-1,1,4) y<-runif(1000,-1,1) mean((y-mean(y))^4)
Forms a block-diagonal matrix from a list of matrices
list2bdiag(x)
list2bdiag(x)
x |
list of square matrices |
matrix
Jarrod Hadfield [email protected]
M<-list(rIW(diag(3), 10), rIW(diag(2), 10)) list2bdiag(M)
M<-list(rIW(diag(3), 10), rIW(diag(2), 10)) list2bdiag(M)
Markov chain Monte Carlo Sampler for Multivariate Generalised Linear Mixed
Models with special emphasis on correlated random effects arising from pedigrees
and phylogenies (Hadfield 2010). Please read the course notes: vignette("CourseNotes",
"MCMCglmm")
or the overview vignette("Overview", "MCMCglmm")
MCMCglmm(fixed, random=NULL, rcov=~units, family="gaussian", mev=NULL, data,start=NULL, prior=NULL, tune=NULL, pedigree=NULL, nodes="ALL", scale=TRUE, nitt=13000, thin=10, burnin=3000, pr=FALSE, pl=FALSE, verbose=TRUE, DIC=TRUE, singular.ok=FALSE, saveX=TRUE, saveZ=TRUE, saveXL=TRUE, slice=FALSE, ginverse=NULL, trunc=FALSE, theta_scale=NULL, saveWS=TRUE, aggregate=NULL)
MCMCglmm(fixed, random=NULL, rcov=~units, family="gaussian", mev=NULL, data,start=NULL, prior=NULL, tune=NULL, pedigree=NULL, nodes="ALL", scale=TRUE, nitt=13000, thin=10, burnin=3000, pr=FALSE, pl=FALSE, verbose=TRUE, DIC=TRUE, singular.ok=FALSE, saveX=TRUE, saveZ=TRUE, saveXL=TRUE, slice=FALSE, ginverse=NULL, trunc=FALSE, theta_scale=NULL, saveWS=TRUE, aggregate=NULL)
fixed |
|
random |
|
rcov |
|
family |
optional character vector of trait distributions. Currently,
|
mev |
optional vector of measurement error variances for each data point for random effect meta-analysis. |
data |
|
start |
optional list having 5 possible elements:
|
prior |
optional list of prior specifications having 4 possible elements:
|
tune |
optional list with elements |
pedigree |
ordered pedigree with 3 columns id, dam and sire or a
|
nodes |
pedigree/phylogeny nodes to be estimated. The default,
|
scale |
logical: should the phylogeny (needs to be ultrametric) be scaled to unit length (distance from root to tip)? |
nitt |
number of MCMC iterations |
thin |
thinning interval |
burnin |
burnin |
pr |
logical: should the posterior distribution of random effects be saved? |
pl |
logical: should the posterior distribution of latent variables be saved? |
verbose |
logical: if |
DIC |
logical: if |
singular.ok |
logical: if |
saveX |
logical: save fixed effect design matrix |
saveZ |
logical: save random effect design matrix |
saveXL |
logical: save structural parameter design matrix |
slice |
logical: should slice sampling be used? Only applicable for binary traits with independent residuals |
ginverse |
a list of sparse inverse matrices ( |
trunc |
logical: should latent variables in binary models be truncated to prevent under/overflow (+/-20 for categorical/multinomial models and +/-7 for threshold/probit models)? |
theta_scale |
optional list of 4 possible elements specifying a set of location effects (fixed or random) that are to be scaled by the parameter |
saveWS |
logical: save design matrix for scaled effects. |
aggregate |
character: names of long-format variables to be formed from wide-format variables in multi-trait models. Wide-format variables should have the long-format variable name followed by an underscore and then the trait name. For example, a wide-format data frame might have variable names |
Sol |
Posterior Distribution of MME solutions, including fixed effects |
VCV |
Posterior Distribution of (co)variance matrices |
CP |
Posterior Distribution of cut-points from an ordinal model |
Liab |
Posterior Distribution of latent variables |
Fixed |
list: fixed formula and number of fixed effects |
Random |
list: random formula, dimensions of each covariance matrix, number of levels per covariance matrix, and term in random formula to which each covariance belongs |
Residual |
list: residual formula, dimensions of each covariance matrix, number of levels per covariance matrix, and term in residual formula to which each covariance belongs |
Deviance |
deviance -2*log(p(y|...)) |
DIC |
deviance information criterion |
X |
sparse fixed effect design matrix |
Z |
sparse random effect design matrix |
XL |
sparse structural parameter design matrix |
error.term |
residual term for each datum |
family |
distribution of each datum |
Tune |
(co)variance matrix of the proposal distribution for the latent variables |
meta |
logical; was |
Wscale |
sparse design matrix for scaled terms. |
Jarrod Hadfield [email protected]
General analyses: Hadfield, J.D. (2010) Journal of Statistical Software 33 2 1-22
Phylogenetic analyses: Hadfield, J.D. & Nakagawa, S. (2010) Journal of Evolutionary Biology 23 494-508
Background Sorensen, D. & Gianola, D. (2002) Springer
# Example 1: univariate Gaussian model with standard random effect data(PlodiaPO) model1<-MCMCglmm(PO~1, random=~FSfamily, data=PlodiaPO, verbose=FALSE, nitt=1300, burnin=300, thin=1) summary(model1) # Example 2: univariate Gaussian model with phylogenetically correlated # random effect data(bird.families) phylo.effect<-rbv(bird.families, 1, nodes="TIPS") phenotype<-phylo.effect+rnorm(dim(phylo.effect)[1], 0, 1) # simulate phylogenetic and residual effects with unit variance test.data<-data.frame(phenotype=phenotype, taxon=row.names(phenotype)) Ainv<-inverseA(bird.families)$Ainv # inverse matrix of shared phyloegnetic history prior<-list(R=list(V=1, nu=0.002), G=list(G1=list(V=1, nu=0.002))) model2<-MCMCglmm(phenotype~1, random=~taxon, ginverse=list(taxon=Ainv), data=test.data, prior=prior, verbose=FALSE, nitt=1300, burnin=300, thin=1) plot(model2$VCV)
# Example 1: univariate Gaussian model with standard random effect data(PlodiaPO) model1<-MCMCglmm(PO~1, random=~FSfamily, data=PlodiaPO, verbose=FALSE, nitt=1300, burnin=300, thin=1) summary(model1) # Example 2: univariate Gaussian model with phylogenetically correlated # random effect data(bird.families) phylo.effect<-rbv(bird.families, 1, nodes="TIPS") phenotype<-phylo.effect+rnorm(dim(phylo.effect)[1], 0, 1) # simulate phylogenetic and residual effects with unit variance test.data<-data.frame(phenotype=phenotype, taxon=row.names(phenotype)) Ainv<-inverseA(bird.families)$Ainv # inverse matrix of shared phyloegnetic history prior<-list(R=list(V=1, nu=0.002), G=list(G1=list(V=1, nu=0.002))) model2<-MCMCglmm(phenotype~1, random=~taxon, ginverse=list(taxon=Ainv), data=test.data, prior=prior, verbose=FALSE, nitt=1300, burnin=300, thin=1) plot(model2$VCV)
Sets up design matrix for measurement error models.
me(formula, error=NULL, group=NULL, type="classical")
me(formula, error=NULL, group=NULL, type="classical")
formula |
|
error |
character; name of column in |
group |
name of column in |
type |
character; one of |
design matrix, with a prior distribution attribute
Jarrod Hadfield [email protected]
Forms design matrices for multiple membership models
mult.memb(formula)
mult.memb(formula)
formula |
formula |
Currently mult.memb
can only usefully be used inside an idv
variance function. The formula usually contains serveral factors that have the same factor levels.
design matrix
Jarrod Hadfield [email protected]
fac1<-factor(sample(letters[1:3], 5, TRUE), levels=letters[1:3]) fac2<-factor(sample(letters[1:3], 5, TRUE), levels=letters[1:3]) cbind(fac1, fac2) mult.memb(~fac1+fac2)
fac1<-factor(sample(letters[1:3], 5, TRUE), levels=letters[1:3]) fac2<-factor(sample(letters[1:3], 5, TRUE), levels=letters[1:3]) cbind(fac1, fac2) mult.memb(~fac1+fac2)
Forms design matrix for path analyses that involve paths within residual blocks
path(cause=NULL, effect=NULL, k)
path(cause=NULL, effect=NULL, k)
cause |
integer; index of predictor ‘trait’ within residual block |
effect |
integer; index of response ‘trait’ within residual block |
k |
integer; dimension of residual block |
design matrix
For more general path anlaytic models see sir which allows paths to exist between responses that are not in the same residual block. However, sir
does not handle non-Gaussian or missing responses. Note that path models involving non-Gaussian data are defined on the link scale which may not always be appropriate.
Jarrod Hadfield [email protected]
path(1, 2,2)
path(1, 2,2)
Calculates the probability that all categories in a multinomial have a non-zero count.
pkk(prob, size)
pkk(prob, size)
prob |
numeric non-negative vector of length K, specifying the probability for the K classes; is internally normalized to sum 1. Infinite and missing values are not allowed. |
size |
integer, say N, specifying the total number of objects that are put into K boxes in the typical multinomial experiment. |
probability that there is at least one object in each of the K boxes
Jarrod Hadfield [email protected]
p<-runif(4) pkk(p, 10)
p<-runif(4) pkk(p, 10)
Phenoloxidase measures on caterpillars of the Indian meal moth (Plodia interpunctella).
data(PlodiaPO)
data(PlodiaPO)
a data frame with 511 rows and 3 columns, with variables indicating
full-sib family (FSfamily
), phenoloxidase measures (PO
), and plate
(plate
). PO has undergone a Box-Cox power transformation of 0.141
Tidbury H & Boots M (2007) University of Sheffield
Resistance of Indian meal moth (Plodia interpunctella) caterpillars to the granulosis virus PiGV.
data(PlodiaR)
data(PlodiaR)
a data frame with 50 rows and 5 columns, with variables indicating full-
sib family (FSfamly
), date of egg laying (date_laid
) and assaying
(date_Ass
), and the number of individuals from the family that were
experimentally infected with the virus Infected
and the number of those
that pupated Pupated
. These full-sib family identifiers also relate to
the full-sib family identifiers in PlodiaPO
Tidbury H & Boots M (2007) University of Sheffield
Resistance (as a binary trait) of Indian meal moth (Plodia interpunctella) caterpillars to the granulosis virus PiGV.
data(PlodiaRB)
data(PlodiaRB)
a data frame with 784 rows and 4 columns, with variables indicating full-
sib family (FSfamly
), date of egg laying (date_laid
) and assaying
(date_Ass
), and a binary variable indicating whether an individual was
resistant (Pupated
) to an experimental infection of the virus. These data
are identical to those in the data.frame PlodiaR
except each family-level
binomial variable has been expanded into a binary variable for each individual.
Tidbury H & Boots M (2007) University of Sheffield
plot
method for class "MCMCglmm"
.
## S3 method for class 'MCMCglmm' plot(x, random=FALSE, ...)
## S3 method for class 'MCMCglmm' plot(x, random=FALSE, ...)
x |
an object of class |
random |
logical; should saved random effects be plotted |
... |
Further arguments to be passed |
Jarrod Hadfield [email protected]
plot.mcmc
, MCMCglmm
Represents covariance matrices as 3-d ellipsoids using the rgl
package.
Covariance matrices of dimension greater than 3 are plotted on the subspace
defined by the first three eigenvectors.
plotsubspace(CA, CB=NULL, corr = FALSE, shadeCA = TRUE, shadeCB = TRUE, axes.lab = FALSE, ...)
plotsubspace(CA, CB=NULL, corr = FALSE, shadeCA = TRUE, shadeCB = TRUE, axes.lab = FALSE, ...)
CA |
Matrix |
CB |
Optional second matrix |
corr |
If |
shadeCA |
If |
shadeCB |
If |
axes.lab |
If |
... |
further arguments to be passed |
The matrix CA is always red, and the matrix CB if given is always blue. The subspace is defined by the first three eigenvectors of CA, and the percentage of variance for each matrix along these three dimensions is given in the plot title.
Jarrod Hadfield [email protected] with code taken from the rgl
package
if(requireNamespace("rgl")!=FALSE){ G1<-rIW(diag(4),10) G2<-G1*1.2 # plotsubspace(G1, G2, shadeCB=FALSE) # commented out because of problems with rgl }
if(requireNamespace("rgl")!=FALSE){ G1<-rIW(diag(4),10) G2<-G1*1.2 # plotsubspace(G1, G2, shadeCB=FALSE) # commented out because of problems with rgl }
Posterior distribution of ante-dependence parameters
posterior.ante(x,k=1)
posterior.ante(x,k=1)
x |
mcmc object of (co)variances stacked column-wise |
k |
order of the ante-dependence structure |
posterior ante-dependence parameters (innovation variances followed by regression ceofficients)
Jarrod Hadfield [email protected]
posterior.cor
, posterior.evals
, posterior.inverse
v<-rIW(diag(2),10, n=1000) plot(posterior.ante(mcmc(v),1))
v<-rIW(diag(2),10, n=1000) plot(posterior.ante(mcmc(v),1))
Transforms posterior distribution of covariances into correlations
posterior.cor(x)
posterior.cor(x)
x |
mcmc object of (co)variances stacked column-wise |
posterior correlation matrices
Jarrod Hadfield [email protected]
posterior.evals
, posterior.inverse
, posterior.ante
v<-rIW(diag(2),3, n=1000) hist(posterior.cor(mcmc(v))[,2])
v<-rIW(diag(2),3, n=1000) hist(posterior.cor(mcmc(v))[,2])
Posterior distribution of eigenvalues
posterior.evals(x)
posterior.evals(x)
x |
mcmc object of (co)variances stacked column-wise |
posterior eigenvalues
Jarrod Hadfield [email protected]
posterior.cor
, posterior.inverse
, posterior.ante
v<-rIW(diag(2),3, n=1000) hist(posterior.evals(mcmc(v))[,2])
v<-rIW(diag(2),3, n=1000) hist(posterior.evals(mcmc(v))[,2])
Posterior distribution of matrix inverse
posterior.inverse(x)
posterior.inverse(x)
x |
mcmc object of (co)variances stacked column-wise |
posterior of inverse (co)variance matrices
Jarrod Hadfield [email protected]
posterior.cor
, posterior.evals
, posterior.ante
V<-rIW(diag(2),3, n=1000) inv.V<-posterior.inverse(v)
V<-rIW(diag(2),3, n=1000) inv.V<-posterior.inverse(v)
Estimates the marginal parameter modes using kernel density estimation
posterior.mode(x, adjust=0.1, ...)
posterior.mode(x, adjust=0.1, ...)
x |
mcmc object |
adjust |
numeric, passed to |
... |
other arguments to be passed |
modes of the kernel density estimates
Jarrod Hadfield [email protected]
v<-rIW(as.matrix(1),10, n=1000) hist(v) abline(v=posterior.mode(mcmc(v)), col="red")
v<-rIW(as.matrix(1),10, n=1000) hist(v) abline(v=posterior.mode(mcmc(v)), col="red")
Predicted values for GLMMs fitted with MCMCglmm
## S3 method for class 'MCMCglmm' predict(object, newdata=NULL, marginal=object$Random$formula, type="response", interval="none", level=0.95, it=NULL, posterior="all", verbose=FALSE, approx="numerical", ...)
## S3 method for class 'MCMCglmm' predict(object, newdata=NULL, marginal=object$Random$formula, type="response", interval="none", level=0.95, it=NULL, posterior="all", verbose=FALSE, approx="numerical", ...)
object |
an object of class |
newdata |
An optional data frame in which to look for variables with which to predict |
marginal |
formula defining random effects to be maginalised |
type |
character; either "terms" (link scale) or "response" (data scale) |
interval |
character; either "none", "confidence" or "prediction" |
level |
A numeric scalar in the interval (0,1) giving the target probability content of the intervals. |
it |
integer; optional, MCMC iteration on which predictions should be based |
posterior |
character; should marginal posterior predictions be calculated ("all"), or should they be made conditional on the marginal posterior means ("mean") of the parameters, the posterior modes ("mode"), or a random draw from the posterior ("distribution"). |
verbose |
logical; if |
approx |
character; for distributions for which the mean cannot be calculated analytically what approximation should be used: numerical integration ( |
... |
Further arguments to be passed |
Expectation and credible interval
Jarrod Hadfield [email protected]
Diggle P, et al. (2004). Analysis of Longitudinal Data. 2nd Edition. Oxford University Press.
McCulloch CE and Searle SR (2001). Generalized, Linear and Mixed Models. John Wiley & Sons, New York.
Creates a subset of a pedigree by retaining the ancestors of a specified subset of individuals
prunePed(pedigree, keep, make.base=FALSE)
prunePed(pedigree, keep, make.base=FALSE)
pedigree |
pedigree with id in column 1 dam in column 2 and sire in column 3 |
keep |
individuals in pedigree for which the ancestors should be retained |
make.base |
logical: should ancestors that do not provide additional information be discarded? |
subsetted pedigree
If the individuals in keep
are the only phenotyped individuals for some analysis then some non-phenotyped individuals can often be discarded if they are not responsible for pedigree links between phenotyped individuals. In the simplest case (make.base=FALSE
) all ancestors of phenotyped individuals will be retained, although further pruning may be possible using make.base=TRUE
. In this case all pedigree links that do not connect phenotyped individuals are discarded resulting in some individuals becoming part of the base population. In terms of variance component and fixed effect estimation pruning the pedigree should have no impact on the target posterior distribution, although convergence and mixing may be better because there is less missing data.
Jarrod Hadfield [email protected] + Michael Morrissey
Forms a tensor of sample (mixed) central moments
Ptensor(x, k)
Ptensor(x, k)
x |
matrix; traits in columns samples in rows |
k |
kth central moment |
tensor
Jarrod Hadfield [email protected]
n<-1000 y<-matrix(rnorm(n), n/2, 2) Ptensor(y,2) cov(y)*((n-1)/n)
n<-1000 y<-matrix(rnorm(n), n/2, 2) Ptensor(y,2) cov(y)*((n-1)/n)
Random Generation of MVN Breeding Values and Phylogenetic Effects
rbv(pedigree, G, nodes="ALL", scale=TRUE, ggroups=NULL, gmeans=NULL)
rbv(pedigree, G, nodes="ALL", scale=TRUE, ggroups=NULL, gmeans=NULL)
pedigree |
ordered pedigree with 3 columns id, dam and sire or a
|
G |
(co)variance matrix |
nodes |
effects for pedigree/phylogeny nodes to be returned. The default,
|
scale |
logical: should a phylogeny (needs to be ultrametric) be scaled to unit length (distance from root to tip)? |
ggroups |
optional; vector of genetic groups |
gmeans |
matrix of mean breeding value for genetic groups (rows) by traits (columns) |
matrix of breeding values/phylogenetic effects
Jarrod Hadfield [email protected]
data(bird.families) bv<-rbv(bird.families, diag(2))
data(bird.families) bv<-rbv(bird.families, diag(2))
residuals
method for class "MCMCglmm"
.
## S3 method for class 'MCMCglmm' residuals(object, type = c("deviance", "pearson", "working", "response", "partial"), ...)
## S3 method for class 'MCMCglmm' residuals(object, type = c("deviance", "pearson", "working", "response", "partial"), ...)
object |
an object of class |
type |
the type of residuals which should be returned. The alternatives are: |
... |
Further arguments to be passed |
vector of residuals
Jarrod Hadfield [email protected]
Samples from the inverse Wishart distribution, with the possibility of conditioning on a diagonal submatrix
rIW(V, nu, fix=NULL, n=1, CM=NULL)
rIW(V, nu, fix=NULL, n=1, CM=NULL)
V |
Expected (co)varaince matrix as |
nu |
degrees of freedom |
fix |
optional integer indexing the partition to be conditioned on |
n |
integer: number of samples to be drawn |
CM |
matrix: optional matrix to condition on. If not given, and |
If is a draw from the inverse Wishart,
fix
indexes the diagonal element of which partitions
into 4 submatrices.
fix
indexes the upper left corner of the lower
diagonal matrix and it is this matrix that is conditioned on.
For example partioning such that
fix indexes the upper left corner of . If
CM!=NULL
then is fixed at
CM
, otherwise is fixed at
. For example, if
dim(V)
=4 and fix=2
then is a 1X1 matrix and
is a 3X3 matrix.
if n
= 1 a matrix equal in dimension to V
, if n
>1 a
matrix of dimension n
x length(V)
In versions of MCMCglmm >1.10 the arguments to rIW
have changed so that they are more intuitive in the context of MCMCglmm
. Following the notation of Wikipedia (https://en.wikipedia.org/wiki/Inverse-Wishart_distribution) the inverse scale matrix . In earlier versions of MCMCglmm (<1.11)
. Although the old parameterisation is consistent with the
riwish
function in MCMCpack and the rwishart
function in bayesm it is inconsistent with the prior definition for MCMCglmm
. The following pieces of code are sampling from the same distributions:
riwish(nu, nu*V) |
from MCMCpack |
rwishart(nu, solve(nu*V))$IW |
from bayesm |
rIW(nu, solve(nu*V)) |
from MCMCglmm <1.11 |
rIW(V, nu) |
from MCMCglmm >=1.11 |
Jarrod Hadfield [email protected]
Korsgaard, I.R. et. al. 1999 Genetics Selection Evolution 31 (2) 177:181
nu<-10 V<-diag(4) rIW(V, nu, fix=2)
nu<-10 V<-diag(4) rIW(V, nu, fix=2)
Samples from the Truncated Conditional Normal Distribution
rtcmvnorm(n = 1, mean = 0, V = 1, x=0, keep=1, lower = -Inf, upper = Inf)
rtcmvnorm(n = 1, mean = 0, V = 1, x=0, keep=1, lower = -Inf, upper = Inf)
n |
integer: number of samples to be drawn |
mean |
vector of means |
V |
covariance matrix |
x |
vector of observations to condition on |
keep |
element of x to be sampled |
lower |
left truncation point |
upper |
right truncation point |
vector
Jarrod Hadfield [email protected]
par(mfrow=c(2,1)) V1<-cbind(c(1,0.5), c(0.5,1)) x1<-rtcmvnorm(10000, c(0,0), V=V1, c(0,2), keep=1, lower=-1, upper=1) x2<-rtnorm(10000, 0, 1, lower=-1, upper=1) plot(density(x1), main="Correlated conditioning observation") lines(density(x2), col="red") # denisties of conditional (black) and unconditional (red) distribution # when the two variables are correlated (r=0.5) V2<-diag(2) x3<-rtcmvnorm(10000, c(0,0), V=V2, c(0,2), keep=1, lower=-1, upper=1) x4<-rtnorm(10000, 0, 1, lower=-1, upper=1) plot(density(x3), main="Uncorrelated conditioning observation") lines(density(x4), col="red") # denisties of conditional (black) and unconditional (red) distribution # when the two variables are uncorrelated (r=0)
par(mfrow=c(2,1)) V1<-cbind(c(1,0.5), c(0.5,1)) x1<-rtcmvnorm(10000, c(0,0), V=V1, c(0,2), keep=1, lower=-1, upper=1) x2<-rtnorm(10000, 0, 1, lower=-1, upper=1) plot(density(x1), main="Correlated conditioning observation") lines(density(x2), col="red") # denisties of conditional (black) and unconditional (red) distribution # when the two variables are correlated (r=0.5) V2<-diag(2) x3<-rtcmvnorm(10000, c(0,0), V=V2, c(0,2), keep=1, lower=-1, upper=1) x4<-rtnorm(10000, 0, 1, lower=-1, upper=1) plot(density(x3), main="Uncorrelated conditioning observation") lines(density(x4), col="red") # denisties of conditional (black) and unconditional (red) distribution # when the two variables are uncorrelated (r=0)
Samples from the Truncated Normal Distribution
rtnorm(n = 1, mean = 0, sd = 1, lower = -Inf, upper = Inf)
rtnorm(n = 1, mean = 0, sd = 1, lower = -Inf, upper = Inf)
n |
integer: number of samples to be drawn |
mean |
vector of means |
sd |
vector of standard deviations |
lower |
left truncation point |
upper |
right truncation point |
vector
Jarrod Hadfield [email protected]
Robert, C.P. (1995) Statistics & Computing 5 121-125
hist(rtnorm(100, lower=-1, upper=1))
hist(rtnorm(100, lower=-1, upper=1))
Simulated response vectors for GLMMs fitted with MCMCglmm
## S3 method for class 'MCMCglmm' simulate(object, nsim = 1, seed = NULL, newdata=NULL, marginal = object$Random$formula, type = "response", it=NULL, posterior = "all", verbose=FALSE, ...)
## S3 method for class 'MCMCglmm' simulate(object, nsim = 1, seed = NULL, newdata=NULL, marginal = object$Random$formula, type = "response", it=NULL, posterior = "all", verbose=FALSE, ...)
object |
an object of class |
nsim |
number of response vectors to simulate. Defaults to |
seed |
Either |
newdata |
An optional data frame for which to simulate new observations |
marginal |
formula defining random effects to be maginalised |
type |
character; either "terms" (link scale) or "response" (data scale) |
it |
integer; optional, MCMC iteration on which predictions should be based |
posterior |
character; if |
verbose |
logical; if |
... |
Further arguments to be passed |
A matrix (with nsim columns) of simulated response vectors
Jarrod Hadfield [email protected]
Forms design matrix for simultaneous and recursive relationships between responses
sir(formula1=NULL, formula2=NULL, diag0=FALSE)
sir(formula1=NULL, formula2=NULL, diag0=FALSE)
formula1 |
formula |
formula2 |
formula |
diag0 |
logical: should the design matrix have zero's along the diagonal |
design matrix
Jarrod Hadfield [email protected]
fac1<-factor(sample(letters[1:3], 5, TRUE), levels=letters[1:3]) fac2<-factor(sample(letters[1:3], 5, TRUE), levels=letters[1:3]) cbind(fac1, fac2) sir(~fac1, ~fac2)
fac1<-factor(sample(letters[1:3], 5, TRUE), levels=letters[1:3]) fac2<-factor(sample(letters[1:3], 5, TRUE), levels=letters[1:3]) cbind(fac1, fac2) sir(~fac1, ~fac2)
Converts sparseMatrix to asreml's giv format: row-ordered, upper triangle sparse matrix.
sm2asreml(A=NULL, rownames=NULL)
sm2asreml(A=NULL, rownames=NULL)
A |
sparseMatrix |
rownames |
rownames of A |
data.frame: if A
was formed from a pedigree equivalent to giv format
returned by asreml.Ainverse
Jarrod Hadfield [email protected]
inverseA
data(bird.families) A<-inverseA(bird.families) Aasreml<-sm2asreml(A$Ainv, A$node.names)
data(bird.families) A<-inverseA(bird.families) Aasreml<-sm2asreml(A$Ainv, A$node.names)
Orthogonal Spline Design Matrix
spl(x, k=10, knots=NULL, type="LRTP")
spl(x, k=10, knots=NULL, type="LRTP")
x |
a numeric covariate |
k |
integer, defines knot points at the 1:k/(k+1) quantiles of x |
knots |
vector of knot points |
type |
type of spline - currently only low-rank thin-plate ("LRTP") are implemented |
Design matrix post-multiplied by the inverse square root of the penalty matrix
Jarrod Hadfield [email protected]
## Not run: x<-rnorm(100) y<-x^2+cos(x)-x+0.2*x^3+rnorm(100) plot(y~x) lines((x^2+cos(x)-x+0.2*x^3)[order(x)]~sort(x)) dat<-data.frame(y=y, x=x) m1<-MCMCglmm(y~x, random=~idv(spl(x)), data=dat, pr=TRUE, verbose=FALSE) # penalised smoother m2<-MCMCglmm(y~x+spl(x),data=dat, verbose=FALSE) # non-penalised pred1<-(cbind(m1$X,m1$Z)%*%colMeans(m1$Sol))@x pred2<-(cbind(m2$X)%*%colMeans(m2$Sol))@x lines(pred1[order(x)]~sort(x), col="red") lines(pred2[order(x)]~sort(x), col="green") m1$DIC-mean(m1$Deviance) # effective number of parameters < 13 m2$DIC-mean(m2$Deviance) # effective number of parameters ~ 13 ## End(Not run)
## Not run: x<-rnorm(100) y<-x^2+cos(x)-x+0.2*x^3+rnorm(100) plot(y~x) lines((x^2+cos(x)-x+0.2*x^3)[order(x)]~sort(x)) dat<-data.frame(y=y, x=x) m1<-MCMCglmm(y~x, random=~idv(spl(x)), data=dat, pr=TRUE, verbose=FALSE) # penalised smoother m2<-MCMCglmm(y~x+spl(x),data=dat, verbose=FALSE) # non-penalised pred1<-(cbind(m1$X,m1$Z)%*%colMeans(m1$Sol))@x pred2<-(cbind(m2$X)%*%colMeans(m2$Sol))@x lines(pred1[order(x)]~sort(x), col="red") lines(pred2[order(x)]~sort(x), col="green") m1$DIC-mean(m1$Deviance) # effective number of parameters < 13 m2$DIC-mean(m2$Deviance) # effective number of parameters ~ 13 ## End(Not run)
Horn type and genders of Soay Sheep Ovis aires
data(SShorns)
data(SShorns)
a data frame with 666 rows and 3 columns, with individual idenitifier (id
), horn type (horn
) and gender (sex
).
Clutton-Brock T., Pemberton, J. Eds. 2004 Soay Sheep: Dynamics and Selection in an Island Population
summary
method for class "MCMCglmm"
. The returned object is suitable for printing with the print.summary.MCMCglmm
method.
## S3 method for class 'MCMCglmm' summary(object, random=FALSE, ...)
## S3 method for class 'MCMCglmm' summary(object, random=FALSE, ...)
object |
an object of class |
random |
logical: should the random effects be summarised |
... |
Further arguments to be passed |
DIC |
Deviance Information Criterion |
fixed.formula |
model formula for the fixed terms |
random.formula |
model formula for the random terms |
residual.formula |
model formula for the residual terms |
solutions |
posterior mean, 95% HPD interval, MCMC p-values and effective sample size of fixed (and random) effects |
Gcovariances |
posterior mean, 95% HPD interval and effective sample size of random effect (co)variance components |
Gterms |
indexes random effect (co)variances by the component terms defined in the random formula |
Rcovariances |
posterior mean, 95% HPD interval and effective sample size of residual (co)variance components |
Rterms |
indexes residuals (co)variances by the component terms defined in the rcov formula |
csats |
chain length, burn-in and thinning interval |
cutpoints |
posterior mean, 95% HPD interval and effective sample size of cut-points from an ordinal model |
theta_scale |
posterior mean, 95% HPD interval, MCMC p-values and effective sample size of scaling parameter in theta_scale models. |
Jarrod Hadfield [email protected]
Lower/Upper triangle elements of a matrix or forms a matrix from a vector of lower/upper triangle elements
Tri2M(x, lower.tri = TRUE, reverse = TRUE, diag = TRUE)
Tri2M(x, lower.tri = TRUE, reverse = TRUE, diag = TRUE)
x |
Matrix or vector |
lower.tri |
If |
reverse |
logical: if |
diag |
logical: if |
numeric or matrix
Jarrod Hadfield [email protected]
M<-rIW(diag(3), 10) x<-Tri2M(M) x Tri2M(x, reverse=TRUE) Tri2M(x, reverse=FALSE)
M<-rIW(diag(3), 10) x<-Tri2M(M) x Tri2M(x, reverse=TRUE) Tri2M(x, reverse=FALSE)