Tuesday, June 12, 2012

Variance Components in Multilevel Models

(User script TeX the World should be used to view mathematical expressions in this post.)

The term "variance components" can have different meanings. In a one-way ANOVA problem, we partition the total variance (or sum of squares) into between- and within-group variance (or sum of squares). The term is used either for the between- and within-group variances (sums of squares divided by their respective degrees of freedom) or for the sums of squares themselves. Using a model-based (Bayesian) expression, ANOVA is a hierarchical model with
[;y_{ij} \sim N(\theta_{j[i]}, \sigma_y^2);]
at the data level, and
[;\theta_j \sim N(\mu, \sigma_g^2);]
at the group level, where [;\theta_j;] is the mean of group [;j;], [;\mu;] is the overall mean, [;\sigma_y^2;] is the within-group variance, and [;\sigma_g^2;] is the between-group variance. When data are from a balanced design, [;Var(y) = \sigma_y^2+\sigma_g^2;].
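The decomposition can be checked numerically. The following is a minimal sketch, not taken from any of the cited papers: it simulates a balanced one-way design in which group means are drawn from a normal distribution with variance [;\sigma_g^2;] and observations are drawn around their group mean with variance [;\sigma_y^2;], then recovers both components with the usual method-of-moments (mean squares) estimates. The function names and the parameter values are hypothetical choices made for illustration.

```python
import random
import statistics


def simulate_one_way(n_groups, n_per_group, mu, sigma_g, sigma_y, seed=0):
    """Simulate a balanced one-way layout: one list of observations per group."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_groups):
        group_mean = rng.gauss(mu, sigma_g)  # group level
        data.append([rng.gauss(group_mean, sigma_y) for _ in range(n_per_group)])  # data level
    return data


def anova_variance_components(data):
    """Method-of-moments estimates (sigma_g^2, sigma_y^2) for a balanced design."""
    n = len(data[0])  # observations per group (assumed equal across groups)
    group_means = [statistics.fmean(g) for g in data]
    # Within-group mean square: average of the per-group sample variances.
    ms_within = statistics.fmean([statistics.variance(g) for g in data])
    # Between-group mean square: n times the sample variance of the group means.
    ms_between = n * statistics.variance(group_means)
    # E[MS_between] = sigma_y^2 + n * sigma_g^2, so solve for sigma_g^2.
    sigma_g2 = max((ms_between - ms_within) / n, 0.0)
    return sigma_g2, ms_within


# Illustrative values only: sigma_g^2 = 4 and sigma_y^2 = 1.
data = simulate_one_way(n_groups=2000, n_per_group=10, mu=5.0, sigma_g=2.0, sigma_y=1.0)
sigma_g2_hat, sigma_y2_hat = anova_variance_components(data)
total_var = statistics.variance([y for g in data for y in g])

print(f"between-group variance: {sigma_g2_hat:.3f}")
print(f"within-group variance:  {sigma_y2_hat:.3f}")
print(f"sum of components:      {sigma_g2_hat + sigma_y2_hat:.3f}")
print(f"total variance of y:    {total_var:.3f}")
```

With many groups, the sum of the two estimated components should be close to the sample variance of all observations, which is the balanced-design identity stated above.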

In the mixed-effects model literature, the term variance components is better explained using Bayesian notation. Suppose that we have a linear mixed-effects model. At the data level:
[;y_{ij} \sim N(\alpha_{j[i]}+\beta_{j[i]} x_{ij}, \sigma_y^2),;]
and at the group level
[;\left (\begin{array}{c}\alpha_j\\ \beta_j\end{array}\right ) \sim N\left [ \left (\begin{array}{c}
\mu_{\alpha}\\ \mu_{\beta}\end{array}\right ),\Sigma \right ];]
where [;\Sigma=\left (\begin{array}{cc} \sigma_{\alpha}^2 & \rho\sigma_{\alpha}\sigma_{\beta}\\
\rho\sigma_{\alpha}\sigma_{\beta} & \sigma_{\beta}^2\end{array}\right );]
In classical literature (and software), variance components are often the estimated [;\sigma^2;]s: [;\sigma_y^2;], [;\sigma_{\alpha}^2;], and [;\sigma_{\beta}^2;].

In the two Ecology papers on multilevel models (Qian and Shen, 2007; Qian et al., 2010), I used the ANOVA notion of variance components, i.e., the fraction of total variance in the response due to factor g. The simple linear multilevel model above is fitted using MCMC, and the varying model coefficients are divided into an overall mean (fixed effect) and group effects (random effects):
[;y_{ij} = (\mu_{\alpha}+\delta_{\alpha_{j[i]}})+(\mu_{\beta}+\delta_{\beta_{j[i]}}) x_{ij} +\varepsilon.;]
The total variance in the response variable [;y_{ij};] is partitioned into the variances due to group [;\left ( Var(\delta_{\alpha_j})\right );], the predictor [;\left ( Var(\mu_{\beta}\times x_{ij})\right );], the predictor-group interaction [;\left ( Var(\delta_{\beta_j}\times x_{ij})\right );], and residuals [;\left (Var(\varepsilon)\right );]. This is what is calculated in Gelman and Hill (2007) to produce the ANOVA table-like variance components figures. When comparing these variance components to the results from the R function lmer (from package lme4), [;Var(\delta_{\alpha_j});] should be the same as (or very close to) [;\sigma_{\alpha}^2;], and [;Var(\delta_{\beta_j});] should be very close to [;\sigma_{\beta}^2;].
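The partition can be sketched in a few lines of code. This is an illustration under stated assumptions, not the code used in the papers: in practice the group effects and residuals would come from MCMC output, whereas here they are simulated with known (made-up) standard deviations and with the correlation [;\rho;] set to zero. The function name `partition_variance` and all parameter values are hypothetical.

```python
import random
import statistics


def partition_variance(x, group, mu_beta, delta_alpha, delta_beta, resid):
    """Variance of each additive term of the model, taken over all observations.

    x           -- predictor value for each observation
    group       -- group index j[i] for each observation
    mu_beta     -- overall (fixed-effect) slope
    delta_alpha -- per-group intercept deviations
    delta_beta  -- per-group slope deviations
    resid       -- residual for each observation
    """
    terms = {
        "group": [delta_alpha[j] for j in group],
        "predictor": [mu_beta * xi for xi in x],
        "interaction": [delta_beta[j] * xi for j, xi in zip(group, x)],
        "residual": list(resid),
    }
    return {name: statistics.pvariance(values) for name, values in terms.items()}


# Simulated stand-ins for quantities that would normally be estimated (assumed values).
rng = random.Random(1)
n_groups, n_per_group = 50, 40
mu_beta, sigma_alpha, sigma_beta, sigma_y = 1.5, 1.0, 0.5, 0.8

delta_alpha = [rng.gauss(0, sigma_alpha) for _ in range(n_groups)]
delta_beta = [rng.gauss(0, sigma_beta) for _ in range(n_groups)]
group = [j for j in range(n_groups) for _ in range(n_per_group)]
x = [rng.gauss(0, 1) for _ in group]
resid = [rng.gauss(0, sigma_y) for _ in group]

components = partition_variance(x, group, mu_beta, delta_alpha, delta_beta, resid)
total = sum(components.values())
for name, value in components.items():
    print(f"{name:12s} {value:.3f}  ({value / total:.1%} of the summed components)")
```

The shares are reported relative to the sum of the four components. That sum need not equal the sample variance of [;y_{ij};] exactly, because the terms can be correlated in any finite sample.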
