Wednesday, November 2, 2011

Exchangeability

When writing chapter 10 of Environmental and Ecological Statistics with R (the use of the lme4 package to fit multilevel models), I had a difficult time explaining the exchangeability assumption, which is needed when using a multilevel model, as in the EUSE (effects of urbanization on stream ecosystems) example. I opted to skip most of the mathematical treatment of the assumption and side-stepped the issue in the examples. But I have always believed that the assumption is more than a simple mathematical device. Gelman et al. (2003) interpreted the assumption as a model for ignorance (which is very intriguing). Bernardo (1995) suggests that the exchangeability concept forces a Bayesian model ("the representation theorems for exchangeable sequences of random variables establish that any coherent analysis of the information thus modelled requires the specification of a joint probability distribution on all the parameters involved, hence forcing a Bayesian approach.").

My question has always been about the practical implications of violating the assumption when fitting a model. In revising a recent paper, I had to address a reviewer's request to explain the exchangeability assumption (which was initially buried in a citation). The second paragraph in Section 2.2 of the paper was added in response. I came to a better understanding of the assumption as a result of revising the paper. Not brushing aside something that I can't explain satisfactorily is obviously beneficial.

The exchangeability assumption is a generalization of the most commonly used assumption in classical statistics: i.i.d. The assumption that observations are independent and identically distributed random variables makes formulating a likelihood function simple and straightforward: we multiply the densities of the individual observations to form the likelihood function. Checking conformity with the assumption is, however, another issue. The assumptions (both i.i.d. and exchangeability) are tied to the model we propose. When we fit a regression model, we always check the residuals to see if they are i.i.d. normal with mean 0 and a constant variance. Non-conformity with the assumption often suggests that the wrong model was used. The i.i.d. assumption is verified after a model is fit, usually with graphs. How about the assumption that observations are exchangeable? I assume that we should check for the assumption
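
For reference, the two assumptions discussed above can be written out side by side. This is only a sketch in generic notation, not the notation of the book or the paper: the symbols y_i (observations), theta (parameters), p (density), and pi (a permutation of the indices) are my own choices. The last line is the usual form of the representation theorem for an infinite exchangeable sequence, which I take to be what the Bernardo quote refers to.

```latex
% i.i.d.: the likelihood is the product of the individual densities
L(\theta \mid y_1, \dots, y_n) = \prod_{i=1}^{n} p(y_i \mid \theta)

% Exchangeability: the joint density is unchanged by any reordering
% of the observations (\pi is any permutation of 1, \dots, n)
p(y_1, \dots, y_n) = p(y_{\pi(1)}, \dots, y_{\pi(n)})

% Representation theorem (infinite exchangeable sequence): the joint
% density is a mixture of i.i.d. models, with a distribution on \theta
p(y_1, \dots, y_n) = \int \prod_{i=1}^{n} p(y_i \mid \theta) \, p(\theta) \, d\theta
```

Written this way, i.i.d. observations are always exchangeable, because a product does not depend on the order of its factors. The converse does not hold: in the mixture form the observations are independent only conditional on theta.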
