An Alternative Interpretation of the \( p \)-value
The likelihood principle is often used as the basis for criticizing the use of the \( p \)-value in classical statistics. Reading the recent discussions in the March issue of Ecology, I arrived at an alternative explanation of what a \( p \)-value is.
The \( p \)-value is defined as the conditional probability of observing data as extreme as or more extreme than the observed data if the “null” hypothesis is true. In a one-sample \( t \)-test problem, we are interested in testing whether the population mean is equal to a specific value \[ H_0: \mu=\mu_0. \]
The test is based on the central limit theorem, which states that the sampling distribution of the sample mean \( \bar{x} \) is approximately \( N(\mu,\sigma^2/n) \), where \( \mu \) and \( \sigma \) are the population mean and standard deviation and \( n \) is the sample size. In hypothesis testing, we compare the sample mean \( \bar{x} \) to the sampling distribution under the null hypothesis, \( N(\mu_0,\sigma^2/n) \).
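To make this setup concrete, here is a minimal simulation sketch in Python; the skewed (exponential) population, the sample size, and the seed are made up for illustration. It simply checks that the means of repeated samples behave approximately like \( N(\mu, \sigma^2/n) \), as the central limit theorem says.
\begin{verbatim}
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n = 2.0, 2.0, 25   # exponential(scale=2) has mean 2 and sd 2
reps = 100_000

# each row is one sample of size n from a skewed population; take its mean
xbar = rng.exponential(scale=mu, size=(reps, n)).mean(axis=1)

print(xbar.mean())            # close to mu = 2.0
print(xbar.std(ddof=1))       # close to sigma / sqrt(n) = 0.4
\end{verbatim}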
For simplicity, we assume that \( \sigma \) is known and use a one-sided \( p \)-value. The typical definition of the \( p \)-value is the shaded area in the figure, the probability of observing a sample mean as extreme as or more extreme than \( \bar{x} \); hence the criticism that it violates the likelihood principle, because data never observed must be used to calculate the \( p \)-value.
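The calculation itself is short. The sketch below computes this one-sided tail area under the known-\( \sigma \) setup; the observed mean, null value, \( \sigma \), and \( n \) are hypothetical numbers chosen only for illustration.
\begin{verbatim}
import numpy as np
from scipy.stats import norm

xbar, mu0, sigma, n = 10.7, 10.0, 2.0, 25   # hypothetical observed mean and null values

z = (xbar - mu0) / (sigma / np.sqrt(n))     # standardized distance from mu0
p_one_sided = norm.sf(z)                    # upper-tail area: P(mean >= xbar | H0)

print(z, p_one_sided)                       # z = 1.75, p is about 0.04
\end{verbatim}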
The \( p \)-value can be interpreted as a probability, but it can also be interpreted as an indicator of the likelihood, the density value of \( \bar{x} \) under the null sampling distribution. That is, the tail area is a monotonic function of the likelihood: a small \( p \)-value corresponds to a small likelihood value and vice versa. A density value is not easy to interpret on its own, while a probability is scaled between 0 and 1 and easy to understand. Alternatively, we can measure the evidence against the null hypothesis using the distance between \( \bar{x} \) and \( \mu_0 \), \( d=|\bar{x} - \mu_0| \), which is also a monotonic function of the likelihood. Because \( \bar{x} = \mu_0 \pm d \) share the same likelihood, the probabilistic interpretation of the \( p \)-value must then be 2 times the shaded area. As long as we know the rule for translating a likelihood value into a \( p \)-value, whether it is 1 or 2 times the shaded area is irrelevant.
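The monotonicity claim is easy to check numerically. The following sketch, with made-up candidate sample means and the same hypothetical \( \mu_0 \), \( \sigma \), and \( n \) as above, shows that the distance \( d \), the density of \( \bar{x} \) under the null, and the tail area all order the candidate means the same way.
\begin{verbatim}
import numpy as np
from scipy.stats import norm

mu0, sigma, n = 10.0, 2.0, 25                 # hypothetical null value and known sigma
se = sigma / np.sqrt(n)

xbars = np.array([10.2, 10.5, 10.8, 11.1])    # hypothetical candidate sample means
d = np.abs(xbars - mu0)                       # distance from the null value
density = norm.pdf(xbars, loc=mu0, scale=se)  # likelihood of each xbar under H0
tail = norm.sf(d / se)                        # one-sided tail area (the shaded region)

# as d grows, both the density and the tail area shrink, in the same order
for row in zip(xbars, d, density, tail):
    print(row)
\end{verbatim}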
Using a \( p \)-value, we interpret the “evidence” against the hypothesis in terms of a probability, and a binary decision rule is then easily justifiable (no matter how arbitrary). I would argue that the \( p \)-value itself does not violate the likelihood principle. It can be seen as an easy-to-understand indicator of the likelihood (of the observed sample mean being a random draw from the sampling distribution defined by the null hypothesis). It is the literal interpretation of the indicator that introduces confusion.
Many years ago, when I took my first statistics course, we had to find the \( p \)-value in the standard normal distribution table in the back of the text. That is, we calculated \( z=\frac{\bar{x}-\mu_0}{\sigma/\sqrt{n}} \) and looked up the \( p \)-value in a table. Could using the \( p \)-value be simply a computational shortcut for the likelihood? After all, the likelihood is a monotonic function of the \( p \)-value.
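As a small illustration of that last point, the sketch below (again with hypothetical numbers) recovers the likelihood of \( \bar{x} \) under the null from the \( p \)-value alone: the tail area is inverted back to \( |z| \), and the density follows from it.
\begin{verbatim}
import numpy as np
from scipy.stats import norm

xbar, mu0, sigma, n = 10.7, 10.0, 2.0, 25       # hypothetical values, as before
se = sigma / np.sqrt(n)

z = (xbar - mu0) / se
p = norm.sf(z)                                  # the table-lookup step
likelihood = norm.pdf(xbar, loc=mu0, scale=se)  # density of xbar under H0

# invert the tail area back to |z| and recover the same likelihood from p alone
z_back = norm.isf(p)
likelihood_back = norm.pdf(z_back) / se

print(likelihood, likelihood_back)              # the two agree
\end{verbatim}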