Friday, December 18, 2015
Explaining Science to the Media
The memory of the 2014 water crisis is still fresh in the minds of many people in the Toledo, OH area. Anything related to harmful algal blooms in Lake Erie will make the news one way or another. Yesterday, I was asked to explain the research published in this paper to two local TV channels. In my mind, the goal of the work was to use a better statistical method to reduce measurement uncertainty. To introduce the work without touching the term Bayesian statistics, I talked about how decisions under uncertainty often result in a lack of confidence in the final choice. That lack of confidence is often the reason for less effective communication between the decision maker and the public. In this case, uncertainty made explaining the "Do Not Drink" order very difficult. The difficulty, in turn, led to a lack of communication between the city and the public, resulting in public anxiety and second-guessing about the order later. When implemented, our method can result in a more confident decision, thereby helping the city better communicate with the public. The final cuts from both TV stations presented the problem with a single question: did the City of Toledo make the right call in issuing the "Do Not Drink" order? They effectively conveyed my message without mentioning terms like "risk communication" or the reasoning behind my answer to the question. Our training in scientific writing does not help us explain science to the public. I have a lot to learn from reporters. By the way, the newscast seems to show that I know how to operate the ELISA test. This was, in fact, the first time I had touched an ELISA kit.
Tuesday, November 24, 2015
Uncertainty in measured microcystin concentrations using ELISA
We published a paper on the uncertainty in microcystin concentrations measured with the commonly used method known as ELISA. Microcystin is a group of toxins associated with blooms of cyanobacteria. One high concentration detected in a drinking water sample in Toledo in 2014 resulted in a "do not drink" advisory that affected about half a million people in the Toledo area. In the paper we discussed the high level of uncertainty associated with the estimated concentrations and provided a Bayesian Hierarchical Modeling (BHM) approach for reducing the uncertainty. I have uploaded the data used in that paper to GitHub.
Thursday, October 29, 2015
Results or Methods
In a recent paper, my co-author and I discussed the use of statistical causal analysis (propensity score matching) for analyzing observational data. The example estimates the effect of water and soil conservation practices on controlling nutrient loss from farm fields. The data were observational: a collection of measured P and N losses from various field-level studies. My interest in the subject started in 2009 after I completed this book, while thinking about whether the topic of causal analysis should be included in a future edition. When a colleague shared a dataset collected by USDA, I decided to study causal analysis and use it in my class.
The effect of conservation practices on nutrient loss has been a topic of agricultural studies for a long time, but the method used in various studies is inevitably modeling. These models were never properly calibrated, and the basic input to them is the nutrient yield. In many cases, researchers simply assume a fixed rate of reduction in nutrient yield and plug that rate into the model to calculate the total reduction. I have not seen a study properly document the effect of conservation practices using a randomized experiment. After discussing with colleagues, I realized that a randomized experiment is practically impossible. However, the decision to implement conservation practices is often based on whether a field is prone to soil and water loss. As a result, if we combine data from many studies and compare the nutrient loss from fields with and without conservation practices, we often find that fields with conservation practices have larger nutrient losses. But we are often comparing fields with row crops (with conservation practices) to pastures (without conservation practices). In one published paper, the authors were puzzled by the result of such a comparison.
I worked on a dataset consisting of measurements from about 160 papers with field-scale measurements of nutrient loss, fertilizer application rates and methods, and other routinely measured variables (crops, best management practices, conservation practices). Using the dataset as an example, I taught statistical causal analysis in 2011, 2013, and 2014 in my graduate-level statistics classes, and I wrote the paper based on how students responded to the materials. I found that the concept of a confounding factor is often new to students, which is not surprising, as most students don't have a good conceptual understanding of statistics. I included a long background subsection in the Introduction.
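To make the confounding story concrete, here is a minimal sketch of propensity score matching on simulated field data. This is a Python illustration, not the paper's actual R analysis; the "erosion" confounder, coefficients, and the true effect of -2 are all invented for the sketch.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500

# Hypothetical confounder: how erosion-prone a field is. Erosion-prone
# fields are more likely to receive a conservation practice AND tend to
# lose more nutrients -- the confounding described above.
erosion = rng.normal(0.0, 1.0, n)
treated = rng.binomial(1, 1 / (1 + np.exp(-0.8 * erosion)))
# Invented truth: the practice reduces P loss by 2 units.
p_loss = 5 + 3 * erosion - 2 * treated + rng.normal(0.0, 1.0, n)

# A naive comparison of group means is biased by the confounder.
naive = p_loss[treated == 1].mean() - p_loss[treated == 0].mean()

# Step 1: estimate propensity scores by logistic regression
# (plain gradient ascent here; any GLM routine would do).
X = np.column_stack([np.ones(n), erosion])
beta = np.zeros(2)
for _ in range(5000):
    ps = 1 / (1 + np.exp(-X @ beta))
    beta += 0.5 * X.T @ (treated - ps) / n
ps = 1 / (1 + np.exp(-X @ beta))

# Step 2: match each treated field to the control field with the
# nearest propensity score (with replacement).
t_idx = np.where(treated == 1)[0]
c_idx = np.where(treated == 0)[0]
matches = c_idx[np.abs(ps[t_idx, None] - ps[c_idx]).argmin(axis=1)]

# Step 3: the mean within-pair difference estimates the effect of the
# practice on the treated fields.
att = (p_loss[t_idx] - p_loss[matches]).mean()
print(f"naive: {naive:.2f}  matched: {att:.2f}")
```

The naive comparison is biased upward because erosion-prone fields are both more likely to be treated and more likely to lose nutrients; matching on the estimated propensity score should recover an estimate near the invented true effect of -2.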
From a statistical perspective, the methods section on the propensity score matching method can be described in general terms. The results, in my mind, include not only the estimated effects but also the additional factors to be controlled. In other words, the derived statistical model and its interpretation should be part of the results section. One reviewer and the associate editor insisted that we move the description of the model to the methods section. They recommended rejection because of this "organization" problem.
I explained in my response that the model is a result of the causal analysis process. By describing it as a result, I emphasize the process of finding the appropriate model (hence the model is part of the results). By presenting the model in the methods section, we would give the impression that causal analysis is a simple application of another statistical procedure. The editor eventually decided on a compromise: we presented more in the methods section, and I added more in the results on the process of deriving the appropriate model.
In my opinion, the usual structure of Introduction -- Methods -- Results -- Discussion (IMRD) is not always effective. My experience shows that the structure can be a hindrance to presenting the process of doing research. In two recent papers I published in Environmental Science and Technology, I deviated from IMRD to include background information. When conducting research, we often have to change our approach or abandon our initial hypothesis. After all, research is a learning process. The IMRD structure encourages us to avoid discussing that process, which can be hazardous when presenting statistical modeling.
Thursday, May 14, 2015
Some Simple Statistics in Clean Water Act Compliance Assessment
Last month, we published three articles discussing statistical methods related to the Clean Water Act (CWA).
A Continuous Bayesian Networks (cBN) Model
In a paper that appeared in Environmental Modelling and Software, we proposed a Bayesian networks model using continuous variables (cBN). The model combines the Gibbs sampler and empirical models through a graphical model representing the hypothesized causal links among relevant variables. When applied to data collected for developing nutrient criteria for Ohio's small rivers and streams, we found that the concept of a single nutrient criterion for the entire state is impractical. In many cases, nutrient is not the primary factor affecting a stream's ecological condition; something else (e.g., habitat quality) may be more important. As a result, we revised the manuscript three times to make it clear that we are not comfortable with the idea of a single nutrient criterion. A regional or even water-specific criterion is necessary. This argument should be a no-brainer: no two rivers are exactly the same, and the effect of nutrients on stream ecosystems will inevitably differ as well, hence different nutrient criteria are needed. In the process of revising the manuscript, we noticed an interesting problem in EPA's recommendation on how to establish a nutrient criterion using reference conditions. The approach takes the following steps:
- Select streams that are largely not affected by human activities (good luck)
- Collect nutrient concentration data from these "reference" streams and calculate a median for each stream
- Pool these median values together to form the reference distribution
- Set the nutrient criterion at the 75th percentile of the reference distribution
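In code, the recipe above boils down to two quantile computations. A minimal sketch with simulated lognormal concentrations (the 30 streams, 20 samples each, and the lognormal parameters are all invented):

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated total-P samples (mg/L) from 30 hypothetical reference
# streams, each with its own lognormal concentration distribution.
ref_samples = [rng.lognormal(mean=rng.normal(-3.0, 0.5), sigma=0.6, size=20)
               for _ in range(30)]

# Steps 2 and 3: one median per stream, pooled across streams.
medians = np.array([np.median(s) for s in ref_samples])

# Step 4: the criterion is the 75th percentile of the pooled medians.
criterion = np.percentile(medians, 75)
print(f"nutrient criterion: {criterion:.4f} mg/L")
```

The point to notice is that the criterion is a quantile of the between-stream distribution of medians, while compliance assessment later compares within-stream samples to it.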
When the established nutrient criterion is used in CWA compliance assessment, data from an individual stream are compared to the criterion, so only the within-stream variance is of concern. What is the implication of this bait and switch?
At the end of the review process, we ran into a very unfortunate mistake on the journal's side. The journal's editorial office ran a plagiarism checker on the manuscript and found a "substantial" amount of text copied from the Internet. The editor rejected the manuscript. The copied text turned out to be several R functions I wrote for my textbook, which I had included in the supporting document. When the book was translated into Japanese in 2010, the Japanese publisher posted all my code online. The editor promptly corrected the mistake, but I had to resubmit the manuscript as a new one, which is why the paper shows as accepted on the same day it was submitted.
The frequency component of a water quality standard
In a paper that appeared in the journal Environmental Management, we revisited EPA documents on the three components of a water quality standard (or criterion): magnitude, duration, and frequency. Based on our reading of documents as far back as 1972, we found that the interpretation of magnitude and duration is unambiguous. However, we have not found a convincing discussion of the frequency component. Based on a 1985 EPA document, we believe the frequency component was added to provide a margin of safety. Using this interpretation, we formulated a probabilistic approach to derive the frequency necessary to maintain a consistent level of confidence.
An interesting by-product of writing this paper is our new interpretation of the 10 percent rule (the raw data approach) discussed in Smith et al. (2001). In that paper, the authors interpreted a 1997 EPA rule that stipulates how to declare a water in compliance with a numeric water quality standard. The rule is known as the 10 percent rule because it declares a water out of compliance when more than 10% of the data exceed the numeric standard. Smith et al. (2001) argued that the 10% of data should be interpreted as 10% of the time; therefore, the rule requires the 0.9 quantile of the underlying concentration distribution to be below the numeric standard. They proposed statistical hypothesis testing, and as a result of the paper, nearly all states in the US now use hypothesis testing.
Through reviewing EPA's interpretation of the CWA, we believe that the legal definition of a numeric standard is intended for the population mean, not the 0.9 quantile; the 10 percent rule is simply a margin of safety. When we compare the 0.9 quantile of the pollutant concentration distribution to the standard, instead of the mean, we set a very high bar. But is it necessary?
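The gap between the raw-data reading of the 10 percent rule and the hypothesis-testing reading is easy to see numerically. A sketch with an invented monitoring record (50 samples, 9 of which exceed the numeric standard) and a one-sided binomial test:

```python
from math import comb

# Invented monitoring record: 50 samples, 9 exceed the numeric standard.
n, k, p0 = 50, 9, 0.10

# Raw-data reading of the 10 percent rule: 9/50 = 18% > 10%, a violation.
raw_violation = k / n > p0

# Hypothesis-testing reading (after Smith et al. 2001): test
# H0: exceedance probability <= 0.1 (i.e., the 0.9 quantile is below
# the standard) with a one-sided binomial test, P(X >= k | p = 0.1).
p_value = sum(comb(n, x) * p0**x * (1 - p0) ** (n - x)
              for x in range(k, n + 1))
print(raw_violation, round(p_value, 3))
```

With these invented numbers, the raw-data approach declares a violation, while the binomial p-value of roughly 0.06 would not reject the null hypothesis at the 5% level. The two readings of the same rule can disagree on the same record.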
Implications of Stein's paradox
This paper was published in Environmental Science and Technology. It is a paper we started (technically) in 1991, when we read Efron and Morris (1977) in a study group. Although I did not quite understand the mathematics (or the baseball example, because I had not yet seen a baseball game), the conclusion that the sample average is not the best estimator when multiple means are estimated simultaneously left a strong impression on me. Over the next 20 years, we talked about the paper from time to time. In 2006, while reading Gelman's 2005 Bayesian ANOVA paper, I finally made the connection and went back to read the 1977 paper again. (I guess we should definitely encourage our graduate students to form reading groups and read regularly.)
Stein's paradox suggests that CWA compliance assessment can benefit when we pool data from similar waters together and apply a shrinkage estimator. In the paper, we documented the benefit of a shrinkage estimator. Writing this paper also allowed me to think more about the idea of a prior distribution in Bayesian inference. I often have doubts about the subjective interpretation of the prior, largely because of Daniel Kahneman's work on how we don't think in terms of probability. Kahneman's work suggests that eliciting a prior distribution from a human expert is a risky business. Using the empirical Bayes interpretation of the James-Stein estimator, we see an alternative interpretation of a prior distribution: a prior should be the distribution of the parameter of interest at a higher level of aggregation. In our example, when using the Bayes estimator to estimate the mean concentration of a water, the prior of the mean should be the distribution of the means of similar waters. With this interpretation, we can give the prior a physical meaning, which makes the process of establishing an informative prior distribution easier.
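The pooling idea can be sketched in a few lines. Below, a James-Stein estimator shrinks each water's sample mean toward the grand mean of similar waters. This is a simulation sketch, not the paper's analysis: the 25 waters, 12 samples each, known sampling variance, and the between-water normal(3.0, 0.5) distribution are all invented.

```python
import numpy as np

rng = np.random.default_rng(7)
p, m, sigma = 25, 12, 1.0      # 25 similar waters, 12 samples each

# Invented truth: each water's mean concentration is drawn from a
# common between-water distribution -- the "prior" with a physical
# meaning discussed above.
theta = rng.normal(3.0, 0.5, p)
y = theta[:, None] + rng.normal(0.0, sigma, (p, m))

ybar = y.mean(axis=1)          # the usual per-water sample means
se2 = sigma**2 / m             # sampling variance of each mean (known here)

# James-Stein: shrink every sample mean toward the grand mean.
grand = ybar.mean()
s = ((ybar - grand) ** 2).sum()
shrink = max(0.0, 1 - (p - 3) * se2 / s)
js = grand + shrink * (ybar - grand)

# Compare total squared error against the (known) true means.
mse_raw = ((ybar - theta) ** 2).sum()
mse_js = ((js - theta) ** 2).sum()
print(f"sample means: {mse_raw:.2f}  James-Stein: {mse_js:.2f}")
```

The shrinkage factor is driven by the ratio of within-water sampling variance to between-water spread, which is exactly where the empirical Bayes interpretation enters: the between-water distribution of true means plays the role of the prior.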