Saturday, May 26, 2012

Threshold, Change Point, and Other Practical Issues

(User script TeX the World should be used to properly view mathematical expressions in this post.)

Discussions with several colleagues at the SFS 2012 Conference reminded me that there are several conceptual and technical issues related to the threshold. Clear definitions of many concepts are needed.
First, we often confuse threshold and change point. A change point is a mathematical concept. It is the point along the x-axis where the response curve shows a discontinuity. A threshold is an ecological or management concept. It is the point along a gradient where the response value reaches a critical level. Figure 1 shows the difference between the two concepts.

Figure 1.

In Figure 1, the y-axis is a hypothetical response variable of interest. If human health concerns dictate that Y should be kept above Ym1, we must set the standard for X at Xm1. If ecological concerns require only that Y be above Ym2, we can set the standard for X at Xm2. The mathematical change point Xcp is of no concern, unless it coincides with Xm1 or Xm2. Figure 1 suggests that we should focus our attention on finding the underlying model describing the dependency of Y on X. This relationship can be a hockey stick model or otherwise.
Second, when estimating a change point, we must explicitly state the underlying model. Our Ecological Indicators paper (To threshold or not to threshold? That's the question) discussed this issue in detail. The model-specific nature of a change point problem is not common knowledge in our field.
Third, a change point model makes strong assumptions about the behavior of the data. With many sources of uncertainty, it is rarely possible to distinguish an abrupt change model (a change point model) from its continuous counterpart. Figure 2 shows two examples.
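To make the distinction concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that the underlying model is a hockey stick that is flat up to the change point and then declines linearly. The function names and all numbers (b0, b1, the critical levels) are hypothetical and are not taken from Figure 1.

```python
def hockey_stick(x, b0, b1, xcp):
    """Response is flat at b0 up to the change point xcp, then linear with slope b1."""
    return b0 + b1 * max(0.0, x - xcp)


def threshold_for(y_crit, b0, b1, xcp):
    """Return the X at which the hockey stick reaches the critical level y_crit.

    Assumes a declining response (b1 < 0) and a critical level below the
    plateau (y_crit < b0); otherwise the level is never reached by decline.
    """
    if b1 >= 0 or y_crit >= b0:
        raise ValueError("need b1 < 0 and y_crit < b0 for a declining response")
    return xcp + (y_crit - b0) / b1


# Hypothetical parameter values, chosen only to make the arithmetic easy.
B0, B1, XCP = 10.0, -2.0, 3.0

xm1 = threshold_for(8.0, B0, B1, XCP)  # stricter (human health) critical level
xm2 = threshold_for(4.0, B0, B1, XCP)  # looser (ecological) critical level

print(xm1, xm2)  # 4.0 6.0
```

Under these assumed values the change point sits at 3, while the two thresholds fall at 4 and 6. The thresholds depend on the critical level chosen for Y; the change point does not.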

Figure 2.

In Figure 2, I show a step function and a hockey stick model. In both cases, I contrast the two change point (abrupt change) models (the dashed lines) with their continuous counterparts (the shaded lines). If the continuous model is the appropriate model, we are interested in estimating [;\phi-\gamma;]. But if the change point model is used instead, we will likely end up with [;\phi;]. Unfortunately, statistical tests are not currently available for distinguishing these two types of models (Chiu et al., 2006). The "bent-cable" model in Grace Chiu's paper is the one shown in the right panel of Figure 2. The hockey stick model in my Environmental and Ecological Statistics with R is essentially a bent-cable model (two line segments joined by a quadratic segment), but I fixed [;\gamma;] at a known small value (1% of the data range). The bent-cable model estimates both [;\gamma;] and [;\phi;]. I think that we should consider abandoning the hockey stick model in favor of the bent-cable model (and the step function model in favor of the doubly-bent cable model, left panel of Figure 2). Figure 2 is more of an ecological concern than a management one. If the system is moving from one steady state to a different one (step function versus the doubly-bent cable model), [;\phi-\gamma;] is the point where such change starts, while [;\phi;] is the middle of the transition period. From a management perspective, setting an environmental standard at [;\phi-\gamma;] makes more sense.
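The contrast between the hockey stick and the bent cable can be sketched in a few lines of Python. The parametrization below is the commonly used form of the bent cable, with the bend centered at phi and a half-width of gamma. The coefficient names b0, b1, b2 and all numeric values are my own illustrative assumptions, not values from Figure 2 or from the book.

```python
def bent_cable(x, b0, b1, b2, phi, gamma):
    """Two line segments joined by a quadratic bend on [phi - gamma, phi + gamma]."""
    if gamma <= 0:
        raise ValueError("gamma must be positive; gamma -> 0 gives the hockey stick")
    if x < phi - gamma:
        q = 0.0                                   # incoming linear segment
    elif x > phi + gamma:
        q = x - phi                               # outgoing linear segment
    else:
        q = (x - phi + gamma) ** 2 / (4 * gamma)  # quadratic bend
    return b0 + b1 * x + b2 * q


def hockey_stick(x, b0, b1, b2, phi):
    """Abrupt-change counterpart: the slope changes by b2 exactly at phi."""
    return b0 + b1 * x + b2 * max(0.0, x - phi)


# Hypothetical values: flat at 10, then declining with slope -2 around phi = 3.
b0, b1, b2, phi, gamma = 10.0, 0.0, -2.0, 3.0, 1.0

start = bent_cable(phi - gamma, b0, b1, b2, phi, gamma)   # change begins here
middle = bent_cable(phi, b0, b1, b2, phi, gamma)          # middle of the bend
end = bent_cable(phi + gamma, b0, b1, b2, phi, gamma)     # bend rejoins the line

print(start, middle, end)  # 10.0 9.5 8.0
print(hockey_stick(phi, b0, b1, b2, phi) - middle)  # 0.5
```

In this sketch the response is still at its plateau at phi - gamma and has already moved by |b2| * gamma / 4 by the time x reaches phi. That gap is the largest difference between the two curves, which is small when gamma is small and helps explain why noisy data rarely separate the two models.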


G. Chiu, R. Lockhart, and R. Routledge. Bent-cable regression theory and applications. Journal of the American Statistical Association, 101(474):542–553, 2006.

