# Peer-Review Fraud: Cite my paper or else!

### May 19, 2016 (revised in 2018)

I often serve as a peer reviewer because I value the peer-review
process. In the first few years after graduate school, reviewer
comments on my manuscripts were often the most helpful part of the
writing process. I benefited from the process, and I am willing to do
what I can to contribute to it. Reviewers are volunteers,
and their service is a critical part of academic publication. I
believe in the process and I value the system. As a result, I treat
my review assignments seriously: I always write reviews objectively
and provide constructive recommendations. I want to do my part to
keep this academic commons a sustainable endeavor.

In 2016, the reviews of two manuscripts disturbed me. The lead authors of
the two manuscripts were former students of mine. One is about the
use of a generalized propensity score method to estimate the causal
effect of nitrogen on the stream benthic community, and the other is on
statistical issues in discretizing a continuous variable when
constructing a Bayesian networks model. These two manuscripts have
nothing in common, except that I was the second author on both.
Reviewers' comments on the two manuscripts came back in the same week.
One reviewer apparently reviewed both papers. This reviewer's comments
on both papers were essentially the same, but the suggestions were
irrelevant to our work. It was clear to us that this reviewer was
sending a message: cite my papers and I will let you go.

For the Bayesian networks paper, we chose to ignore this reviewer, as
he was one of four reviewers who commented on our paper. We forwarded this
reviewer's comments on our propensity score paper to the editor, and
the paper is now published. The propensity score paper had only one
reviewer. The lead author was a student at the time and was eager to
add more publications to his resume before graduation. After
discussion, I wrote to the editor of the journal to explain our
concerns. I requested that the manuscript be considered as a new
submission and go through the review process again. Although it would
have been easy to add a sentence or two with the recommended citations, I
believe that it is important to uphold the principle. The associate
editor ignored my request for communication, so I sent the request to
the editor-in-chief. Although the editor-in-chief promised to handle the
re-review himself, he delegated the work to the same associate editor,
who in turn made sure that the paper went through repeated reviews
until it was rejected. The paper is now published in a different
journal.

I copy the reviews in question below. I hope readers will reach the
same conclusion I did. We want to publish, and we want our peers to
read and cite our work because the work is worthwhile. Abusing the
"power" of a reviewer is just as bad as cheating!

### Review of the Bayesian networks model paper:

General comments:

Overall I like the study and I feel it is fairly well written. My
two observations are about the lack of global sensitivity and
uncertainty analyses (GSUA) and a conversation about management
implications that we can extract from the model/GSUA. Note that here
with ''model'' I mean any method that use the data, yet any model
that process the data in input and produce an output. That is
useful for assessing input factor importance and interaction,
regimes, and scaling laws between model input factors and
outcomes. This differs from traditional sensitivity analysis
methods. Thus, GSUA is very useful for finding out optimal
management/design strategies. GSUA is a variance-based method for
analyzing data and models given an objective function. It is a bit
unclear how many realizations of the model have been run and how the
authors maximized prediction accuracy. Are the values of the input
factors taken to maximize predictions? GSUA (see references below)
typically assigns probability distribution functions to all model
factors and propagate those into model outputs.

In this context, that is about discretization methods for pdfs, the
impact of discretization may be small or large depending on the pdf
chosen (or suitable) for the variables; yet, the discretization may
have different results as a function of the nature of the variables
of interest as well as of the model used.

I think that independently of the model / variables used the authors
should discuss these issues in their paper and possibly postpone
further research along these lines to another paper.

Specific comments:

Variance-based methods (see Saltelli and Convertino below) are a
class of probabilistic approaches which quantify the input and
output uncertainties as probability distributions, and decompose the
output variance into parts attributable to input variables and
combinations of variables. The sensitivity of the output to an input
variable is therefore measured by the amount of variance in the
output caused by that input. Variance-based methods allow full
exploration of the input space, accounting for interactions, and
nonlinear responses. For these reasons they are widely used when it
is feasible to calculate them. Typically this calculation involves
the use of Monte Carlo methods, but since this can involve many
thousands of model runs, other methods (such as emulators) can be
used to reduce computational expense when necessary. Note that full
variance decompositions are only meaningful when the input factors
are independent from one another. If that is not the case
information theory based GSUA is necessary (see Ludtke et al. )

Thus, I really would like to see GSUA done because it (i) informs
about the dynamics of the processes investigated and (ii) is very
important for management purposes.

Convertino et al. Untangling drivers of species distributions:
Global sensitivity and uncertainty analyses of MaxEnt. Journal
Environmental Modelling & Software archive Volume 51, January, 2014
Pages 296-309

Saltelli A, Marco Ratto, Terry Andres, Francesca Campolongo, Jessica
Cariboni, Debora Gatelli, Michaela Saisana, Stefano Tarantola Global
Sensitivity Analysis: The Primer ISBN: 978-0-470-05997-5

Ludtke et al. (2007), Information-theoretic Sensitivity Analysis: a
general method for credit assignment in complex networks J. Royal
Soc. Interface

### Review of the propensity score paper:

GENERAL COMMENTS

After a careful reading of the manuscript I really like the study
and I feel it can have some impact into the theory of biodiversity
and biogeography at multiple scales. My two technical observations
are about the lack of global sensitivity and uncertainty analyses
(GSUA) and a conversation about management implications that we can
extract from the model/GSUA. Also, I think the findings can be
presented in a clearer way by focusing on (i) the universality of
findings across macro-geographical areas, (2) probabilistic
structure of the variable considered and (3) the possibility to
discuss gradual and sudden change in a non-linear theoretical
framework (tipping points and gradual change). I would strongly
suggest to talk about ''potential causal factors/relationship''
rather than talking about true causality because that is very
difficulty proven and many causality assessment methods exist
(e.g. transfer entropy, conergence cross mapping, scaling analysis,
etc.). Also, can you provide an explanation for Eq. 6? Figure 2 does
not show regressions but scaling law relationship since you plot
everything in loglog. This can be an important results, in fact I
suggest you to consider this avenue of interpretation (see
Convertino et al. 2014 but also other work or Rinaldo and
Rodriguez-Iturbe).

Note that here with ''model'' I mean any method that use the data,
yet any model that process the data in input and produce an
output. Data in fact can be thought as a model and probability
distribution functions (pdfs) can be assigned to data variables (see
Convertino et al. 2014). These pdfs can be assigned to any source of
uncertainty about a variable (e.g. changing presence / absence into
a continuous variable) and the uncertainty of outputs (e.g. species
richness) can be tested against the uncertainty of all input
variables. I believe that just considering average values is not
enough.

As for the rest I really love the paper. I suggest to also plot the
patterns in Convertino et al (2009): these are for instance the JSI
and the Regional Species Richness; in ecological terms these can be
defined as alpha, beta and gamma diversity. These patters can be
studied as a function of geomorphological patterns such as the
distance from the coat in order to find potential drivers of
diversity. These are just ideas that can be pursued further. Lastly
I wonder if the data can be made available to the community for
further studies. For all above motivations I suggest to accept the
paper only after Moderate or Major Revisions. Again, I think that
these revisions can just make better the paper.

SPECIFIC COMMENTS:

In any context, e.g. as in this paper GSUA is very important because
it given an idea of what is driving the output in term of model
input factor importance and interaction, and how that can be used
for management. GSUA is a variance-based method for analyzing data
and models given an objective function. It is a bit unclear how many
realizations of the model have been run and how the authors
maximized prediction accuracy. Are the values of the input factors
taken to maximize predictions? GSUA (see references below) typically
assigns probability distribution functions to all model factors and
propagate that into model outputs. That is useful for assessing
input factor importance and interaction, regimes, and scaling laws
between model input factors and outcomes. This differs from
traditional sensitivity analysis methods (that are even missing
here)

Variance-based methods (see Saltelli and Convertino below) are a
class of probabilistic approaches which quantify the input and
output uncertainties as probability distributions, and decompose the
output variance into parts attributable to input variables and
combinations of variables. The sensitivity of the output to an input
variable is therefore measured by the amount of variance in the
output caused by that input. Variance-based methods allow full
exploration of the input space, accounting for interactions, and
nonlinear responses. For these reasons they are widely used when it
is feasible to calculate them. Typically this calculation involves
the use of Monte Carlo methods, but since this can involve many
thousands of model runs, other methods (such as emulators) can be
used to reduce computational expense when necessary. Note that full
variance decompositions are only meaningful when the input factors
are independent from one another. If that is not the case
information theory based GSUA is necessary (see Ludtke et al. for an
information theory model of GSUA).

Thus, I really would like to see GSUA done because it (i) informs
about the dynamics of the processes investigated and (ii) is very
important for management purposes.

REFERENCES

Convertino, M. et al (2009) On neutral metacommunity patterns of
river basins at different scales of aggregation
http://www1.maths.leeds.ac.uk/~fbssaz/articles/Convertino_WRR09.pdf

Convertino, M.; Baker, K.M.; Vogel, J.T.; Lu, C.; Suedel, B.; and
Linkov, I., "Multi-criteria decision analysis to select metrics for
design and monitoring of sustainable ecosystem restorations"
(2013). US Army
Research. Paper 190. http://digitalcommons.unl.edu/usarmyresearch/190
http://digitalcommons.unl.edu/cgi/viewcontent.cgi?article=1189&context=usarmyresearch

Convertino et al. Untangling drivers of species distributions:
Global sensitivity and uncertainty analyses of MaxEnt Journal
Environmental Modelling & Software archive Volume 51, January, 2014
Pages 296-309

Saltelli A, Marco Ratto, Terry Andres, Francesca Campolongo, Jessica
Cariboni, Debora Gatelli, Michaela Saisana, Stefano Tarantola Global
Sensitivity Analysis: The Primer ISBN: 978-0-470-05997-5

Ludtke et al. (2007), Information-theoretic Sensitivity Analysis: a
general method for credit assignment in complex networks J. Royal
Soc. Interface