A Theory of Statistical Inference for Ensuring the Robustness of Scientific Results

dc.contributor.advisor

Mukherjee, Sayan

dc.contributor.author

Coker, Beau

dc.date.accessioned

2018-05-31T21:18:46Z

dc.date.available

2018-11-15T09:17:13Z

dc.date.issued

2018

dc.department

Statistical Science

dc.description.abstract

Inference is the process of using facts we know to learn about facts we do not know. A theory of inference gives assumptions necessary to get from the former to the latter, along with a definition for and summary of the resulting uncertainty. Any one theory of inference is neither right nor wrong, but merely an axiom that may or may not be useful. Each of the many diverse theories of inference can be valuable for certain applications. However, no existing theory of inference addresses the tendency to choose, from the range of plausible data analysis specifications consistent with prior evidence, those that inadvertently favor one's own hypotheses. Since the biases from these choices are a growing concern across scientific fields, and in a sense the reason the scientific community was invented in the first place, we introduce a new theory of inference designed to address this critical problem. From this theory, we derive "hacking intervals," which are the range of a summary statistic one may obtain given a class of possible endogenous manipulations of the data. They make no appeal to hypothetical data sets drawn from imaginary superpopulations. A scientific result with a small hacking interval is more robust to researcher manipulation than one with a larger interval, and is often easier to interpret than a classical confidence interval. Hacking intervals turn out to be equivalent to classical confidence intervals under the linear regression model, and are equivalent to profile likelihood confidence intervals under certain other conditions, which means they may sometimes provide a more intuitive and potentially more useful interpretation of classical intervals.
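
The idea of a hacking interval as the range of a summary statistic over a class of manipulations can be illustrated with a minimal sketch. This is not the thesis's own method or code: the function name `hacking_interval`, the choice of summary statistic (a difference in group means), the manipulation class (dropping at most `max_dropped` observations, as if excluding "outliers"), and the toy data are all assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean


def hacking_interval(treated, control, max_dropped=1):
    """Return (min, max) of the difference in means over every way of
    dropping at most `max_dropped` observations in total.

    The manipulation class here is a hypothetical stand-in chosen for
    illustration; it is not taken from the thesis.
    """
    # Tag each observation with its group so any of them can be dropped.
    labeled = [("t", v) for v in treated] + [("c", v) for v in control]
    estimates = []
    for k in range(max_dropped + 1):
        for dropped in combinations(range(len(labeled)), k):
            kept = [obs for i, obs in enumerate(labeled) if i not in dropped]
            t = [v for g, v in kept if g == "t"]
            c = [v for g, v in kept if g == "c"]
            # Skip manipulations that would empty a group entirely.
            if t and c:
                estimates.append(mean(t) - mean(c))
    return min(estimates), max(estimates)


# Toy data, invented for this example only.
treated = [5.1, 6.0, 5.5, 7.2]
control = [4.8, 5.0, 5.3, 4.9]
low, high = hacking_interval(treated, control, max_dropped=1)
print(f"hacking interval for the mean difference: [{low:.3f}, {high:.3f}]")
```

Under these assumptions, a narrow interval would suggest the estimate changes little across the allowed manipulations, while a wide one would suggest that a motivated analyst could move it substantially. Exhaustive enumeration is used only because the toy data set is tiny.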

dc.identifier.uri

https://hdl.handle.net/10161/17039

dc.subject

Statistics

dc.subject

Causal inference

dc.subject

Machine learning

dc.subject

p-hacking

dc.title

A Theory of Statistical Inference for Ensuring the Robustness of Scientific Results

dc.type

Master's thesis

duke.embargo.months

5

Files

Original bundle

Name:
Coker_duke_0066N_14626.pdf
Size:
1.98 MB
Format:
Adobe Portable Document Format
