Multiple Testing for Data with Ancillary Information


Date

2022

Abstract

In this dissertation, I develop three powerful hierarchical multiple testing methods that account for ancillary information in the data. In the first project, we develop a multiple testing framework named Distance Assisted Recursive Testing (DART). DART assumes that informative distance information is available in the data. Through rigorous proof and extensive simulations, we justify the false discovery rate (FDR) control and the sensitivity improvement of DART. As an illustration, we apply the method to a clinical trial of leukemia patients receiving hematopoietic cell transplantation to identify the gut microbiota whose abundance is affected by post-transplant care. The second project is motivated by flow cytometry analysis in immunology studies. The analysis can be cast as a statistical problem: pinpointing the regions where two density functions differ. By partitioning the sample space into small bins and testing each bin, we formulate the analysis as a multiple testing problem. We provide theoretical justification that the procedure pinpoints the regions of differential density with high sensitivity and precision. The third project is motivated by rare variant association studies. We develop a multiple testing framework named DATED (Dynamic Aggregation and Tree-Embedded testing) to pinpoint disease-associated rare-variant regions hierarchically and dynamically. To accommodate the application objective, DATED adopts a rare-variant region-level FDR weighted by the proportions of neutral rare variants. Extensive numerical simulations demonstrate the superior performance of DATED over existing methods under various scenarios. We illustrate DATED by applying it to an amyotrophic lateral sclerosis (ALS) study to identify pathogenic rare variants.
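
The idea in the second project, binning the sample space and testing each bin, can be illustrated with a minimal sketch. This is not the dissertation's procedure: it is a flat (non-hierarchical) baseline that assumes one-dimensional data, equal-width bins, an exact two-sided binomial test per bin, and the standard Benjamini-Hochberg step-up rule for FDR control. The function names `binom_two_sided_p`, `benjamini_hochberg`, and `differential_bins` are hypothetical and chosen only for this illustration.

```python
import math

def binom_two_sided_p(k, n, p=0.5):
    """Exact two-sided binomial p-value: total probability of all outcomes
    that are no more likely than the observed count k (requires 0 < p < 1)."""
    if n == 0:
        return 1.0  # an empty bin carries no evidence
    def log_pmf(i):
        # Log-space avoids overflow of large binomial coefficients.
        return (math.lgamma(n + 1) - math.lgamma(i + 1) - math.lgamma(n - i + 1)
                + i * math.log(p) + (n - i) * math.log(1 - p))
    pmf = [math.exp(log_pmf(i)) for i in range(n + 1)]
    cutoff = pmf[k] * (1 + 1e-9)  # small tolerance for floating-point ties
    return min(1.0, sum(q for q in pmf if q <= cutoff))

def benjamini_hochberg(pvals, alpha=0.05):
    """Sorted indices of the hypotheses rejected by the BH step-up rule."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k_max = rank  # keep the largest rank that passes its threshold
    return sorted(order[:k_max])

def differential_bins(x, y, lo, hi, n_bins, alpha=0.05):
    """Bin two non-empty samples on [lo, hi) and test each bin for a
    difference in density. Assumes all observations fall inside [lo, hi)."""
    width = (hi - lo) / n_bins
    cx, cy = [0] * n_bins, [0] * n_bins
    for sample, counts in ((x, cx), (y, cy)):
        for v in sample:
            if lo <= v < hi:
                counts[min(int((v - lo) / width), n_bins - 1)] += 1
    # If the densities agree in a bin, then given the bin total, the count
    # from x is Binomial(total, n_x / (n_x + n_y)).
    p0 = len(x) / (len(x) + len(y))
    pvals = [binom_two_sided_p(cx[b], cx[b] + cy[b], p0) for b in range(n_bins)]
    return pvals, benjamini_hochberg(pvals, alpha)
```

Calling `differential_bins(x, y, 0.0, 1.0, 10)` returns the per-bin p-values and the indices of the bins flagged as differential. A flat procedure like this ignores the spatial relationship between neighboring bins, which is the kind of ancillary information the hierarchical methods in the dissertation are designed to use.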

Citation

Li, Xuechan (2022). Multiple Testing for Data with Ancillary Information. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/25262.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.