High Dimensional Variable Selection with Error Control.

Date

2016

Abstract

Background. Iterative sure independence screening (ISIS) is a popular method for selecting important variables while retaining most of the informative variables relevant to the outcome in high-throughput data. However, it is computationally intensive and may also produce a high false discovery rate (FDR). We propose using the FDR as a screening step that reduces the high-dimensional data to a lower dimension while also controlling the FDR, in combination with three popular variable selection methods: LASSO, SCAD, and MCP. Method. The three methods with the proposed screenings were applied to prostate cancer data with the presence of metastasis as the outcome. Results. Simulations showed that the three variable selection methods with the proposed screenings controlled the predefined FDR and produced high area under the receiver operating characteristic curve (AUROC) scores. When these methods were applied to the prostate cancer example, LASSO and MCP selected 12 and 8 genes and produced AUROC scores of 0.746 and 0.764, respectively. Conclusions. We demonstrated that the variable selection methods with the sequential use of FDR and ISIS not only controlled the predefined FDR in the final models but also achieved relatively high AUROC scores.
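To make the two quantities in the abstract concrete, the following is a minimal sketch of an FDR-based screening step and of the AUROC score. It rests on assumptions not stated in the abstract: the screening is taken to be the Benjamini-Hochberg step-up rule applied to per-variable p-values, and the function names `bh_screen` and `auroc` are hypothetical. It is not the authors' implementation.

```python
from typing import List, Sequence


def bh_screen(p_values: Sequence[float], q: float = 0.05) -> List[int]:
    """Return indices of variables kept by the Benjamini-Hochberg step-up rule.

    Assumption: one marginal p-value per variable is already available;
    how those p-values are obtained is outside this sketch.
    """
    m = len(p_values)
    if m == 0:
        return []
    # Sort variable indices by ascending p-value.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest 1-based rank k with p_(k) <= k * q / m.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * q / m:
            k_max = rank
    # Keep every variable ranked at or below k_max (step-up behaviour).
    return sorted(order[:k_max])


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the fraction of (case, control) pairs ranked correctly.

    Ties between a case score and a control score count as one half.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Illustrative numbers only; they are not taken from the article.
    kept = bh_screen([0.001, 0.5, 0.012, 0.04, 0.9], q=0.05)
    print("variables kept by screening:", kept)
    print("AUROC:", auroc([0.9, 0.8, 0.3, 0.2], [1, 0, 1, 0]))
```

In a fuller pipeline of the kind the abstract describes, the indices returned by the screening step would define the reduced set of variables passed to a penalized regression such as LASSO, SCAD, or MCP, and the fitted model's predicted risks would be scored with the AUROC.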

Published Version (Please cite this version)

10.1155/2016/8209453

Publication Info

Kim, Sangjin, and Susan Halabi (2016). High Dimensional Variable Selection with Error Control. Biomed Res Int, 2016. p. 8209453. 10.1155/2016/8209453 Retrieved from https://hdl.handle.net/10161/15383.

This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.

Scholars@Duke

Susan Halabi

James B. Duke Distinguished Professor

Design and analysis of clinical trials, statistical analysis of biomarker and high dimensional data, development and validation of prognostic and predictive models.


Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.