Variable and threshold selection to control predictive accuracy in logistic regression
Date
2014-01-01
Abstract
Summary: Using data collected from the 'Sequenced treatment alternatives to relieve depression' study, we apply logistic regression to predict whether a patient will respond to treatment on the basis of early symptom change and patient characteristics. Model selection criteria such as the Akaike information criterion (AIC) and the mean-squared error of prediction (MSEP) may not be appropriate if the aim is to predict with a high degree of certainty who will or will not respond to treatment. Towards this aim, we generalize the definition of the positive and negative predictive value curves to the case of multiple predictors. We point out that it is the ordering, rather than the precise values, of the response probabilities that is important, and we arrive at a unified approach to model selection via two-sample rank tests. To avoid overfitting, we define a cross-validated version of the positive and negative predictive value curves and, after smoothing, compare these curves across models. When applied to the study data, our approach yields a ranking of models that differs from those based on AIC and MSEP, and from those given by a tree-based method and by regularized logistic regression with a lasso penalty. Our selected model performs consistently well for both 4-week-ahead and 7-week-ahead predictions. © 2014 Royal Statistical Society.
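The remark that only the ordering of the response probabilities matters can be illustrated with a minimal sketch. This is not the authors' code: the function name `predictive_value_curves`, the toy scores and outcomes, and the rule of classifying the lowest-ranked fraction `v` of patients as predicted non-responders are all assumptions made for illustration. The sketch computes empirical positive and negative predictive values from ranks alone, then checks that a monotone transformation of the scores (here the logit) leaves them unchanged. A cross-validated version would, presumably, feed in scores predicted for held-out patients; that step and the smoothing are omitted here.

```python
import numpy as np

def predictive_value_curves(scores, y, fractions):
    """Empirical PPV and NPV when the lowest-ranked fraction v of cases
    is classified as non-responders (hypothetical helper, illustration only)."""
    scores = np.asarray(scores, dtype=float)
    y = np.asarray(y, dtype=int)
    n = len(y)
    # Only the ranks of the scores are used from here on.
    y_sorted = y[np.argsort(scores, kind="stable")]
    ppv, npv = [], []
    for v in fractions:
        # Keep at least one case on each side of the cut.
        k = min(max(int(v * n), 1), n - 1)
        ppv.append(y_sorted[k:].mean())        # responders among predicted responders
        npv.append(1.0 - y_sorted[:k].mean())  # non-responders among predicted non-responders
    return np.array(ppv), np.array(npv)

# Toy data (made up): fitted response probabilities and observed responses.
scores = np.array([0.10, 0.40, 0.35, 0.80, 0.70, 0.20])
y = np.array([0, 0, 1, 1, 1, 0])
fractions = [0.5]

ppv, npv = predictive_value_curves(scores, y, fractions)

# A monotone transformation preserves the ordering, hence the curves.
logit = np.log(scores / (1 - scores))
ppv_logit, npv_logit = predictive_value_curves(logit, y, fractions)

print(ppv, npv)
print(np.array_equal(ppv, ppv_logit), np.array_equal(npv, npv_logit))
```

Because the function touches the scores only through `argsort`, any two models that rank patients identically give identical curves, which is consistent with comparing models through rank-based criteria.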
Publication Info
Kuk, AYC, J Li and A John Rush (2014). Variable and threshold selection to control predictive accuracy in logistic regression. Journal of the Royal Statistical Society. Series C: Applied Statistics, 63(4), pp. 657–672. doi:10.1111/rssc.12058. Retrieved from https://hdl.handle.net/10161/24819.
This citation is constructed from limited available data and may be imprecise. To cite this article, please review and use the official citation provided by the journal.
Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.