Comparison of regression imputation methods of baseline covariates that predict survival outcomes.

dc.contributor.author

Solomon, Nicole

dc.contributor.author

Lokhnygina, Yuliya

dc.contributor.author

Halabi, Susan

dc.date.accessioned

2024-06-06T14:13:19Z

dc.date.available

2024-06-06T14:13:19Z

dc.date.issued

2020-09

dc.description.abstract

Introduction

Missing data are inevitable in medical research, and appropriate handling of missing data is critical for statistical estimation and inference. Imputation is often employed to maximize the amount of data available for statistical analysis and is preferred over complete case analysis, which typically yields biased results. This article examines several types of regression imputation of missing covariates in the prediction of time-to-event outcomes subject to right censoring.

Methods

We evaluated the performance of five regression methods for imputing missing covariates in the proportional hazards model using summary statistics, including proportional bias and proportional mean squared error. The five methods were the parametric generalized linear model (GLM) and least absolute shrinkage and selection operator (LASSO), and the nonparametric multivariate adaptive regression splines (MARS), support vector machine (SVM), and random forest (RF). The primary objective was to determine which of these provides the "best" imputation model for baseline missing covariates in predicting a survival outcome.
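As a rough illustration of the idea behind regression imputation and the summary statistics named above, the following minimal sketch fills missing values of one covariate with predictions from an ordinary least-squares fit on the complete cases. It is not the authors' code: the function names are hypothetical, the linear model stands in for the five methods compared, and the definitions of proportional bias and proportional mean squared error (error scaled by the true parameter value) are assumptions, since the abstract does not spell them out.

```python
import numpy as np

def regression_impute(predictors, target):
    """Fill NaNs in `target` with least-squares predictions (hypothetical helper)."""
    predictors = np.asarray(predictors, dtype=float)
    target = np.asarray(target, dtype=float)
    observed = ~np.isnan(target)
    # Design matrix with an intercept column followed by the predictors.
    design = np.column_stack([np.ones(len(target)), predictors])
    # Fit only on the rows where the target covariate is observed.
    coef, *_ = np.linalg.lstsq(design[observed], target[observed], rcond=None)
    imputed = target.copy()
    imputed[~observed] = design[~observed] @ coef
    return imputed

def proportional_bias(estimates, truth):
    # Assumed definition: mean estimation error relative to the true value.
    return (np.mean(estimates) - truth) / truth

def proportional_mse(estimates, truth):
    # Assumed definition: mean squared error relative to the squared true value.
    return np.mean((np.asarray(estimates, dtype=float) - truth) ** 2) / truth ** 2

# Toy data: the third value of the covariate is missing.
filled = regression_impute([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, np.nan, 8.0])
```

In a simulation study of this kind, the completed covariate would then enter the proportional hazards model, and the two statistics would be computed over repeated estimates of each model parameter.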

Results

On average, LASSO yielded the smallest bias, mean squared error, mean squared prediction error, and median absolute deviation (MAD) of the final analysis model's parameters among the five methods considered. SVM performed second best, while GLM and MARS exhibited the lowest relative performance.

Conclusion

LASSO and SVM outperform GLM, MARS, and RF in the context of regression imputation for prediction of a time-to-event outcome.

dc.identifier

S2059866120005336

dc.identifier.issn

2059-8661

dc.identifier.uri

https://hdl.handle.net/10161/31110

dc.language

eng

dc.publisher

Cambridge University Press (CUP)

dc.relation.ispartof

Journal of clinical and translational science

dc.relation.isversionof

10.1017/cts.2020.533

dc.rights.uri

https://creativecommons.org/licenses/by-nc/4.0

dc.subject

Missing data

dc.subject

proportional hazards model

dc.subject

regression imputation

dc.title

Comparison of regression imputation methods of baseline covariates that predict survival outcomes.

dc.type

Journal article

duke.contributor.orcid

Solomon, Nicole|0000-0002-5643-9958

duke.contributor.orcid

Halabi, Susan|0000-0003-4135-2777

pubs.begin-page

e40

pubs.issue

1

pubs.organisational-group

Duke

pubs.organisational-group

School of Medicine

pubs.organisational-group

Staff

pubs.organisational-group

Basic Science Departments

pubs.organisational-group

Institutes and Centers

pubs.organisational-group

Biostatistics & Bioinformatics

pubs.organisational-group

Duke Cancer Institute

pubs.organisational-group

Duke Clinical Research Institute

pubs.organisational-group

Biostatistics & Bioinformatics, Division of Biostatistics

pubs.publication-status

Published

pubs.volume

5

Files

Original bundle

Name: Comparison of regression imputation methods of baseline covariates that predict survival outcomes.pdf
Size: 460.72 KB
Format: Adobe Portable Document Format