A Comparison Of Multiple Imputation Methods For Categorical Data

Date

2015

Repository Usage Stats

377 views, 2262 downloads

Abstract

This thesis evaluates the performance of several multiple imputation methods for categorical data, including multiple imputation by chained equations (MICE) using generalized linear models, MICE using classification and regression trees (CART), and nonparametric Bayesian multiple imputation using the Dirichlet process mixture of products of multinomial distributions (DPMPM) model. Each method is evaluated in repeated sampling studies using housing unit data from the 2012 American Community Survey. These data make it possible to explore practical problems such as multicollinearity and high dimensionality. The thesis highlights the advantages and limitations of each method relative to the others. Finally, it suggests which method should be preferred and states the conditions under which those suggestions hold.
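To make the chained-equations idea in the abstract concrete, the following is a minimal sketch and not the thesis's implementation. It assumes that each conditional model is a simple frequency table on a single neighboring column, standing in for the generalized linear models or CART models the thesis actually compares. The function names `impute_chained` and `multiply_impute`, the toy housing records, and the choice of predictor column are all hypothetical. The sketch also assumes at least two columns, each with at least one observed value, and it does not draw model parameters from a posterior as a fully proper multiple imputation procedure would.

```python
import random
from collections import Counter, defaultdict


def impute_chained(rows, n_iter=5, seed=0):
    """Return one completed copy of `rows`, where None marks a missing cell.

    Illustrative stand-in only: each column is imputed from a conditional
    frequency table on ONE other column, not from a GLM or CART model.
    """
    rng = random.Random(seed)
    n_cols = len(rows[0])
    missing = [[value is None for value in row] for row in rows]
    data = [list(row) for row in rows]  # work on a copy; the input is untouched

    # Step 1: fill every missing cell with a random draw from the values
    # observed in the same column (a draw from the observed marginal).
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        for i, row in enumerate(data):
            if missing[i][j]:
                row[j] = rng.choice(observed)

    # Step 2: cycle through the columns, redrawing each originally missing
    # cell from the distribution of column j given the current value of
    # the predictor column k.
    for _ in range(n_iter):
        for j in range(n_cols):
            k = (j + 1) % n_cols  # hypothetical choice of a single predictor
            table = defaultdict(Counter)
            marginal = Counter()
            for i, row in enumerate(data):
                if not missing[i][j]:
                    table[row[k]][row[j]] += 1
                    marginal[row[j]] += 1
            for i, row in enumerate(data):
                if missing[i][j]:
                    # Fall back to the marginal if this predictor value was
                    # never seen alongside an observed value of column j.
                    counts = table.get(row[k]) or marginal
                    values = list(counts)
                    weights = [counts[v] for v in values]
                    row[j] = rng.choices(values, weights=weights)[0]
    return data


def multiply_impute(rows, m=5, n_iter=5):
    """Create m completed datasets by rerunning the chain with different seeds."""
    return [impute_chained(rows, n_iter=n_iter, seed=s) for s in range(m)]


# Toy records (tenure, structure type); values are made up for illustration.
toy_rows = [
    ["own", "house"],
    ["rent", "apt"],
    [None, "house"],
    ["rent", None],
    ["own", "house"],
]
completed_sets = multiply_impute(toy_rows, m=3)
```

Each element of `completed_sets` is a full dataset in which observed cells are unchanged and missing cells are filled with imputed categories. In a repeated sampling study, an analysis would be run on every completed dataset and the results combined across the `m` copies.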

Citation

Akande, Olanrewaju Michael (2015). A Comparison Of Multiple Imputation Methods For Categorical Data. Master's thesis, Duke University. Retrieved from https://hdl.handle.net/10161/10028.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.