Browsing by Author "Goldstein, Benjamin"
Item Open Access Deployment of the Epic Readmission Risk Score in DUH GenMed Patients (2019-03-01) Daley, Caitlin; Gallagher, David; Melton, Jessica; Long, Andrea; Goldstein, Benjamin; Brucker, Amanda; Kramer, Patricia; McCarthy, Colleen; Yanik, Jacquelyn
Item Open Access Improving Clinical Prediction Models with Statistical Representation Learning (2021) Xiu, Zidi
This dissertation studies novel statistical machine learning approaches for healthcare risk prediction applications in the presence of challenging scenarios, such as rare events, noisy observations, data imbalance, missingness and censoring. Such scenarios arise frequently in practice, and they compromise the validity of standard predictive models, which often expect clean and complete data. Alleviating the negative impacts of real-world data challenges is therefore of great significance and constitutes the overarching goal of this dissertation, which investigates novel strategies to (i) account for data uncertainties and statistical characteristics of low-prevalence events; (ii) re-balance and augment the representations for minority data under proper causal assumptions; and (iii) dynamically assign scores and attend to the observed units to derive robust features.
By integrating ideas from representation learning, variational Bayes, causal inference, and contrastive training, this dissertation builds tools for risk modeling frameworks that are robust to various peculiarities of real-world datasets and yield reliable individualized risk evaluations.
This dissertation starts with a systematic review of classical risk prediction models in Chapter 1, and discusses the new opportunities and challenges presented by the big data era. With the increasing availability of healthcare data and the current rapid development of machine learning models, clinical decision support systems have seen new opportunities to improve clinical practice. However, in healthcare risk prediction applications, statistical analysis is challenged not only by data incompleteness and skewed distributions but also by the complexity of the inputs. To motivate the subsequent developments, the chapter discusses the limitations of risk minimization methods, robustness against incomplete high-dimensional data, and the need for individualization.
As a concrete example addressing a canonical problem, Chapter 2 proposes a variational disentanglement approach that learns semi-parametrically from heavily imbalanced binary classification datasets. In this new method, named Variational Inference for Extremals (VIE), we apply extreme value theory to enable efficient learning with few observations. By organically integrating the generalized additive model and isotonic neural nets, VIE enjoys improved robustness, interpretability, and generalizability for the accurate prediction of rare events. An analysis of the COVID-19 cohort from Duke Hospitals demonstrates that the proposed approach outperforms competing solutions.
In Chapter 3 we investigate a more general setting, a multi-class classification problem with heavily imbalanced data, from the perspective of causal machine learning to promote sample efficiency and model generalization. Our solution, named Energy-based Causal Representation Transfer (ECRT), posits a meta-distributional scenario, where the data generating mechanism for label-conditional features is invariant across different labels. Such a causal assumption enables efficient knowledge transfer from the dominant classes to their under-represented counterparts, even if their feature distributions show apparent disparities. This allows us to leverage a causally informed data augmentation procedure based on nonlinear independent component analysis to enrich the representation of minority classes and simultaneously whiten the data. The effectiveness and enhanced prediction accuracy are demonstrated on synthetic data and real-world benchmarks in comparison with state-of-the-art models.
In Chapter 4 we deal with time-to-event prediction with censored (missing in outcomes) and incomplete (missing in covariates) observations. Also known as survival analysis, time-to-event prediction plays a crucial role in many clinical applications, yet classical survival solutions scale poorly with respect to data complexity and incompleteness. To better handle sophisticated modern health data and alleviate the impact of real-world data challenges, we introduce a self-attention based model to capture helpful information for time-to-event prediction, called Energy-based Latent Self-Attentive Survival Analysis (ELSSA). A key novelty of this approach is the integration of a contrastive mutual-information-based loss that non-parametrically maximizes the informativeness of learned data representations. The effectiveness of our approaches has been extensively validated with synthetic and real-world benchmarks, showing improved performance relative to competing solutions.
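To make the notion of censoring concrete, the sketch below implements the classical Kaplan-Meier estimator, a standard nonparametric baseline in survival analysis. It is not ELSSA and is not taken from the dissertation; the function name `kaplan_meier` and the toy data in the usage note are hypothetical. It illustrates only how censored subjects remain in the at-risk set until their last observed time.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Kaplan-Meier survival curve.

    times  -- observed follow-up time for each subject
    events -- 1 if the event was observed at that time, 0 if censored
    Returns a list of (t, S(t)) pairs, one per distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")

    # Number of observed events at each time point.
    deaths = Counter(t for t, e in zip(times, events) if e)
    # Number of subjects leaving the risk set (event or censoring) at each time.
    exits = Counter(times)

    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without changing S(t).
        at_risk -= exits[t]
    return curve
```

For example, `kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])` returns one point for each of the event times 1, 2 and 3; the subjects censored at times 2 and 4 reduce the risk set but do not produce a step in the curve.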
In summary, this dissertation presents flexible risk prediction frameworks that acknowledge representation uncertainties, data heterogeneity and incompleteness. Altogether, it makes three contributions: more efficient learning from imbalanced data, enhanced robustness to missing data, and better generalizability to out-of-sample subjects.
Item Open Access Provision and Utilization of Team- and Community-Based Operative Care for Patients With Cleft Lip/Palate in North Carolina. (The Cleft palate-craniofacial journal : official publication of the American Cleft Palate-Craniofacial Association, 2020-11) Le, Elliot; Shrader, Peter; Bosworth, Hayden; Hurst, Jillian; Goldstein, Benjamin; Drake, Amelia; Wood, Jeyhan; David, Lisa R; Runyan, Christopher M; Vissoci, Joao Ricardo Nickenig; Harker, Matthew; Allori, Alexander C
Objective
To characterize operative care for cleft lip and/or palate (CL/P) based on location (ie, from American Cleft Palate Craniofacial Association [ACPA]-approved multidisciplinary teams or from community providers).
Design
Cross-sectional analysis of the Healthcare Cost and Utilization Project State Inpatient Database and State Ambulatory Surgery & Services Database for North Carolina from 2012 to 2015.
Setting/patients and main outcome measures
Clinical encounters for children with CL/P undergoing operative procedures were identified, classified by location as "Team" versus "Community," and characterized by demographic, geographic, clinical, and procedural factors. A secondary evaluation reviewed concordance of team and community practices with an ACPA guideline related to coordination of care.
Results
Three teams and 39 community providers performed a total of 3010 cleft-related procedures across 2070 encounters. Teams performed 69.7% of total volume and performed the majority of cleft procedures, including cleft lip repair, palate repair, alveolar bone grafting, and correction of velopharyngeal insufficiency. Community locations principally offered myringotomy and rhinoplasty. Team care was associated with higher guideline concordance.
Conclusions
American Cleft Palate Craniofacial Association-approved team-based care accounts for the majority of cleft-related care in North Carolina; however, a substantial volume of cleft-related procedures was provided by community providers, with 3 providers accounting for the vast majority of community cases.
Item Open Access Towards a Characterization of the Complete Rotationally Symmetric Minimal Surfaces with Plateau-Like Singularities (2024-04) Goldstein, Benjamin
The problem of finding and characterizing the surfaces in R^3 which locally minimize area is known as Plateau's problem. Although the catenoid and the plane were proven in the 1700s to minimize area, there has been little further study of rotationally symmetric minimal surfaces. In this study, we investigate the complete rotationally symmetric solutions to Plateau's problem, revealing surprising depth due to singularities that may appear in a broad class of minimal surfaces. Our analysis is structured around the topology of the surface's generating graph, and we first consider surfaces of a simple topological type. For these surfaces, we prove new statements about complexity and shape, relating the number of singularities to the Hausdorff distance from a canonical example. We then consider more complicated structures, producing a novel surface with a handle (in particular, whose generating graph contains a 4-cycle). We finally provide direction for future study.
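
As background for the classical result mentioned above, the following is the standard textbook derivation of the catenoid as the smooth rotationally symmetric minimal surface. It is not taken from the thesis, and the notation (profile curve r(z), constants c and z_0) is an assumption made here for illustration.

```latex
% Area of the surface obtained by rotating the profile curve r(z) > 0
% about the z-axis:
\[
  A[r] = 2\pi \int_{z_1}^{z_2} r \sqrt{1 + r'^2}\, dz .
\]
% The integrand L(r, r') = r\sqrt{1 + r'^2} does not depend on z explicitly,
% so the Beltrami identity L - r' \,\partial L / \partial r' = c applies:
\[
  r\sqrt{1 + r'^2} - \frac{r\, r'^2}{\sqrt{1 + r'^2}}
  = \frac{r}{\sqrt{1 + r'^2}} = c .
\]
% For c > 0 this first-order equation is solved by the catenary
\[
  r(z) = c \cosh\!\left( \frac{z - z_0}{c} \right),
\]
% since r' = \sinh((z - z_0)/c) and \sqrt{1 + r'^2} = \cosh((z - z_0)/c).
```

Rotating this catenary about the z-axis gives the catenoid. The plane is the degenerate case in which the profile curve is perpendicular to the axis and so cannot be written as a graph r(z).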