Browsing by Author "Goldstein, Benjamin A"
Now showing 1 - 13 of 13
Item Open Access Clinician Burnout Associated With Sex, Clinician Type, Work Culture, and Use of Electronic Health Records.(JAMA network open, 2021-04) McPeek-Hinz, Eugenia; Boazak, Mina; Sexton, J Bryan; Adair, Kathryn C; West, Vivian; Goldstein, Benjamin A; Alphin, Robert S; Idris, Sherif; Hammond, W Ed; Hwang, Shelley E; Bae, Jonathan
Importance
Electronic health records (EHRs) are considered a potentially significant contributor to clinician burnout.
Objective
To describe the association of EHR usage, sex, and work culture with burnout for 3 types of clinicians at an academic medical institution.
Design, setting, and participants
This cross-sectional study of 1310 clinicians at a large tertiary care academic medical center analyzed EHR usage metrics for the month of April 2019 with results from a well-being survey from May 2019. Participants included attending physicians, advanced practice providers (APPs), and house staff from various specialties. Data were analyzed between March 2020 and February 2021.
Exposures
Clinician demographic characteristics, EHR metadata, and an institution-wide survey.
Main outcomes and measures
Study metrics included clinician demographic data, burnout score, well-being measures, and EHR usage metadata.
Results
Of the 1310 clinicians analyzed, 542 (41.4%) were men (mean [SD] age, 47.3 [11.6] years; 448 [82.7%] White clinicians, 52 [9.6%] Asian clinicians, and 21 [3.9%] Black clinicians) and 768 (58.6%) were women (mean [SD] age, 42.6 [10.3] years; 573 [74.6%] White clinicians, 105 [13.7%] Asian clinicians, and 50 [6.5%] Black clinicians). Women reported more burnout (survey score ≥50: women, 423 [52.0%] vs men, 258 [47.6%]; P = .008) overall. No significant differences in EHR usage were found by sex for multiple metrics of time in the EHR, metrics of volume of clinical encounters, or differences in products of clinical care. Multivariate analysis of burnout revealed that work culture domains were significantly associated with self-reported results for commitment (odds ratio [OR], 0.542; 95% CI, 0.427-0.688; P < .001) and work-life balance (OR, 0.643; 95% CI, 0.559-0.739; P < .001). Clinician sex significantly contributed to burnout, with women having a greater likelihood of burnout compared with men (OR, 1.33; 95% CI, 1.01-1.75; P = .04). An increased number of days spent using the EHR system was associated with less likelihood of burnout (OR, 0.966; 95% CI, 0.937-0.996; P = .03). Overall, EHR metrics accounted for 1.3% of model variance (P = .001) compared with work culture accounting for 17.6% of variance (P < .001).
Conclusions and relevance
In this cross-sectional study, sex-based differences in EHR usage and burnout were found in clinicians. These results also suggest that local work culture factors may contribute more to burnout than metrics of EHR usage.
Item Open Access Combining adult with pediatric patient data to develop a clinical decision support tool intended for children: leveraging machine learning to model heterogeneity.(BMC medical informatics and decision making, 2022-03) Sabharwal, Paul; Hurst, Jillian H; Tejwani, Rohit; Hobbs, Kevin T; Routh, Jonathan C; Goldstein, Benjamin A
Background
Clinical decision support (CDS) tools built using adult data do not typically perform well for children. We explored how best to leverage adult data to improve the performance of such tools. This study assesses whether it is better to build CDS tools for children using data from children alone or to use combined data from both adults and children.
Methods
Retrospective cohort using data from 2017 to 2020. Participants included all individuals (adults and children) receiving an elective surgery at a large academic medical center that provides adult and pediatric services. We predicted need for mechanical ventilation or admission to the intensive care unit (ICU). Predictor variables included demographic, clinical, and service utilization factors known prior to surgery. We compared predictive models built using machine learning to regression-based methods that used a pediatric or combined adult-pediatric cohort. We compared model performance based on the area under the receiver operating characteristic curve (AUROC).
Results
While we found that adults and children have different risk factors, machine learning methods are able to appropriately model the underlying heterogeneity of each population and produce equally accurate predictive models whether using data only from pediatric patients or combined data from both children and adults. Results from regression-based methods were improved by the use of pediatric-specific data.
Conclusions
CDS tools for children can successfully use combined data from adults and children if the model accounts for underlying heterogeneity, as in machine learning models.
Item Open Access Correction to: Combining adult with pediatric patient data to develop a clinical decision support tool intended for children: leveraging machine learning to model heterogeneity.(BMC medical informatics and decision making, 2022-05) Sabharwal, Paul; Hurst, Jillian H; Tejwani, Rohit; Hobbs, Kevin T; Routh, Jonathan C; Goldstein, Benjamin A
Following publication of the original article [1], it was reported that part of the ‘Outcome Variable Definition’ and the entirety of the ‘Descriptive statistics’ subsection were missing. These two subsections are given below with the previously missing text highlighted in bold. The original article [1] has been updated.
Outcome Variable Definition
In the initial development of the CDS tool, we were tasked with predicting four outcomes related to hospital resource utilization: overall length of stay, admission to the intensive care unit (ICU), requirement for mechanical ventilation, and discharge to a skilled nursing facility. Because children are rarely discharged to a skilled nursing facility and evaluating continuous outcomes poses unique challenges, we focused on the two binary outcomes: admission to the ICU and requirement for mechanical ventilation.
Statistical Analysis
Descriptive statistics
We compared the pediatric and adult patient populations. We report standardized mean differences (SMDs) where an SMD > 0.10 indicates that the two groups are out of balance.
Item Open Access Designing risk prediction models for ambulatory no-shows across different specialties and clinics.(Journal of the American Medical Informatics Association : JAMIA, 2018-08) Ding, Xiruo; Gellad, Ziad F; Mather, Chad; Barth, Pamela; Poon, Eric G; Newman, Mark; Goldstein, Benjamin A
Objective: As available data increases, so does the opportunity to develop risk scores on more refined patient populations. In this paper we assessed the ability to derive a risk score for a patient no-showing to a clinic visit. Methods: Using data from 2,264,235 outpatient appointments we assessed the performance of models built across 14 different specialties and 55 clinics. We used regularized logistic regression models to fit and assess models built on the health system, specialty, and clinic levels. We evaluated fits based on their discrimination and calibration. Results: Overall, the results suggest that a relatively robust risk score for patient no-shows could be derived with an average C-statistic of 0.83 across clinic level models and strong calibration. Moreover, the clinic specific models, even with lower training set sizes, often performed better than the more general models. Examination of the individual models showed that risk factors had different degrees of predictability across the different specialties. Implementation of optimal modeling strategies would lead to capturing an additional 4819 no-shows per year.
Conclusion: Overall, this work highlights both the opportunity for and the importance of leveraging the available electronic health record data to develop more refined risk models.
Item Open Access Development, Implementation, and Evaluation of an In-Hospital Optimized Early Warning Score for Patient Deterioration.(MDM policy & practice, 2020-01-10) O'Brien, Cara; Goldstein, Benjamin A; Shen, Yueqi; Phelan, Matthew; Lambert, Curtis; Bedoya, Armando D; Steorts, Rebecca C
Background. Identification of patients at risk of deteriorating during their hospitalization is an important concern. However, many off-the-shelf scores have poor in-center performance. In this article, we report our experience developing, implementing, and evaluating an in-hospital score for deterioration. Methods. We abstracted 3 years of data (2014-2016) and identified patients on medical wards that died or were transferred to the intensive care unit. We developed a time-varying risk model and then implemented the model over a 10-week period to assess prospective predictive performance. We compared performance to our currently used tool, National Early Warning Score. In order to aid clinical decision making, we transformed the quantitative score into a three-level clinical decision support tool. Results. The developed risk score had an average area under the curve of 0.814 (95% confidence interval = 0.79-0.83) versus 0.740 (95% confidence interval = 0.72-0.76) for the National Early Warning Score. We found the proposed score was able to respond to acute clinical changes in patients' clinical status. Upon implementing the score, we were able to achieve the desired positive predictive value but needed to retune the thresholds to get the desired sensitivity. Discussion. This work illustrates the potential for academic medical centers to build, refine, and implement risk models that are targeted to their patient population and work flow.
Item Open Access Evaluation of ML-Based Clinical Decision Support Tool to Replace an Existing Tool in an Academic Health System: Lessons Learned.(Journal of personalized medicine, 2020-08-27) Woo, Myung; Alhanti, Brooke; Lusk, Sam; Dunston, Felicia; Blackwelder, Stephen; Lytle, Kay S; Goldstein, Benjamin A; Bedoya, Armando
There is increasing application of machine learning tools to problems in healthcare, with an ultimate goal to improve patient safety and health outcomes. When applied appropriately, machine learning tools can augment clinical care provided to patients. However, even if a model has impressive performance characteristics, prospectively evaluating and effectively implementing models into clinical care remains difficult. The primary objective of this paper is to recount our experiences and challenges in comparing a novel machine learning-based clinical decision support tool to legacy, non-machine learning tools addressing potential safety events in the hospitals and to summarize the obstacles which prevented evaluation of clinical efficacy of tools prior to widespread institutional use. We collected and compared safety events data, specifically patient falls and pressure injuries, between the standard of care approach and machine learning (ML)-based clinical decision support (CDS). Our assessment was limited to performance of the model rather than the workflow due to challenges in directly comparing both approaches. We did note a modest improvement in falls with ML-based CDS; however, it was not possible to determine that overall improvement was due to model characteristics.
Item Open Access Feasibility of Post-hospitalization Telemedicine Video Visits for Children With Medical Complexity.(Journal of pediatric health care : official publication of National Association of Pediatric Nurse Associates & Practitioners, 2022-03) Ming, David Y; Li, Tingxuan; Ross, Melissa H; Frush, Jennifer; He, Jingyi; Goldstein, Benjamin A; Jarrett, Valerie; Krohl, Natalie; Docherty, Sharron L; Turley, Christine B; Bosworth, Hayden B
Objectives
To evaluate feasibility and acceptability of post-hospitalization telemedicine video visits (TMVV) during hospital-to-home transitions for children with medical complexity (CMC); and explore associations with hospital utilization, caregiver self-efficacy (CSE), and family self-management (FSM).
Method
This non-randomized pilot study assigned CMC (n=28) to weekly TMVV for four weeks post-hospitalization; control CMC (n=20) received usual care without telemedicine. Feasibility was measured by time to connection and proportion of TMVV completed; acceptability was measured by parent-reported surveys. Pre/post-discharge changes in CSE, FSM, and hospital utilization were assessed.
Results
64 TMVV were completed; 82% of patients completed 1 TMVV; 54% completed four TMVV. Median time to TMVV connection was 1 minute (IQR=2.5). Parents reported high acceptability of TMVV (mean 6.42; 1-7 scale). CSE and FSM pre/post-discharge were similar for both groups; utilization declined in both groups post-discharge.
Discussion
Post-hospitalization TMVV for CMC were feasible and acceptable during hospital-to-home transitions.
Item Open Access Gene by Environment Investigation of Incident Lung Cancer Risk in African-Americans.(EBioMedicine, 2016-02) David, Sean P; Wang, Ange; Kapphahn, Kristopher; Hedlin, Haley; Desai, Manisha; Henderson, Michael; Yang, Lingyao; Walsh, Kyle M; Schwartz, Ann G; Wiencke, John K; Spitz, Margaret R; Wenzlaff, Angela S; Wrensch, Margaret R; Eaton, Charles B; Furberg, Helena; Mark Brown, W; Goldstein, Benjamin A; Assimes, Themistocles; Tang, Hua; Kooperberg, Charles L; Quesenberry, Charles P; Tindle, Hilary; Patel, Manali I; Amos, Christopher I; Bergen, Andrew W; Swan, Gary E; Stefanick, Marcia L
BACKGROUND: Genome-wide association studies have identified polymorphisms linked to both smoking exposure and risk of lung cancer. The degree to which lung cancer risk is driven by increased smoking, genetics, or gene-environment interactions is not well understood. METHODS: We analyzed associations between 28 single nucleotide polymorphisms (SNPs) previously associated with smoking quantity and lung cancer in 7156 African-American females in the Women's Health Initiative (WHI), then analyzed main effects of top nominally significant SNPs and interactions between SNPs, cigarettes per day (CPD) and pack-years for lung cancer in an independent, multi-center case-control study of African-American females and males (1078 lung cancer cases and 822 controls). FINDINGS: Nine nominally significant SNPs for CPD in WHI were associated with incident lung cancer (corrected p-values from 0.027 to 6.09 × 10⁻⁵). CPD was found to be a nominally significant effect modifier between SNP and lung cancer for six SNPs, including CHRNA5 rs2036527[A] (betaSNP*CPD = -0.017, p = 0.0061, corrected p = 0.054), which was associated with CPD in a previous genome-wide meta-analysis of African-Americans. INTERPRETATION: These results suggest that chromosome 15q25.1 variants are robustly associated with CPD and lung cancer in African-Americans and that the allelic dose effect of these polymorphisms on lung cancer risk is most pronounced in lighter smokers.
Item Open Access Incorporating informatively collected laboratory data from EHR in clinical prediction models.(BMC medical informatics and decision making, 2024-07) Sun, Minghui; Engelhard, Matthew M; Bedoya, Armando D; Goldstein, Benjamin A
Background
Electronic Health Records (EHR) are widely used to develop clinical prediction models (CPMs). However, one of the challenges is that there is often a degree of informative missing data. For example, laboratory measures are typically taken when a clinician is concerned that there is a need. When data are so-called Not Missing at Random (NMAR), analytic strategies based on other missingness mechanisms are inappropriate. In this work, we seek to compare the impact of different strategies for handling missing data on CPM performance.
Methods
We considered a predictive model for rapid inpatient deterioration as an exemplar implementation. This model incorporated twelve laboratory measures with varying levels of missingness. Five labs had missingness rates around 50%, and the other seven had missingness rates around 90%. We included them based on the belief that their missingness status can be highly informative for the prediction. In our study, we explicitly compared the various missing data strategies: mean imputation, normal-value imputation, conditional imputation, categorical encoding, and missingness embeddings. Some of these were also combined with the last observation carried forward (LOCF). We implemented logistic LASSO regression, multilayer perceptron (MLP), and long short-term memory (LSTM) models as the downstream classifiers. We compared the AUROC on the test data and used bootstrapping to construct 95% confidence intervals.
Results
We had 105,198 inpatient encounters, with 4.7% having experienced the deterioration outcome of interest. LSTM models generally outperformed the cross-sectional models, with embedding approaches and categorical encoding yielding the best results. For the cross-sectional models, normal-value imputation with LOCF generated the best results.
Conclusion
Strategies that accounted for the possibility of NMAR missing data yielded better model performance than those that did not. The embedding method had an advantage as it did not require prior clinical knowledge. Using LOCF could enhance the performance of cross-sectional models but have countereffects in LSTM models.
Item Open Access Investigating sources of inaccuracy in wearable optical heart rate sensors.(NPJ digital medicine, 2020-01) Bent, Brinnae; Goldstein, Benjamin A; Kibbe, Warren A; Dunn, Jessilyn P
As wearable technologies are being increasingly used for clinical research and healthcare, it is critical to understand their accuracy and determine how measurement errors may affect research conclusions and impact healthcare decision-making. Accuracy of wearable technologies has been a hotly debated topic in both the research and popular science literature. Currently, wearable technology companies are responsible for assessing and reporting the accuracy of their products, but little information about the evaluation method is made publicly available. Heart rate measurements from wearables are derived from photoplethysmography (PPG), an optical method for measuring changes in blood volume under the skin. Potential inaccuracies in PPG stem from three major areas, including (1) diverse skin types, (2) motion artifacts, and (3) signal crossover. To date, no study has systematically explored the accuracy of wearables across the full range of skin tones. Here, we explored heart rate and PPG data from consumer- and research-grade wearables under multiple circumstances to test whether and to what extent these inaccuracies exist. We saw no statistically significant difference in accuracy across skin tones, but we saw significant differences between devices, and between activity types, notably, that absolute error during activity was, on average, 30% higher than during rest.
Our conclusions indicate that different wearables are all reasonably accurate at resting and prolonged elevated heart rate, but that differences exist between devices in responding to changes in activity. This has implications for researchers, clinicians, and consumers in drawing study conclusions, combining study results, and making health-related decisions using these devices.
Item Open Access Performance of the National Early Warning Score in Hospitalized Patients With Kidney Failure on Maintenance Hemodialysis.(Kidney medicine, 2022-08) Cavalier, Joanna; Zhao, Congwen; Scialla, Julia; Bedoya, Armando; Goldstein, Benjamin A
Item Open Access Substance use and mental diagnoses among adults with and without type 2 diabetes: Results from electronic health records data.(Drug and alcohol dependence, 2015-11) Wu, Li-Tzy; Ghitza, Udi E; Batch, Bryan C; Pencina, Michael J; Rojas, Leoncio Flavio; Goldstein, Benjamin A; Schibler, Tony; Dunham, Ashley A; Rusincovitch, Shelley; Brady, Kathleen T
BACKGROUND: Comorbid diabetes and substance use diagnoses (SUD) represent a hazardous combination, both in terms of healthcare cost and morbidity. To date, there is limited information about the association of SUD and related mental disorders with type 2 diabetes mellitus (T2DM). METHODS: We examined the associations between T2DM and multiple psychiatric diagnosis categories, with a focus on SUD and related psychiatric comorbidities among adults with T2DM. We analyzed electronic health record (EHR) data on 170,853 unique adults aged ≥18 years from the EHR warehouse of a large academic healthcare system. Logistic regression analyses were conducted to estimate the strength of an association for comorbidities. RESULTS: Overall, 9% of adults (n=16,243) had T2DM. Blacks, Hispanics, Asians, and Native Americans had greater odds of having T2DM than whites. All 10 psychiatric diagnosis categories were more prevalent among adults with T2DM than among those without T2DM.
Prevalent diagnoses among adults with T2DM were mood (21.22%), SUD (17.02%: tobacco 13.25%, alcohol 4.00%, drugs 4.22%), and anxiety diagnoses (13.98%). Among adults with T2DM, SUD was positively associated with mood, anxiety, personality, somatic, and schizophrenia diagnoses. CONCLUSIONS: We examined a large diverse sample of individuals and found clinical evidence of SUD and psychiatric comorbidities among adults with T2DM. These results highlight the need to identify feasible collaborative care models for adults with T2DM and SUD-related psychiatric comorbidities, particularly in primary care settings, that will improve behavioral health and reduce health risk.
Item Open Access Vitals are Vital: Simpler Clinical Data Model Predicts Decompensation in COVID-19 Patients(ACI Open, 2022-01) Cavalier, Joanna Schneider; O'Brien, Cara L; Goldstein, Benjamin A; Zhao, Congwen; Bedoya, Armando
Abstract
Objective
Several risk scores have been developed and tested on coronavirus disease 2019 (COVID-19) patients to predict clinical decompensation. We aimed to compare an institutional, automated, custom-built early warning score (EWS) to the National Early Warning Score (NEWS) in COVID-19 patients.
Methods
A retrospective cohort analysis was performed on patients with COVID-19 infection who were admitted to an intermediate ward from March to December 2020. A machine learning–based customized EWS algorithm, which incorporates demographics, laboratory values, vital signs, and comorbidities, and the NEWS, which uses vital signs only, were calculated at 12-hour intervals. These patients were retrospectively assessed for decompensation in the subsequent 12 or 24 hours, defined as death or transfer to an intensive care unit.
Results
Of 709 patients, 112 (15.8%) had a decompensation event. Using the custom EWS, decompensation within 12 and 24 hours was predicted with areas under the receiver operating curve (AUC) of 0.81 and 0.79, respectively. The NEWS score applied to the same population yielded AUCs of 0.83 and 0.81, respectively. The 24-hour negative predictive values (NPV) of the NEWS and EWS in patients identified as low risk were 99.6 and 99.2%, respectively.
Conclusion
The NEWS score performs as well as a customized EWS in COVID-19 patients, demonstrating the significance of vital signs in predicting outcomes. The relatively high positive predictive value and NPV of both scores are indispensable for optimally allocating clinical resources. In this relatively young, healthy population, a more complex score incorporating electronic health record data beyond vital signs does not add clinical benefit.
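Note on predictive values: the NPV and positive predictive value quoted in the abstract above are ratios taken from a 2×2 table of score flags against observed decompensation. The following is a minimal Python sketch of that arithmetic, not the authors' code. The class `ConfusionCounts`, the two functions, and the example counts are all hypothetical and chosen only for illustration; none of the numbers come from the study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """2x2 table of risk-score flags versus observed outcomes (hypothetical)."""
    true_pos: int   # flagged high risk, decompensated
    false_pos: int  # flagged high risk, did not decompensate
    true_neg: int   # flagged low risk, did not decompensate
    false_neg: int  # flagged low risk, decompensated


def negative_predictive_value(c: ConfusionCounts) -> float:
    """Share of low-risk flags that were correct: TN / (TN + FN)."""
    flagged_low = c.true_neg + c.false_neg
    if flagged_low == 0:
        raise ValueError("no observations were flagged low risk")
    return c.true_neg / flagged_low


def positive_predictive_value(c: ConfusionCounts) -> float:
    """Share of high-risk flags that were correct: TP / (TP + FP)."""
    flagged_high = c.true_pos + c.false_pos
    if flagged_high == 0:
        raise ValueError("no observations were flagged high risk")
    return c.true_pos / flagged_high


# Illustrative counts only (assumed, not study data).
example = ConfusionCounts(true_pos=30, false_pos=70, true_neg=495, false_neg=5)
npv = negative_predictive_value(example)
ppv = positive_predictive_value(example)
print(f"NPV = {npv:.3f}, PPV = {ppv:.3f}")
```

Both values depend on how common the outcome is in the cohort, so an NPV measured in one population may not carry over to a population with a different event rate.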