Machine learning functional impairment classification with electronic health record data.



Poor functional status is a key marker of morbidity, yet is not routinely captured in clinical encounters. We developed and evaluated the accuracy of a machine learning algorithm that leveraged electronic health record (EHR) data to provide a scalable process for identification of functional impairment.


We identified a cohort of patients with an electronically captured screening measure of functional status (Older Americans Resources and Services ADL/IADL) between 2018 and 2020 (N = 6484). Patients were classified using unsupervised learning (K-means and t-distributed Stochastic Neighbor Embedding) into three states: normal function (NF), mild to moderate functional impairment (MFI), and severe functional impairment (SFI). Using 11 EHR clinical variable domains (832 input features), we trained an Extreme Gradient Boosting supervised machine learning algorithm to distinguish the functional status states and measured its prediction accuracy. Data were randomly split into training (80%) and test (20%) sets. SHapley Additive Explanations (SHAP) feature importance analysis was used to rank the EHR features by their contribution to the outcome.
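The labeling and splitting steps described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the scores are synthetic, the 0 to 28 scale with higher values meaning better function is assumed for illustration, and plain one-dimensional K-means stands in for the K-means plus t-SNE procedure. The gradient boosting and SHAP steps are omitted, and the helper names `draw` and `kmeans_1d` are hypothetical.

```python
import random

rng = random.Random(0)

def draw(mean, n):
    """Synthetic screening totals on an assumed 0-28 scale (higher = better)."""
    return [min(28.0, max(0.0, rng.gauss(mean, 1.0))) for _ in range(n)]

# Made-up cohort of 500 patients drawn from three well-separated groups.
scores = draw(26, 270) + draw(16, 150) + draw(6, 80)

def kmeans_1d(values, k=3, iters=100):
    """Lloyd's K-means on a list of floats; returns (centroids, labels)."""
    lo, hi = min(values), max(values)
    # Deterministic start: centroids evenly spaced across the observed range.
    centroids = [lo + (i + 0.5) * (hi - lo) / k for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iters):
        # Assignment step: each value joins its nearest centroid.
        labels = [min(range(k), key=lambda c: abs(v - centroids[c])) for v in values]
        # Update step: each centroid moves to the mean of its members.
        updated = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            updated.append(sum(members) / len(members) if members else centroids[c])
        if updated == centroids:
            break
        centroids = updated
    return centroids, labels

centroids, labels = kmeans_1d(scores)

# Name the clusters by centroid, highest score first (assumed ordering).
order = sorted(range(3), key=lambda c: centroids[c], reverse=True)
state_of = dict(zip(order, ["NF", "MFI", "SFI"]))
states = [state_of[lab] for lab in labels]

# Random 80/20 split of patient indices into training and test sets.
indices = list(range(len(scores)))
rng.shuffle(indices)
cut = len(indices) * 4 // 5
train_idx, test_idx = indices[:cut], indices[cut:]
```

In a full workflow, the `states` list would serve as the target for a supervised classifier trained only on the `train_idx` rows, with `test_idx` held out for evaluation.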


Median age was 75.3 years; 62% of patients were female and 60% were White. Patients were classified as 53% NF (n = 3453), 30% MFI (n = 1947), and 17% SFI (n = 1084). Model performance for identifying each functional status state (NF, MFI, SFI), measured by the area under the receiver operating characteristic curve (AUROC), was 0.92, 0.89, and 0.87, respectively. Age, falls, hospitalization, home health use, labs (e.g., albumin), comorbidities (e.g., dementia, heart failure, chronic kidney disease, chronic pain), and social determinants of health (e.g., alcohol use) were highly ranked features in predicting functional status states.
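One common way to obtain a separate AUROC for each state of a three-class model is a one-vs-rest calculation, sketched below. The abstract does not say how the per-state values were computed, so treat this as an assumption. The six patients and their predicted probabilities are made up, and `auroc` is a hypothetical helper that uses the pairwise (Mann-Whitney) definition.

```python
def auroc(scores, is_positive):
    """Probability that a random positive outranks a random negative (ties count 0.5)."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical true states and predicted class probabilities for six patients.
y_true = ["NF", "NF", "MFI", "MFI", "SFI", "SFI"]
probs = [
    {"NF": 0.80, "MFI": 0.15, "SFI": 0.05},
    {"NF": 0.50, "MFI": 0.40, "SFI": 0.10},
    {"NF": 0.60, "MFI": 0.30, "SFI": 0.10},
    {"NF": 0.20, "MFI": 0.50, "SFI": 0.30},
    {"NF": 0.10, "MFI": 0.30, "SFI": 0.60},
    {"NF": 0.10, "MFI": 0.50, "SFI": 0.40},
]

# One-vs-rest: treat each state in turn as the positive class.
per_state_auroc = {
    state: auroc([p[state] for p in probs], [y == state for y in y_true])
    for state in ("NF", "MFI", "SFI")
}
```

Each entry of `per_state_auroc` answers the question "how well does the predicted probability for this state separate patients in that state from everyone else", which is why three values are reported rather than one.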


A machine learning algorithm run on EHR clinical data has potential utility for differentiating functional status in the clinical setting. Through further validation and refinement, such algorithms can complement traditional screening methods and result in a population-based strategy for identifying patients with poor functional status who need additional health resources.







Publication Info

Pavon, Juliessa M, Laura Previll, Myung Woo, Ricardo Henao, Mary Solomon, Ursula Rogers, Andrew Olson, Jonathan Fischer, et al. (2023). Machine learning functional impairment classification with electronic health record data. Journal of the American Geriatrics Society, 71(9). pp. 2822–2833. 10.1111/jgs.18383 Retrieved from

This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.



Juliessa Pavon

Associate Professor of Medicine

Laura Ann Previll

Assistant Consulting Professor in the Department of Medicine

Ricardo Henao

Associate Professor in Biostatistics & Bioinformatics

Christopher Leo

Medical Instructor in the Department of Medicine

Gerda G. Fillenbaum

Professor Emeritus in Psychiatry and Behavioral Sciences

Major focus is in three areas: epidemiology of dementia, assessment of functional status in the elderly, and pharmacoepidemiology.

With respect to the epidemiology of dementia I am running a major survey to determine the prevalence and incidence of Alzheimer's disease and other dementias in black and white elderly community residents. In addition to determining age-, sex-, and race-specific rates, risk factors for these conditions are being determined. Attempts are underway to improve diagnostic distinctions among the dementias; computer-based assessment, suitable for epidemiological surveys, is being developed. Subjects are being followed to better identify those who will develop dementia, and compare the course of disease in black and white elderly who are demented.

Interest in the functional status of the elderly includes developing assessments suitable for use in an illiterate population, studies to try to understand the relationship of functional status to selected demographic and health characteristics, and identifying measures valuable cross-nationally for use and distribution by the World Health Organization.

In pharmacoepidemiology, my major focus continues to be on identifying the characteristics (predisposing, need, enabling) that best explain use of prescription and nonprescription drugs, and change in their use, in the elderly. I am particularly interested in determining whether lesser use of prescription drugs by black elderly indicates underuse by this group, or whether white elderly are over-utilizing medications. The results of these studies have relevance for public health policy.


Helen Marie Hoenig

Professor of Medicine

1. General Focus and Goals of Research: Dr. Hoenig's research focuses on rehabilitation, and more specifically on assistive technology and teletechnology. Patient populations of interest include geriatric patients with diverse medical problems, including stroke and spinal and/or musculoskeletal disorders.

2. Specific Approaches or Techniques: Randomized controlled trials and epidemiological studies, including large database analyses and survey research. Clinical trials include studies of the effects of motorized scooters in persons with difficulty walking, methods for providing wheelchairs, and telerehabilitation for exercise & functional mobility training in the home. Epidemiological studies and survey research have examined use of assistive technology and other strategies for coping with disability.

4. Special areas of expertise/national recognition: Rehabilitation health services research, geriatric rehabilitation, assistive technology outcomes, telerehabilitation.

KEY WORDS/PHRASES: Rehabilitation, Process and Outcomes Research, Assistive Technology, Telehealth, Activities of Daily Living, Geriatrics, Disability.

Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.