Machine learning functional impairment classification with electronic health record data

Juliessa M. Pavon, Laura Previll, Myung Woo, Ricardo Henao, Mary Solomon, Ursula Rogers, Andrew Olson, Jonathan Fischer, Christopher Leo, Gerda Fillenbaum, Helen Hoenig, David Casarett

Research output: Contribution to journal › Article › peer-review


Background: Poor functional status is a key marker of morbidity, yet it is not routinely captured in clinical encounters. We developed a machine learning algorithm that leverages electronic health record (EHR) data to provide a scalable process for identifying functional impairment, and we evaluated its accuracy.

Methods: We identified a cohort of patients with an electronically captured screening measure of functional status (Older Americans Resources and Services ADL/IADL) between 2018 and 2020 (N = 6484). Patients were classified using unsupervised learning (k-means clustering and t-distributed stochastic neighbor embedding) into normal function (NF), mild to moderate functional impairment (MFI), and severe functional impairment (SFI) states. Using 11 EHR clinical variable domains (832 input features), we trained an Extreme Gradient Boosting supervised machine learning algorithm to distinguish functional status states and measured its prediction accuracy. Data were randomly split into training (80%) and test (20%) sets. SHapley Additive exPlanations (SHAP) feature importance analysis was used to rank the EHR features by their contribution to the outcome.

Results: Median age was 75.3 years; 62% of patients were female and 60% were White. Patients were classified as 53% NF (n = 3453), 30% MFI (n = 1947), and 17% SFI (n = 1084). The model identified the NF, MFI, and SFI states with areas under the receiver operating characteristic curve (AUROC) of 0.92, 0.89, and 0.87, respectively. Age, falls, hospitalization, home health use, labs (e.g., albumin), comorbidities (e.g., dementia, heart failure, chronic kidney disease, chronic pain), and social determinants of health (e.g., alcohol use) were highly ranked features in predicting functional status states.

Conclusion: A machine learning algorithm run on EHR clinical data has potential utility for differentiating functional status in the clinical setting. Through further validation and refinement, such algorithms can complement traditional screening methods and result in a population-based strategy for identifying patients with poor functional status who need additional health resources.
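The abstract reports one AUROC per functional status state, which is consistent with a one-vs-rest evaluation: each state is scored against the other two combined. The sketch below illustrates that calculation in plain Python under that assumption. It is not the authors' code; the function names (`auroc`, `one_vs_rest_auroc`) are hypothetical, and the six-patient toy data are synthetic values chosen only to exercise the functions, not study results.

```python
from typing import Dict, List, Sequence, Tuple


def auroc(scores: Sequence[float], is_positive: Sequence[bool]) -> float:
    """AUROC as the probability that a randomly chosen positive case
    receives a higher score than a randomly chosen negative case.
    Tied scores count as half a win (Mann-Whitney formulation)."""
    pos = [s for s, y in zip(scores, is_positive) if y]
    neg = [s for s, y in zip(scores, is_positive) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def one_vs_rest_auroc(
    probs: Sequence[Tuple[float, ...]],
    labels: Sequence[str],
    classes: Sequence[str],
) -> Dict[str, float]:
    """Compute one AUROC per class, treating that class as positive
    and every other class as negative."""
    result: Dict[str, float] = {}
    for k, cls in enumerate(classes):
        class_scores: List[float] = [row[k] for row in probs]
        result[cls] = auroc(class_scores, [y == cls for y in labels])
    return result


# The three states named in the abstract.
CLASSES = ("NF", "MFI", "SFI")

# Synthetic toy data (assumption, for illustration only): true state per
# patient and a classifier's predicted probabilities in CLASSES order.
toy_labels = ["NF", "NF", "MFI", "MFI", "SFI", "SFI"]
toy_probs = [
    (0.8, 0.1, 0.1),
    (0.5, 0.4, 0.1),
    (0.3, 0.5, 0.2),
    (0.6, 0.3, 0.1),
    (0.1, 0.3, 0.6),
    (0.2, 0.5, 0.3),
]

toy_auroc = one_vs_rest_auroc(toy_probs, toy_labels, CLASSES)
for state in CLASSES:
    print(f"{state}: AUROC = {toy_auroc[state]:.3f}")
```

The pairwise comparison is quadratic in cohort size, which is fine for a sketch; on a held-out test set of realistic size, a rank-based implementation from an established library would be the more practical choice.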
Original language: English (US)
Pages (from-to): 2822-2833
Number of pages: 12
Journal: Journal of the American Geriatrics Society
Issue number: 9
State: Published - Sep 1 2023
Externally published: Yes

