Computational Methods For Insights From EHR Data: Improving Provider Productivity And Applications To Patient Outcomes
Date of Award
Spring 1-1-2025
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Computational Biology and Bioinformatics
First Advisor
Melnick, Edward
Abstract
The increasing adoption of electronic health records (EHRs) has transformed healthcare delivery, yet inefficiencies in EHR use contribute to physician burnout and workflow disruptions. This dissertation explores how EHR audit log data, from individual sites to national-scale datasets, can serve as a resource for generating actionable insights to improve healthcare efficiency and physician productivity. It uses audit log data at both local and national levels to examine physician work patterns, productivity, and the implications of policy changes. At the local level, it explores variations in relative value units (RVUs) to assess how policy shifts influence gender-based compensation gaps. It also addresses the challenge of missing data in audit logs by evaluating the feasibility of different imputation strategies, including machine learning-based and naive approaches, to support health systems with varying computational resources. At the national scale, the study analyzes broader EHR utilization trends, identifying key factors that impact chart closure efficiency and patient volume. By systematically examining audit log data, this work provides data-driven insights to optimize clinical workflows and inform policy decisions.
Additionally, two real-life application case studies demonstrate the potential of machine learning in augmenting clinical decision support. In the first study, a predictive model for urinary tract infection (UTI) diagnosis outperformed real-world physician decisions across diverse international populations, highlighting AI’s ability to enhance diagnostic accuracy and reduce unnecessary antibiotic prescriptions. The second study applied machine learning techniques to classify acute exacerbations of chronic obstructive pulmonary disease (AECOPD) into clinically meaningful subgroups, revealing stable subphenotypes that could support precision medicine approaches.
While the findings underscore the value of EHR audit log data in improving healthcare operations and decision-making, this research also acknowledges the limitations of advanced machine learning methods, including biases introduced by data selection and interpretability challenges. To address these concerns, explanation and visualization tools such as SHAP (SHapley Additive exPlanations) are used to enhance model transparency. Future directions for this work include expanding the integration of AI-driven insights into clinical practice, refining predictive models for physician decision-making, and further exploring the impact of EHR system design on provider efficiency. This research lays the foundation for more sustainable, data-driven healthcare management by leveraging the power of EHR audit logs and machine learning.
Recommended Citation
Li, Huan, "Computational Methods For Insights From EHR Data: Improving Provider Productivity And Applications To Patient Outcomes" (2025). Yale Graduate School of Arts and Sciences Dissertations. 1777.
https://elischolar.library.yale.edu/gsas_dissertations/1777