Date of Award
January 2025
Document Type
Open Access Thesis
Degree Name
Medical Doctor (MD)
Department
Medicine
First Advisor
Dennis L. Shung
Abstract
The application of artificial intelligence (AI) in medicine has the potential to enhance clinical practice. One area of active investigation is clinical decision support systems (CDSS) that supplement physician decision-making. While AI-CDSS development has accelerated, little is known about how physicians and medical students interact with these tools in simulated clinical environments. This study investigates how teams of physicians and medical students interact with GutGPT, a large language model (LLM)-augmented CDSS designed to support the management of upper gastrointestinal bleeding (UGIB). GutGPT combines a machine learning risk-prediction model, trained on data from Yale New Haven Hospital and presented through an interactive dashboard, with a chatbot interface that provides recommendations based on American College of Gastroenterology UGIB guidelines. In this randomized controlled trial, 31 participants, grouped into teams of two to four, managed five UGIB cases in a high-fidelity simulation center. Teams were randomized to use GutGPT either with or without the chatbot interface. Data were collected from the System Usability Scale (SUS), chatbot query logs, and post-trial qualitative interviews. Quantitative data on prompt frequency and length were analyzed, and themes were extracted from the qualitative data using rapid analysis methods. Participants with prior chatbot experience found GutGPT intuitive, whereas users unfamiliar with LLMs struggled to formulate prompts and to understand the tool’s capabilities. Qualitative feedback highlighted the need for seamless electronic health record integration and identified transparency, particularly the inclusion of citations, as critical to trust in the system. Senior team members primarily used GutGPT to confirm decisions, while junior members focused on data collection and deferred to their seniors’ use of the tool. Prompt frequency and word count were higher in open-ended management scenarios than in structured risk scenarios (3.9 vs. 2.4 queries; 15.3 vs. 11.0 words per prompt). Across SUS domains, mean scores ranged from 0.35 to 0.75 on a scale from -2 (“strongly disagree”) to +2 (“strongly agree”). These findings underscore the importance of designing AI-CDSS that align with team structures, clinical tasks, and user experience levels. While GutGPT shows potential to support complex decision-making by providing tailored, guideline-based recommendations in natural language, its adoption depends on transparent data sourcing, usability improvements, and integration into established clinical workflows.
Recommended Citation
Rajashekar, Niroop, "Generative Artificial Intelligence In Clinical Decision Support - Quantitative And Qualitative Analyses" (2025). Yale Medicine Thesis Digital Library. 4349.
https://elischolar.library.yale.edu/ymtdl/4349
Comments
This is an Open Access Thesis.