Date of Award

January 2024

Document Type

Thesis

Degree Name

Medical Doctor (MD)

Department

Medicine

First Advisor

Ron A. Adelman

Abstract

Background: In the landscape of medical innovation, Large Language Models (LLMs) represent a groundbreaking advancement, holding substantial promise for reshaping healthcare practices. These models demonstrate remarkable proficiency in assimilating and processing extensive medical literature, facilitating rapid information retrieval. Nevertheless, when applied to the specialized domain of ophthalmology, LLMs encounter significant hurdles owing to the intricacies inherent in this field. Ophthalmology, characterized by the interplay of ocular anatomy, diverse disease manifestations, and nuanced diagnostic subtleties, poses considerable challenges for these generalized models. Compounding these challenges is the ophthalmological corpus itself, which remains notably narrower than the broader medical literature. Within this constrained informational realm, LLMs grapple with the scarcity of comprehensive, specialized data crucial for accurate analysis and nuanced decision-making. To mitigate these constraints, the integration of domain-specific ontologies emerges as a critical augmentation strategy. By enriching LLMs with these ontological frameworks, the models can improve performance despite the limitations of the constrained ophthalmological corpus. This augmentation enhances their interpretative capacities, enabling more precise analyses and more robust evidence attribution within the specialized domain of ophthalmology.

Methods: We use data acquired from the American Academy of Ophthalmology's EyeWiki online encyclopedia, Preferred Practice Pattern documentation, and tens of thousands of open-access abstracts from leading ophthalmology journals available through PubMed to construct a domain-specific corpus. Using the EyeWiki dataset, we apply the BioSentVec and scispaCy natural language processing models to construct vector embeddings and perform named entity recognition, creating a graph-based ontology of ophthalmic information. We then apply four graph-based algorithms to explore topic relationship structure: closeness centrality, betweenness centrality, PageRank, and the Hyperlink-Induced Topic Search (HITS) algorithm. We also employ Ure lexical density (ULD) to evaluate topic density. We use the complete dataset to construct a vector-index-based retrieval-augmented system using the LlamaIndex wrapper for OpenAI's GPT-3.5 API. We evaluate the performance of this model on the AAO's Ask an Ophthalmologist questions, using both automated evaluation of reference quality and human evaluation of response quality across accuracy, completeness, and factuality.

Results: When evaluating the ontology through connectivity metrics, we identify Oculoplastics/Orbit as the topic most interconnected with other subspecialties, while Refractive Management/Intervention is the most peripheral. Oncology/Pathology has the greatest lexical density of all topics (ULD 3.28), while Glaucoma has the lowest (ULD 2.04). When evaluating the retrieval-augmented model, we found that the rate of hallucinated citations fell from 48% and 42% on the old and new questions, respectively, with base ChatGPT 3.5 to 16% and 22% with the retrieval-augmented model. Additionally, the retrieval-augmented model achieved a higher score in evidence attribution than base GPT (2.5 vs 1.7, p < 0.001) while remaining non-inferior in accuracy (3.2 vs 3.5, p = 0.308) and completeness (3.2 vs 3.5, p = 0.215).

Conclusion: The integration of domain-specific ontologies into LLMs offers a promising avenue for enhancing their efficacy within the intricate domain of ophthalmology. We demonstrate this through the development of a graph-based ontology and a domain-specific retrieval-augmented large language model. Evaluation of our ontology highlighted Oculoplastics/Orbit as central and Refractive Management/Intervention as peripheral within the ophthalmological landscape. Oncology/Pathology emerged with the highest lexical density, indicative of its rich informational content. The integration of LlamaIndex with GPT-3.5 yielded substantial reductions in hallucinated citations and improved evidence attribution without compromising accuracy or completeness. This augmentation presents a transformative approach to bolstering LLMs' interpretative capacities within ophthalmology, promising more precise and robust decision support in healthcare applications.
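To make the entity-extraction step in the Methods concrete, the following is a minimal sketch of biomedical named entity recognition with scispaCy, whose tagged entities could serve as candidate nodes for a graph-based ontology. It assumes the scispaCy package and its en_core_sci_sm model are installed; the example sentence is a toy illustration, not drawn from the thesis corpus.

```python
# Sketch: tag clinical entities in a passage with scispaCy's biomedical pipeline.
# Assumes: pip install scispacy spacy, plus the en_core_sci_sm model package.
import spacy

nlp = spacy.load("en_core_sci_sm")  # scispaCy general biomedical model

text = ("Primary open-angle glaucoma is managed with topical prostaglandin "
        "analogs to lower intraocular pressure.")
doc = nlp(text)

# Each recognized entity is a candidate node (or node attribute) in the ontology graph.
for ent in doc.ents:
    print(ent.text, ent.label_)
```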
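The four connectivity analyses named in the Methods (closeness centrality, betweenness centrality, PageRank, and HITS) are standard graph algorithms; the sketch below shows how they could be computed with networkx over a toy topic graph. The edge list and topic names here are illustrative placeholders, not the actual ontology.

```python
# Sketch: rank subspecialty topics in a graph-based ontology by four connectivity measures.
import networkx as nx

# Toy ontology: nodes are ophthalmic topics, edges represent shared entities/links.
G = nx.Graph()
G.add_edges_from([
    ("Oculoplastics/Orbit", "Oncology/Pathology"),
    ("Oculoplastics/Orbit", "Glaucoma"),
    ("Oculoplastics/Orbit", "Cornea/External Disease"),
    ("Glaucoma", "Cornea/External Disease"),
    ("Refractive Management/Intervention", "Cornea/External Disease"),
])

closeness = nx.closeness_centrality(G)      # how near a topic is to all others
betweenness = nx.betweenness_centrality(G)  # how often a topic bridges other topics
pagerank = nx.pagerank(G)                   # link-analysis importance score
hubs, authorities = nx.hits(G)              # Hyperlink-Induced Topic Search (HITS)

for topic in G.nodes:
    print(f"{topic:36s} closeness={closeness[topic]:.2f} "
          f"betweenness={betweenness[topic]:.2f} "
          f"pagerank={pagerank[topic]:.2f} authority={authorities[topic]:.2f}")
```

In such an analysis, a topic with high closeness and betweenness (as reported for Oculoplastics/Orbit) sits near the center of the ontology, while one reachable mainly through a single neighbor (as reported for Refractive Management/Intervention) appears peripheral.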
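The retrieval-augmented system described in the Methods can be sketched with LlamaIndex's vector index over a local document folder, with GPT-3.5 as the generation backend. This is a minimal illustration, not the thesis implementation: the "corpus/" path, the similarity_top_k setting, and the example question are placeholders, an OPENAI_API_KEY is assumed in the environment, and the legacy (pre-0.10) llama_index import style is used; module paths differ in newer releases.

```python
# Sketch: vector-index-based retrieval-augmented QA over an ophthalmology corpus.
from llama_index import SimpleDirectoryReader, VectorStoreIndex

# Load the local corpus (e.g., EyeWiki pages, PPP documents, PubMed abstracts as text files).
documents = SimpleDirectoryReader("corpus/").load_data()

# Build vector embeddings and an in-memory index over the documents.
index = VectorStoreIndex.from_documents(documents)

# Retrieve the top-3 most similar passages and synthesize an answer
# (GPT-3.5 is the default LLM backend in this LlamaIndex version).
query_engine = index.as_query_engine(similarity_top_k=3)
response = query_engine.query("What are first-line treatments for open-angle glaucoma?")

print(response)  # grounded answer
# Retrieved source passages support evidence attribution and citation checking.
for node_with_score in response.source_nodes:
    print(node_with_score.node.metadata.get("file_name"), node_with_score.score)
```

Because each answer is tied to retrieved passages, citations can be checked against the source documents, which is the mechanism behind the reduced hallucinated-citation rate and improved evidence attribution reported in the Results.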

Comments

This thesis is restricted to Yale network users only. It will be made publicly available on 04/30/2025.
