Date of Award

January 2020

Document Type

Open Access Thesis

Degree Name

Master of Public Health (MPH)


School of Public Health

First Advisor

Richard A. Taylor

Second Advisor

Xinxin (Katie) Zhu


Diagnosis is a central aspect of emergency medicine. Reaching the correct diagnosis affects patient morbidity and mortality as well as healthcare expenditures. Medical decision making is driven by the process of working through the differential diagnosis. Once a capable Natural Language Processing (NLP) system is developed, one that includes a general characterization of differential diagnoses and their association with downstream testing, diagnostic error, and similar factors, we could automatically extract differential diagnoses from clinical notes, which would have a large impact on healthcare. The main purpose of this investigative study is to characterize differential diagnosis documentation within emergency provider notes and to develop an annotated corpus that can be used for downstream development of NLP applications. We conducted a retrospective analysis of emergency provider notes to identify, categorize, and extract information about differential diagnoses using manual annotation. We used a light annotation framework within the MATTER cycle and extracted information from our annotations of a random sample of 1545 medical records. We describe the demographic information and note that only 18.1% of patients were given a differential diagnosis by physicians. We examined factors that could lead to differences in differential diagnosis rates among patients, including age group, race and ethnicity, preferred language, acuity level, and major complaint. Within the differential diagnosis groups, we report evidence support and probability terms. We also examined the six most common complaints: cough, chest pain, shortness of breath, abdominal pain, back pain, and falls. The study's limitations include sample size and the accuracy of the annotations, among others.

