Date of Award

January 2018

Document Type

Thesis

Degree Name

Medical Doctor (MD)

Department

Medicine

First Advisor

Emily A. Wang

Abstract

USING ELECTRONIC HEALTH RECORDS TO IDENTIFY INCARCERATION HISTORY AMONG VETERANS.

Clara H. Kim, Jonathan R. Bates, Jessica B. Long, Farah A. Kidwai-Khan, Cynthia A. Brandt, Amy C. Justice, and Emily A. Wang. Department of Internal Medicine, Yale University School of Medicine, New Haven, CT.

Hypothesis: A machine learning (ML) algorithm using natural language processing will predict whether individuals had been recently incarcerated with greater accuracy than either a human reader or a keyword search.

Methods: A total of 450 subjects were randomly selected from the Veterans Aging Cohort Study (VACS), a prospective study based at eight Veterans Affairs hospitals. Three methods of detecting incarceration within the past year were compared against self-report: manual chart review, a search for ten topically relevant keywords, and a support vector machine (SVM) classifier that was trained and tested 20 times on randomly selected sets.
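As an illustration of the keyword-search arm and of scoring any method against self-report, a minimal Python sketch follows. The abstract does not list the ten keywords, so `EXAMPLE_KEYWORDS`, `flag_note`, and `confusion_counts` are hypothetical names and values chosen for this example only; they are not the study's implementation.

```python
import re
from typing import Dict, Iterable, Sequence

# Hypothetical keywords for illustration only; the abstract does not
# state which ten terms the study actually searched for.
EXAMPLE_KEYWORDS = ("jail", "prison", "incarcerated")


def flag_note(note_text: str, keywords: Iterable[str] = EXAMPLE_KEYWORDS) -> bool:
    """Return True if any keyword occurs as a whole word (case-insensitive)."""
    terms = [re.escape(k) for k in keywords]
    if not terms:
        return False
    pattern = r"\b(?:" + "|".join(terms) + r")\b"
    return re.search(pattern, note_text, flags=re.IGNORECASE) is not None


def confusion_counts(predicted: Sequence[bool],
                     self_reported: Sequence[bool]) -> Dict[str, int]:
    """Tally predictions against self-report, used here as the reference standard."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for pred, truth in zip(predicted, self_reported):
        if pred and truth:
            counts["tp"] += 1
        elif pred and not truth:
            counts["fp"] += 1
        elif not pred and truth:
            counts["fn"] += 1
        else:
            counts["tn"] += 1
    return counts
```

The whole-word match is a design choice of this sketch: it avoids flagging a note because a keyword appears inside a longer word, at the cost of missing inflected forms unless they are listed explicitly.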

Results: The F1 score was 0.400, 0.580, and 0.751, respectively, for manual chart review, keyword search, and SVM classifier. Specificity was 100%, 67.5%, and 95.9%, respectively. Recall (sensitivity) was 25.0%, 69.1%, and 63.5%, respectively.
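Because F1 is the harmonic mean of precision and recall, the precision implied by each reported F1 and recall pair can be recovered by rearranging F1 = 2PR / (P + R) to P = F1 · R / (2R − F1). The Python sketch below does this arithmetic; the function names are illustrative, and precision is not reported in the abstract itself. If the SVM figures are aggregates across the 20 runs, its implied precision is only approximate.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def implied_precision(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, given F1 and recall R."""
    denominator = 2 * recall - f1
    if denominator <= 0:
        raise ValueError("F1 must be less than twice the recall")
    return f1 * recall / denominator


# F1 and recall values as reported in the abstract.
reported = {
    "manual chart review": (0.400, 0.250),
    "keyword search": (0.580, 0.691),
    "SVM classifier": (0.751, 0.635),
}

for method, (f1, recall) in reported.items():
    print(f"{method}: implied precision = {implied_precision(f1, recall):.2f}")
```

Under these assumptions the implied precision is about 1.00 for manual chart review, 0.50 for keyword search, and 0.92 for the SVM classifier, which is consistent with the reported ordering of specificity.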

Conclusions: Recent incarceration can be detected by an SVM classifier with greater specificity than a keyword search. Both automated methods are much more sensitive than manual chart review. ML algorithms may thus aid in screening for incarceration history.

Comments

This thesis is restricted to Yale network users only. It will be made publicly available on 06/25/2100.
