Date of Award

January 2016

Document Type

Thesis

Degree Name

Medical Doctor (MD)

Department

Medicine

First Advisor

William Becker

Abstract

Therapy for certain medical conditions occurs in a stepwise fashion, where one medication is recommended as initial therapy and other medications follow. Sequential pattern mining is a data mining technique used to identify patterns of ordered events. We sought to determine whether sequential pattern mining is effective for identifying temporal relationships between medications and accurately predicting the next medication likely to be prescribed for a patient.

We obtained claims data from Blue Cross Blue Shield of Texas for patients prescribed at least one diabetes medication between 2008 and 2011, and divided these into a training set (90% of patients) and test set (10% of patients). We applied the CSPADE algorithm to mine sequential patterns of diabetes medication prescriptions both at the drug class and generic drug level and ranked them by the support statistic. We then evaluated the accuracy of predictions made for which diabetes medication a patient was likely to be prescribed next.

We identified 161,497 patients who had been prescribed at least one diabetes medication. We were able to mine stepwise patterns of pharmacological therapy that were consistent with guidelines. Within three attempts, we were able to predict the medication prescribed for 90.0% of patients when making predictions by drug class, and for 64.1% when making predictions at the generic drug level. These results were stable under 10-fold cross validation, ranging from 89.1% to 90.5% at the drug class level and 63.5% to 64.9% at the generic drug level. Using 1 or 2 items in the patient’s medication history led to more accurate predictions than not using any history, but using the entire history was sometimes worse.

Sequential pattern mining is an effective technique to identify temporal relationships between medications and can be used to predict next steps in a patient’s medication regimen. Accurate predictions can be made without using the patient’s entire medication history.

Comments

This thesis is restricted to Yale network users only. This thesis is permanently embargoed from public release.

Share

COinS