Date of Award
January 2022
Document Type
Thesis
Degree Name
Medical Doctor (MD)
Department
Medicine
First Advisor
Sanjay Aneja
Abstract
Deep learning models are powerful image classifiers and have demonstrated excellent performance on medical image datasets. However, one of their major limitations is that they can sometimes perform poorly on unseen datasets. The difference between model performance on seen and unseen data is known as the generalization gap. It is valuable to be able to predict the generalization gap before using a model on unseen or real-world data. We analyzed 1,696 scanned film mammograms from the Curated Breast Imaging Subset of Digital Database for Screening Mammography (CBIS-DDSM) and 3,306 lung nodule CT images from the Lung Image Database Consortium and Image Database Resource Initiative (LIDC-IDRI). Multiple VGG16 models with varying hyperparameters were trained to predict the presence of malignancy, and their performance on the training and validation datasets was used to calculate their respective generalization gaps. A margin signature was calculated at four evenly spaced layers and used, along with the training accuracy, in a linear regression to predict the generalization gap. Margin signatures combined with training accuracy predicted the generalization gap accurately: the adjusted R² was 0.914 for the models analyzing the breast mammogram dataset and 0.912 for the models analyzing the lung nodule CT dataset. This represents a promising method for predicting model performance in real-world clinical settings before implementation, and it could have significant implications for patient safety and aid regulatory approval.
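The two quantities the abstract reports, the generalization gap and the adjusted R² of the regression, can be illustrated with a minimal sketch. This is not the thesis code. The function names are hypothetical, the gap is assumed to be training accuracy minus validation accuracy, and the number of predictors (margin-signature features plus training accuracy) is left as a parameter because the abstract does not state it.

```python
def generalization_gap(train_acc, val_acc):
    """Gap between seen and unseen performance.

    Assumption: defined as training accuracy minus validation accuracy.
    """
    return train_acc - val_acc


def r_squared(y_true, y_pred):
    """Coefficient of determination for observed vs. predicted gaps."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def adjusted_r_squared(r2, n_samples, n_predictors):
    """Adjust R^2 for the number of predictors in the regression.

    n_samples is the number of trained models; n_predictors is the number
    of regression inputs (hypothetical here, not given in the abstract).
    """
    return 1 - (1 - r2) * (n_samples - 1) / (n_samples - n_predictors - 1)
```

Here each sample in the regression would correspond to one trained model, so `y_true` holds the observed gaps and `y_pred` the gaps predicted by the linear regression.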
Recommended Citation
Lee, Victor, "The Use Of Margin Distribution To Predict Generalization Gap In Deep Learning Models For Medical Imaging" (2022). Yale Medicine Thesis Digital Library. 4093.
https://elischolar.library.yale.edu/ymtdl/4093
Comments
This thesis is restricted to Yale network users only. This thesis is permanently embargoed from public release.