Date of Award

January 2025

Document Type

Thesis

Degree Name

Master of Public Health (MPH)

Department

School of Public Health

First Advisor

Hongyu Zhao

Abstract

Understanding the genetic basis of rare diseases remains a central challenge in human genomics, especially when conventional methods rely on case-control labels that may oversimplify complex clinical variation. In this thesis, we introduce SimiSKATO, a novel approach that integrates phenotype embeddings with rare variant association testing. Using PERADIGM-generated similarity scores to represent clinical resemblance between individuals, SimiSKATO treats these scores as continuous traits within the SKAT-O framework. By applying this method to rare diseases in the UK Biobank, we demonstrate that SimiSKATO not only replicates known gene-disease associations but also identifies biologically plausible novel candidates missed by other methods. This embedding-based framework expands the toolkit for rare variant studies and underscores the value of integrating rich clinical data into genetic discovery.
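The idea of treating a similarity score as a continuous trait in a SKAT-O style test can be illustrated with a minimal sketch. This is not the thesis's implementation. It assumes an intercept-only null model (no covariates), the Beta(1, 25) allele-frequency weights commonly used as the SKAT default, and a hypothetical `trait` vector standing in for PERADIGM similarity scores. The function name `skat_o_statistics` and its arguments are illustrative. The sketch computes only the test statistics Q_rho = (1 - rho) * Q_SKAT + rho * Q_burden; the p-value step, which requires a mixture of chi-square distributions and a correction for searching over rho, is omitted.

```python
def skat_o_statistics(genotypes, trait, rhos=(0.0, 0.5, 1.0)):
    """Return {rho: Q_rho} for one gene.

    genotypes: list of rows, one per individual, each a list of
               minor-allele counts (0, 1, or 2) for the gene's variants.
    trait:     one continuous value per individual (hypothetical stand-in
               for a phenotype similarity score).
    """
    n = len(trait)
    m = len(genotypes[0])

    # Null model: intercept only, so residuals are the centered trait.
    mean_trait = sum(trait) / n
    resid = [y - mean_trait for y in trait]

    weighted_scores = []
    for j in range(m):
        col = [genotypes[i][j] for i in range(n)]
        maf = sum(col) / (2 * n)
        # Beta(1, 25) density at the MAF: rarer variants get larger weights.
        weight = 25 * (1 - maf) ** 24
        # Per-variant score: genotype column times null-model residuals.
        score = sum(g * r for g, r in zip(col, resid))
        weighted_scores.append(weight * score)

    # SKAT sums squared scores; burden squares the summed scores.
    q_skat = sum(s ** 2 for s in weighted_scores)
    q_burden = sum(weighted_scores) ** 2

    # SKAT-O interpolates between the two over a grid of rho values.
    return {rho: (1 - rho) * q_skat + rho * q_burden for rho in rhos}
```

With rho = 0 the statistic reduces to SKAT, which remains sensitive when variants act in opposite directions. With rho = 1 it reduces to a burden test, where opposite-direction effects cancel.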

Comments

This thesis is restricted to Yale network users only. It will be made publicly available on 12/16/2025.
