Date of Award

1-1-2025

Document Type

Open Access Thesis

Degree Name

Master of Public Health (MPH)

Department

School of Public Health

First Advisor

Hongyu Zhao

Abstract

Understanding the genetic basis of rare diseases remains a central challenge in human genomics, especially when conventional methods rely on case-control labels that may oversimplify complex clinical variation. In this thesis, we introduce SimiSKATO, a novel approach that integrates phenotype embeddings with rare variant association testing. Using PERADIGM-generated similarity scores to represent clinical resemblance between individuals, SimiSKATO treats these scores as continuous traits within the SKAT-O framework. By applying this method to rare diseases in the UK Biobank, we demonstrate that SimiSKATO not only replicates known gene-disease associations but also identifies biologically plausible novel candidates missed by other methods. This embedding-based framework expands the toolkit for rare variant studies and underscores the value of integrating rich clinical data into genetic discovery.
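
The core idea in the abstract, using a phenotype-similarity score as a continuous trait in a SKAT-O style test, can be illustrated with a minimal Python sketch. This is not the thesis implementation. It assumes an intercept-only null model, flat variant weights, simulated values standing in for PERADIGM similarity scores, and a permutation p-value in place of the analytic SKAT-O p-value. All function and variable names are hypothetical.

```python
import numpy as np

def skat_o_statistics(y, G, weights, rhos):
    """Q_rho = (1 - rho) * Q_SKAT + rho * Q_burden for a continuous trait y."""
    resid = y - y.mean()                  # residuals under an intercept-only null (assumption)
    s = (G * weights).T @ resid           # weighted per-variant scores, shape (n_variants,)
    q_skat = np.sum(s ** 2)               # variance-component (SKAT) statistic
    q_burden = np.sum(s) ** 2             # burden statistic
    return np.array([(1 - r) * q_skat + r * q_burden for r in rhos])

def skat_o_permutation(y, G, weights, rhos=(0.0, 0.5, 1.0), n_perm=999, seed=0):
    """Permutation p-value for the minimum per-rho p-value (stand-in for analytic SKAT-O)."""
    rng = np.random.default_rng(seed)
    q_obs = skat_o_statistics(y, G, weights, rhos)
    q_perm = np.array([
        skat_o_statistics(rng.permutation(y), G, weights, rhos)
        for _ in range(n_perm)
    ])
    q_all = np.vstack([q_obs, q_perm])    # row 0 holds the observed statistics
    # Per-rho p-value of every row: fraction of rows whose Q is at least as large.
    p_all = (q_all[None, :, :] >= q_all[:, None, :]).mean(axis=1)
    min_p = p_all.min(axis=1)             # best p-value across rho, per row
    return float((min_p <= min_p[0]).mean())

# Simulated example data (placeholders, not UK Biobank or PERADIGM output).
rng = np.random.default_rng(42)
n_people, n_variants = 500, 10
G = rng.binomial(2, 0.02, size=(n_people, n_variants))    # rare-variant genotypes in one gene
similarity = 1.0 * G.sum(axis=1) + rng.normal(size=n_people)  # continuous trait with a planted signal
weights = np.ones(n_variants)                             # flat weights (assumption)

p_value = skat_o_permutation(similarity, G, weights)
print(f"permutation p-value: {p_value:.3f}")
```

Because the trait is continuous, individuals who resemble the disease phenotype without carrying a case label still contribute to the test, which is the motivation the abstract gives for moving beyond case-control labels. A production analysis would adjust for covariates in the null model and use an established SKAT-O implementation rather than permutations.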


Included in

Public Health Commons
