Genomic Analysis of Rare Human Diseases

Date of Award

Spring 2021

Document Type


Degree Name

Doctor of Philosophy (PhD)



First Advisor

Lifton, Richard


Many rare human diseases have a strong genetic basis. Rare germline and somatic mutations confer risk for Mendelian diseases and cancer, respectively. The development of next-generation sequencing and genomic analysis approaches facilitates the characterization of rare variants that cause human disease. This thesis describes genomic approaches for identifying the underlying genetic factors in three rare human diseases, each illustrating a distinct genetic mechanism: congenital heart disease caused by recessive genotypes, congenital hydrocephalus caused by dominant genotypes, and uterine leiomyosarcoma caused by somatic mutations.

Congenital heart disease (CHD) is a structural malformation of the heart and great vessels present at birth. Though causal genetic determinants have been identified in ~34% of CHD patients, the etiology of 56% of the cases remains unknown. Recessive inheritance has been implicated in CHD through human cohort studies and mouse mutagenesis screens. In this thesis, I analyzed whole-exome sequencing (WES) data from 5,497 CHD patients, searching for a significant burden of damaging recessive genotypes (RGs) in individual genes. The analysis identified significant enrichment of rare damaging RGs in known recessive human CHD genes. Genome-wide burden analysis revealed four significant genes, establishing PLD1 as a bona fide risk gene for right-sided valvular defects and C1orf127 as a novel risk gene for laterality-associated defects. Exonic damaging RGs were estimated to account for ~2% of cases in the CHD cohort and were especially likely to lead to severe CHD subtypes.

In another study, I analyzed congenital hydrocephalus (CH), which is characterized by excessive cerebrospinal fluid in the brain and enlargement of the brain ventricles. The poor post-surgical outcomes of this disease highlight our limited knowledge of its mechanisms. Through WES of 381 sporadic CH patients, damaging de novo mutations were identified in more than 17% of cases, far exceeding expectation, with a significant burden of de novo mutations in five genes: TRIM71, PTEN, SMARCC1, FOXJ1, and PIK3CA. Overall, rare damaging de novo, dominant, and recessive mutations with large effect contributed to ~22% of CH cases. Many candidate genes are regulators of neural stem cell development, which suggests that disruption of prenatal neuro-gliogenesis is an essential pathomechanism of sporadic CH.

Uterine leiomyosarcomas (uLMS) are aggressive tumors arising from the smooth muscle layer of the uterus. Information regarding the genetic landscape of uLMS and targeted therapies for it is currently limited. Next-generation sequencing data from 83 subjects with uLMS (56 from Yale and 27 from TCGA) were analyzed to describe the mutational landscape of the disease. Clinically actionable signatures of homologous recombination deficiency were identified in 25% of the tumors. Four genes harbored a significant somatic mutation burden, including a novel uLMS driver, MEN1. Significant copy gains were characterized at 5p15.33 (TERT), 8q24.21 (C-MYC), and 17p11.2 (MYOCD, MAP2K4). Frequent fusions and complex structural variations disrupted tumor suppressors in a large proportion of the uLMS cases. These findings reveal the complex genetic architecture of uLMS and suggest targets for potential therapies.
