Date of Award

January 2024

Document Type

Open Access Thesis

Degree Name

Medical Doctor (MD)



First Advisor

Silvia Vilarinho


Chronic liver disease (CLD) is a major global health problem, leading to an estimated two million annual deaths worldwide. Advances in next-generation sequencing (NGS) technologies have revolutionized our ability to diagnose disease and investigate its pathogenesis. Our research group has previously shown that genomic analysis is useful in the diagnosis of patients with unexplained CLD, providing a diagnostic yield of up to 30%. Moreover, this approach has revealed a newly described cause of non-cirrhotic portal hypertension and nodular regenerative hyperplasia due to GIMAP5 deficiency, which is recapitulated in a mouse model. Additionally, the growing use of single-cell RNA sequencing (scRNA-seq) technologies has allowed for unprecedented insights into the cellular and molecular mechanisms of disease. Thus, we hypothesize that (i) the incorporation of state-of-the-art tools into our whole exome sequencing (WES) pipeline will improve the diagnostic yield in patients with unexplained liver disease, and (ii) the use of scRNA-seq technologies in mouse model(s) of recently uncovered genetic liver diseases will provide novel insights into disease pathogenesis.

We aim (1) to develop a WES analysis pipeline with state-of-the-art analytical tools to improve gene discovery and diagnosis in liver disease, and (2) to employ scRNA-seq to advance our understanding of Gimap5-related liver disease. In Aim 1, we built a WES pipeline incorporating alignment to the human pangenome reference, along with updated variant calling and annotation tools. We evaluated it comprehensively alongside our standard WES pipeline using a cohort of patients with unexplained liver disease. In Aim 2, we performed comprehensive scRNA-seq analyses of hepatocytes isolated from wild-type (WT) and Gimap5-deficient mice.

We successfully developed a WES pipeline that incorporates a recently published, ancestrally diverse human pangenome reference and an improved deep-learning variant calling algorithm, DeepVariant. We further incorporated variant annotation with novel pathogenicity prediction tools, and with gene expression across all liver cell types obtained from our group’s liver cell atlas, to improve the identification of potentially liver disease-causing variants. We found that our new pipeline recapitulated a diverse set of disease-causing variants previously identified with our standard pipeline. Furthermore, our re-analysis identified a genetic diagnosis in one patient with previously unexplained disease. In parallel, we used scRNA-seq to investigate the liver pathology underlying Gimap5 deficiency, identifying major transcriptional differences between WT and Gimap5-deficient mouse hepatocytes. Gimap5-deficient hepatocytes showed a global decrease in metabolic gene expression, with a reduction in the expression of downstream targets of Wnt/beta-catenin signaling, supporting a dysregulated endothelial cell-to-hepatocyte signaling circuit. Using previously published scRNA-seq datasets, we found that hepatocytes from adult Gimap5-deficient mice had a transcriptional profile resembling both immature neonatal hepatocytes and hepatocytes 48 hours post-partial hepatectomy. Furthermore, in Gimap5-deficient livers, we identified a population of hepatocytes likely arising from cholangiocyte-to-hepatocyte transdifferentiation.

Collectively, this work highlights the value of employing modern genomic analysis tools in improving the diagnosis and molecular understanding of undiagnosed disease, and illustrates the utility of studying rare genetic liver diseases in humans and mice as a pathway towards new insights into liver biology, both in health and in disease.


This is an Open Access Thesis.
