Date of Award

January 2022

Document Type


Degree Name

Medical Doctor (MD)



First Advisor

Jeffrey P. Townsend


Background: Many attempts have been made to characterize the driver genes and mutations responsible for prostate cancer tumorigenesis. We quantified the cancer effect size, a direct measurement of the survival advantage a mutation confers, for somatic variants in 2699 primary and metastatic prostate tumor samples. Our measure of cancer effect treats tumorigenesis as an evolutionary process subject to positive and negative selective pressures. We applied this metric in a stage-specific manner to elucidate which mutations are selected for as prostate cancer develops.
Methods: We analyzed 2699 prostate cancer tumor exomes, genomes, and panel sequences (1648 primary tumors and 1051 metastatic samples). Gleason grade groups were used to further divide the primary tumors into lower-risk (I/II) and higher-risk (III/IV/V) groups. The deconstructSigs, dNdScv, and cancereffectsizeR packages were used for extraction of mutational signatures, calculation of gene mutation rates, and calculation of cancer effect sizes of somatic variants, respectively. Furthermore, using a model of pairwise epistasis, we investigated pairs of genes, observing the effect that the presence or absence of a mutation in one gene has on selection for mutations in the paired gene.
Results: The lower-risk, higher-risk, and metastatic tumors showed very similar mutational signatures, with the underlying gene mutation rates generally increasing from lower-risk tumors to higher-risk tumors to metastatic tumors. However, the genes and somatic variants that were most strongly selected for within each cohort were notably different. A distinct set of genes featured mutations that were selected for in the metastatic samples. Pairwise epistasis analysis suggested an early role for SPOP in the development of prostate cancer: SPOP mutations increase the selection for mutations in several other tumor suppressors and oncogenes.
Conclusion: Application of cancer effect size analysis highlights which mutations and genes are selected for leading up to each stage of prostate cancer, emphasizing the genetic differences among lower-risk, higher-risk, and metastatic prostate cancer. By incorporating pairwise epistasis analysis, we are able to support previously reported gene-gene interactions and to hypothesize novel ones occurring during tumorigenesis.


This thesis is restricted to Yale network users only. It will be made publicly available on 06/29/2023.