Date of Award

Fall 1-1-2025

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Molecular Biophysics and Biochemistry

First Advisor

Simon, Matthew

Abstract

Gene expression levels are a product of RNA synthesis and degradation kinetics. In addition, the bulk rates of RNA synthesis and decay are a function of the rates of several highly regulated biochemical steps, including transcription initiation, productive elongation, RNA processing, export from the nucleus, and degradation in either the cytoplasm or nucleus. When gene expression levels need to be regulated, cells can tune any combination of these kinetic parameters to achieve the desired change. If one only studies the levels of mature RNA and their fluctuations, one misses the intricate kinetic regulation underlying gene expression. Because of this, many methods have been developed to assay the kinetics of various steps in the RNA life cycle. Of particular relevance to my work is a class of methods referred to as nucleotide recoding RNA-seq (NR-seq; SLAM-seq, TimeLapse-seq, TUC-seq, etc.). These methods provide kinetic insights by combining metabolic labeling with unique chemistries that allow for bioinformatic quantification of labeled and unlabeled RNA. Making full use of the expanding universe of NR-seq methods requires robust bioinformatic tools capable of deriving accurate biological insights from this rich data. Here I describe my work developing a suite of software tools and statistical methods to facilitate novel biological investigations using NR-seq data. After introducing NR-seq, I describe bakR, an R package that was, at the time, the first software tool designed to perform well-powered comparative analyses of RNA synthesis and degradation kinetics using NR-seq. I then discuss my work significantly expanding and improving the theoretical and computational framework established in bakR. This work culminated in the development of the EZbakR suite, a Snakemake pipeline (fastq2EZbakR) and R package (EZbakR) that together allow researchers to thoroughly and rigorously analyze a wide array of NR-seq modalities.
I then show how this framework facilitated isoform-level analyses of RNA synthesis and degradation kinetics, and how I was able to help our collaborators use these tools to deepen our understanding of nonsense-mediated decay (NMD) and its regulation. Finally, I describe how I was able to use the EZbakR suite to reprocess published NR-seq data and compile a first-of-its-kind database of high-quality RNA half-life estimates across 11 human cell lines. This resource, known as the RNAdecayCafe, allowed me to reveal that RNA stability regulation plays an underappreciated role in fine-tuning gene expression. Altogether, my work helped establish a robust and powerful bioinformatic framework for making full use of bulk NR-seq data, a framework that promises to help deepen our understanding of gene expression regulation.
