Date of Award
Fall 1-1-2025
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Molecular, Cellular, and Developmental Biology
First Advisor
Isaacs, Farren
Abstract
Diverse microorganisms have delivered unique solutions in medicine, food, energy, and nutrient cycling through atmospheric gas fixation. Yet despite tremendous progress in model organisms like E. coli and S. cerevisiae, tools for editing and predictably manipulating undomesticated species – which comprise the vast majority of microbes – remain scarce. Cyanobacteria, with their light- and CO₂-powered metabolism, hold particular promise for bioproduction, bioremediation, and in situ fertilization, but have been difficult to engineer. Furthermore, many of the most valuable applications of cyanobacteria would involve implementation in an open system. Such strains require two critical safeguards: biocontainment, to ensure that their propagation is well controlled, and genetic isolation, to mitigate disruption of their genetic programs by external vectors and to keep their synthetic sequences from altering other environmental entities. Whole-genome recoding offers a solution to both biocontainment and genetic isolation, but requires genome-scale technologies to construct predictable synthetic genomes. In this work, I develop strategies to design, construct, and diversify synthetic genomes, implementing these strategies in the cyanobacterium Synechococcus elongatus UTEX 2973. To design a recoded cyanobacterial genome (Chapter 2), I leveraged a codon language model. This model has been trained to learn features of coding sequences and can predict codons in any sequence context, where predictions are proxies for sequences that appear “natural” to the model. To construct this genome in a cyanobacterial host, I then developed a scheme, CRISPR/HDR-Integrase-Mediated Engineering (CHIME), for implementation of the recoded genome. CRISPR/Cas systems are used to introduce integrase att sites into nonconsecutive regions of the genome; along with the att sites, these systems deliver 10 kb or more of recoded genome. 
The region between these sites is then replaced with another synthetic recoded sequence using integrase-mediated cassette exchange (IMCE). With this strategy in place, I began by determining the best CRISPR/Cas system design to use for delivery of att sites and recoding of adjacent regions (Chapter 3). The general strategy relies on conjugation of a plasmid bearing a Cas effector along with its cognate gRNA to introduce the synthetic template, also encoded on the plasmid, at the target region. I first tested combinations of Cas effector (Cas9 or Cas12) and replication origin to determine the optimal delivery strategy. I then demonstrated delivery of 12- and 26-kb synthetic regions for editing two distinct regions of the genome, testing multiple Cas effectors, schemes for guide design, and approaches for selection of edited clones. Both Cas effectors generally performed comparably, and a multi-guide design mitigated the risk associated with a low-efficiency guide. I next established integrase-mediated delivery of a 49-kb region between the introduced att sites (Chapter 4). I designed a vector for IMCE with an integrase and a counterselectable marker to ensure that selected clones would not also carry integrated plasmid. I demonstrated its efficacy, with 11% of clones fully recoded, and showed that the method could deliver a library of variants to a single cyanobacterial recipient strain, allowing identification of the most fit clones. Finally, I demonstrated the feasibility of library delivery using CRISPR/Cas systems (Chapter 5); the ability to perform targeted diversification of codons allows identification of optimal genome designs and illuminates factors that determine the fitness of different recoding schemes. I diversified all ATA (isoleucine) codons across a 20-kb region, to sample all synonyms, and delivered the diversified library to UTEX 2973. I then examined fitness of codons across the generated populations. 
The different synonyms appeared to have no detectable effect on cellular fitness, although this may be due to limitations of the experimental setup. Overall, this work presents a rapid, broad-host-range genome design and engineering platform and uses it to recode five codons across 80 kb (3%) of the Synechococcus elongatus UTEX 2973 genome. I also demonstrate the ability to diversify genomes with the rewriting modalities employed, which allows parallel testing of many genotypes to more efficiently sample vast genomic landscapes and identify genotypes with optimal phenotypes. This work establishes the feasibility of genome-scale engineering and whole-genome recoding in a non-model organism, enabling engineering of biocontained, genetically isolated cyanobacteria for open-systems applications.
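The codon diversification step summarized in the abstract (sampling synonyms at every ATA isoleucine codon) can be illustrated with a minimal sketch. This is not the dissertation's pipeline: the function names `diversify_codon` and `make_library`, the uniform random sampling, and the choice to keep ATA itself among the sampled synonyms are all assumptions made for illustration. The only fact relied on is that isoleucine is encoded by ATT, ATC, and ATA in the standard and bacterial genetic codes.

```python
import random

# Isoleucine codons in the standard/bacterial genetic code.
ILE_CODONS = ("ATT", "ATC", "ATA")


def diversify_codon(cds, target="ATA", synonyms=ILE_CODONS, rng=None):
    """Return a copy of `cds` with each in-frame `target` codon replaced by a
    randomly chosen synonym (hypothetical helper; uniform sampling is assumed)."""
    if len(cds) % 3 != 0:
        raise ValueError("coding sequence length must be a multiple of 3")
    rng = rng or random.Random()
    # Split into in-frame codons so out-of-frame 'ATA' substrings are untouched.
    codons = [cds[i:i + 3] for i in range(0, len(cds), 3)]
    return "".join(rng.choice(synonyms) if c == target else c for c in codons)


def make_library(cds, size, seed=0):
    """Generate `size` diversified variants of one coding sequence
    (hypothetical helper; a fixed seed keeps the sketch reproducible)."""
    rng = random.Random(seed)
    return [diversify_codon(cds, rng=rng) for _ in range(size)]


if __name__ == "__main__":
    toy_cds = "ATGATAGCAATATAA"  # toy sequence: Met-Ile-Ala-Ile-stop
    for variant in make_library(toy_cds, size=5):
        print(variant)
```

Every variant encodes the same protein as the input, since only synonymous isoleucine codons are exchanged; a real design would additionally need to handle overlapping genes and regulatory motifs, which this sketch ignores.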
Recommended Citation
Quinto, Laura Beth, "Development of Genome-Scale Technologies for Writing and Recoding Cyanobacteria" (2025). Yale Graduate School of Arts and Sciences Dissertations. 1910.
https://elischolar.library.yale.edu/gsas_dissertations/1910