Date of Award

Fall 1-1-2025

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Chemistry

First Advisor

Batista, Victor

Abstract

Automatically designing efficient synthetic routes remains a central challenge in chemistry. This thesis presents machine learning and algorithmic strategies for retrosynthetic planning that aim to improve both efficiency and practical usability over previous approaches. We first introduce the site-specific template (SST) generation framework (Chapter 2), which leverages transformer-based models for template generation and reaction center labeling to enable targeted single-step disconnections. We then develop DirectMultiStep (DMS) (Chapter 3), a multistep-first model that generates complete synthetic routes directly from product molecules, avoiding recursive search and achieving strong performance on benchmark datasets. In parallel, we propose FragmentRetro (Chapter 4), an efficient, stock-aware, bottom-up algorithm that formulates retrosynthesis as a fragment recombination problem with favorable scaling for complex molecules. We compare the computational complexity and modeling trade-offs of these approaches (Chapter 5) and conclude with broader research contributions beyond retrosynthesis (Chapter 6), including projects on other synthesis planning components, protein–ligand affinity modeling, and generative molecular design. Collectively, these methods demonstrate new directions for building fast, flexible, and effective tools for retrosynthetic planning.
