"Learning Embeddings from Adaptive Immune Receptor Sequences" by Meng Wang

Date of Award

Spring 2024

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computational Biology and Bioinformatics

First Advisor

Kleinstein, Steven

Abstract

Recent advancements in high-throughput sequencing have enhanced our ability to probe adaptive immune receptor repertoires (AIRR), revealing the immense diversity of B cell receptors (BCRs) and T cell receptors (TCRs) underlying the complexity of adaptive immunity. Despite the abundance of publicly available AIRR sequencing (AIRR-seq) data, extracting meaningful biological insights from these vast datasets remains a crucial challenge. This necessitates the development of computational methods capable of discerning immune response-associated subsets within the repertoire. In this dissertation, we address this need by applying machine learning techniques to AIRR-seq datasets to identify immune response-associated subsets of B cells and T cells. We began by benchmarking the performance of BCR sequence embeddings on supervised prediction of receptor properties, showcasing their efficacy in predicting specificity to viral antigens. Further improvement in antigen specificity prediction was achieved by supervised fine-tuning of antibody language models. We demonstrated that fine-tuned classifiers were effective in capturing repertoire changes following vaccination, underscoring their potential in immune repertoire prediction tasks. Next, we carried out two integrative analyses of AIRR-seq data with high-throughput single-cell RNA sequencing in vaccination and infectious disease settings to find immune response-associated subsets. We examined the age-dependent dynamics of B cell responses to inactivated influenza vaccination, employing single-cell transcriptomic and BCR profiling to unravel quantitative and qualitative differences in B cell populations. This investigation provided insights into the impact of age on influenza vaccine responsiveness, contributing to a more comprehensive understanding of vaccine-induced immune responses.
Finally, we turned our attention to the persistence of HIV-1-infected T cell clones in the central nervous system in patients undergoing antiretroviral therapy. By employing single-cell transcriptomic and TCR profiling, we elucidated the shared characteristics of infected T cell clones across cerebrospinal fluid and blood, offering insights into the dynamics of HIV persistence and its implications for therapy. The studies presented in this dissertation shed light on the immune response through the lens of immune repertoire analysis. By utilizing embedding techniques and integrating gene expression data within a machine learning framework, we advance our understanding of antibody specificity and the adaptive immune response. These findings hold significant implications for the development of more effective tools for antibody analysis and discovery.

This document is currently not available here.
