Date of Award

Spring 2025

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Computer Science

First Advisor

Karbasi, Amin

Abstract

Modern machine learning (ML) systems operate in settings where classical theoretical analysis, which relies on worst-case bounds, often fails to provide meaningful guarantees. Neural networks and large language models are trained with significantly more parameters than training examples, yet they generalize well in practice, contrary to what worst-case theory predicts. Additionally, ML algorithms are increasingly deployed in high-stakes applications, raising concerns about replicability and, more broadly, their responsible use. This thesis makes progress on two fundamental challenges in modern ML: (i) developing generalization guarantees beyond worst-case analysis, and (ii) designing algorithms with provable replicability guarantees. In the first part of the thesis, we study the learning curves of algorithms across various problems, including interactive learning, multiclass classification, regression, and language generation, using the universal rates framework of Bousquet, Hanneke, Moran, van Handel, and Yehudayoff (2021). This framework aligns more closely with how practitioners evaluate learning algorithms than the worst-case Probably Approximately Correct (PAC) framework of Valiant (1984). Our results provide theoretical guarantees for several learning tasks that are provably non-learnable under PAC learning, offering a more fine-grained understanding of generalization. As part of this study, we initiate the statistical analysis of language generation, with or without breadth, and propose several definitions capturing the notion of breadth in language models. Beyond generalization, another challenge in modern ML is ensuring that results remain consistent across independent executions, particularly in high-stakes applications. In the second part of the thesis, we study the replicability crisis in ML and the broader scientific community.
Building on the work of Impagliazzo, Lei, Pitassi, and Sorrell (2022), we propose an alternative definition of replicability, termed Total Variation (TV) indistinguishability, and establish its statistical equivalence to replicability and other stability notions such as differential privacy. Additionally, we initiate the study of replicability in sequential learning settings, including multi-armed bandits and reinforcement learning, as well as unsupervised learning tasks such as clustering. Finally, we develop replicable learning algorithms with improved sample complexity and computational complexity for the fundamental problem of learning large-margin halfspaces.
