Date of Award
Spring 2025
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Engineering and Applied Science
First Advisor
Panda, Priya
Abstract
The widespread integration of Artificial Intelligence (AI) into everyday life is driving a paradigm shift towards deploying machine learning models on a massive, distributed ecosystem of edge devices. While this decentralization opens opportunities for data-driven applications, it introduces a fundamental trilemma: the concurrent pursuit of high performance, operational efficiency, and stringent data privacy. State-of-the-art AI models, particularly Large Language Models (LLMs), demand computational and memory resources that far exceed the capabilities of typical edge devices, creating significant bottlenecks in both training and inference. Furthermore, the distributed nature of data gives rise to statistical heterogeneity and communication overhead, while the sensitivity of user data necessitates robust privacy-preserving mechanisms.

This dissertation posits that overcoming these multifaceted challenges requires a holistic, cross-stack research methodology spanning hardware, algorithms, architectures, and models. It presents a portfolio of six novel contributions that collectively form a comprehensive framework for building practical, high-performance AI systems on resource-constrained, heterogeneous, and privacy-sensitive edge devices.

The first set of contributions addresses training efficiency. DC-NAS is a divide-and-conquer federated Neural Architecture Search method that accelerates architecture discovery. FedSNN shows that Spiking Neural Networks (SNNs) are a scalable and energy-efficient alternative to traditional models. At the hardware level, HaLo-FL is a hardware-aware federated learning framework that co-designs the training process, using adaptive low-precision techniques to match the specific constraints of diverse clients.

Next, this work turns to fine-tuning LLMs. FedPEFT is a specialized framework that reduces communication and training complexity for federated transformer models while maintaining accuracy. For privacy-preserving fine-tuning, ECLIPSE introduces a split-compute architecture that combines on-device private components with a cloud-based backbone under differential privacy. Finally, to optimize inference, FSD presents a fast speculative decoding framework for edge-cloud systems that reduces latency by maximizing client-server parallelism.

Collectively, these contributions offer a multi-layered set of solutions that advance distributed machine learning and chart a clear path towards efficient, scalable, and private AI on edge devices.
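The sketches below illustrate, in simplified Python, some of the primitives the abstract's contributions build on; they are illustrative reconstructions, not code from the dissertation. First, the leaky integrate-and-fire (LIF) dynamics underlying the SNNs studied in FedSNN. The leak factor and threshold here are assumed values, not FedSNN's hyperparameters:

import torch

def lif_step(v, x, tau=0.9, v_th=1.0):
    # v: membrane potential carried over from the previous timestep
    # x: weighted input current at this timestep
    # tau (leak) and v_th (threshold) are illustrative constants
    v = tau * v + x                      # leaky integration
    spike = (v >= v_th).float()          # emit a binary spike at threshold
    v = v - spike * v_th                 # soft reset where a spike fired
    return v, spike

# One neuron population over T timesteps; the energy advantage comes
# from sparse, binary spike activity replacing dense activations.
v = torch.zeros(128)
for x in torch.randn(10, 128):           # T=10 illustrative input currents
    v, spikes = lif_step(v, x)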
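HaLo-FL's adaptive low-precision training can be pictured with uniform fake quantization, where each client trains at a bitwidth matched to its hardware budget. This is a generic sketch of the mechanism, not HaLo-FL's precise scheme, and the bitwidths are illustrative:

import torch

def fake_quantize(w, bits):
    # Uniform symmetric fake quantization of a tensor to `bits` bits.
    qmax = 2 ** (bits - 1) - 1
    scale = w.abs().max().clamp(min=1e-8) / qmax
    return torch.round(w / scale).clamp(-qmax, qmax) * scale

# A capable client might train at 8 bits while a constrained one uses 4:
w = torch.randn(64, 64)
w_strong_client = fake_quantize(w, bits=8)
w_weak_client = fake_quantize(w, bits=4)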
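The communication savings in FedPEFT come from exchanging only a small set of tunable parameters rather than full model weights. A minimal sketch of that pattern, assuming LoRA-style adapter keys and uniform federated averaging (the key-matching interface is hypothetical, not FedPEFT's actual API):

import torch

def client_upload(model, trainable_keys):
    # Upload only parameters whose names match the tunable set
    # (e.g., adapters or biases); everything else stays on the device.
    return {k: v.detach().clone()
            for k, v in model.state_dict().items()
            if any(t in k for t in trainable_keys)}

def server_average(client_updates):
    # FedAvg restricted to the uploaded PEFT parameters
    # (uniform client weights for simplicity).
    return {k: torch.stack([u[k] for u in client_updates]).mean(dim=0)
            for k in client_updates[0]}

Because only the small adapter tensors cross the network, per-round communication scales with the adapter size rather than the full transformer.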
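ECLIPSE's privacy guarantee rests on differential privacy for the on-device private components. The standard DP-SGD-style step below (clip each example's gradient, average, add Gaussian noise) shows the generic mechanism; the clipping norm and noise multiplier are illustrative, not ECLIPSE's settings:

import torch

def dp_noisy_grad(per_example_grads, clip_norm=1.0, sigma=0.8):
    # per_example_grads: (batch, ...) gradients, one slice per example.
    # clip_norm and sigma are illustrative privacy parameters.
    b = per_example_grads.shape[0]
    flat = per_example_grads.reshape(b, -1)
    norms = flat.norm(dim=1, keepdim=True).clamp(min=1e-8)
    clipped = flat * (clip_norm / norms).clamp(max=1.0)       # per-example clipping
    mean = clipped.mean(dim=0)
    noise = torch.randn_like(mean) * (sigma * clip_norm / b)  # Gaussian mechanism
    return (mean + noise).reshape(per_example_grads.shape[1:])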
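Finally, FSD builds on speculative decoding: a small on-device draft model proposes several tokens, and the large server-side model verifies them in a single parallel pass. Below is a greedy, batch-size-1 sketch of that primitive, with draft, target, and k as assumed interfaces; FSD's client-server scheduling is more elaborate than this:

import torch

@torch.no_grad()
def speculative_step(draft, target, ids, k=4):
    # `draft` and `target` map (batch, seq) token ids to
    # (batch, seq, vocab) logits; batch size 1 is assumed.
    n = ids.shape[1]
    proposal = ids
    for _ in range(k):                                   # cheap drafting on device
        nxt = draft(proposal)[:, -1].argmax(-1, keepdim=True)
        proposal = torch.cat([proposal, nxt], dim=-1)
    logits = target(proposal)                            # one parallel server pass
    preds = logits[:, n - 1:].argmax(-1)                 # target's pick per new slot
    drafted = proposal[:, n:]
    match = (preds[:, :k] == drafted).long()
    agree = int(match.cumprod(dim=-1).sum())             # length of accepted prefix
    # Keep the accepted prefix plus the target's token at the first
    # mismatch (or its bonus token when all k drafts were accepted).
    return torch.cat([proposal[:, :n + agree], preds[:, agree:agree + 1]], dim=-1)

Each call advances the sequence by at least one verified token, and by up to k + 1 when the draft agrees with the target, which is where the latency reduction comes from.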
Recommended Citation
Venkatesha, Yeshwanth, "Resource-Aware Distributed Machine Learning: Unified Approaches for Private and Efficient On-Device Intelligence" (2025). Yale Graduate School of Arts and Sciences Dissertations. 1546.
https://elischolar.library.yale.edu/gsas_dissertations/1546