Non-Ideality-Aware In-Memory Computing for Efficient Machine Intelligence

Date of Award

Spring 2025

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Department

Electrical Engineering (ENAS)

First Advisor

Panda, Priyadarshini

Abstract

The growing computational demands of deep learning algorithms, particularly on conventional von Neumann accelerators, have led to increasing energy inefficiency and memory-bandwidth bottlenecks. In-Memory Computing (IMC) based on analog crossbar platforms presents a promising alternative: by performing computation directly within memory arrays, it reduces the "memory-wall" bottleneck. However, analog IMC platforms suffer from significant device- and circuit-level non-idealities that can degrade the inference accuracy of Deep Neural Networks (DNNs). This dissertation presents a comprehensive set of techniques to enable robust and energy-efficient inference on non-ideal IMC hardware, spanning a wide range of network types, from convolutional and transformer-based models to spike-based neuromorphic algorithms. We first propose training-free, hardware-aware transformations that adapt DNN parameters at inference time, avoiding costly retraining with hardware noise injection. We also propose non-ideality-aware batchnorm optimization for convolutional neural networks, which preserves inference accuracy without retraining. Beyond conventional DNNs, the dissertation investigates the deployment of low-power Spiking Neural Networks (SNNs) on IMC, introducing a benchmarking framework tailored for SNN workloads. This framework reveals how noise accumulates over timesteps in SNNs, degrading their inference accuracy, and proposes compensatory weight-encoding and batchnorm-adaptation strategies to mitigate these non-idealities. Furthermore, the dissertation presents two algorithm-hardware co-exploration strategies built on the key insight that joint algorithm-hardware design, rather than isolated optimization, is essential for realizing the full potential of IMC-based DNN inference. Together, these contributions span parameter-level transformations, hardware-aware algorithmic adaptations, and system-level architecture exploration. The resulting methodologies significantly improve inference robustness and hardware efficiency, offering scalable solutions for deploying both DNNs and SNNs on next-generation IMC-based AI accelerators.
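
To make the batchnorm idea concrete, below is a minimal PyTorch sketch of retraining-free batchnorm recalibration under weight perturbation. The multiplicative Gaussian noise model, function names, and hyperparameters are illustrative assumptions for exposition, not the dissertation's actual method or API.

import torch
import torch.nn as nn

def perturb_weights(model, sigma=0.05):
    # Model crossbar conductance variation as multiplicative Gaussian noise
    # on every convolutional and linear weight (an assumed noise model).
    with torch.no_grad():
        for m in model.modules():
            if isinstance(m, (nn.Conv2d, nn.Linear)):
                m.weight.mul_(1.0 + sigma * torch.randn_like(m.weight))

def recalibrate_batchnorm(model, calib_loader, n_batches=20):
    # Re-estimate batchnorm running statistics on the perturbed model.
    # Only BN statistics change; weights stay fixed, so no retraining.
    for m in model.modules():
        if isinstance(m, (nn.BatchNorm1d, nn.BatchNorm2d, nn.BatchNorm3d)):
            m.reset_running_stats()
            m.momentum = None  # cumulative moving average over calib batches
    model.train()  # train mode makes BN layers update running stats
    with torch.no_grad():
        for i, (x, _) in enumerate(calib_loader):
            if i >= n_batches:
                break
            model(x)
    model.eval()

# Usage: perturb a pretrained network, then recalibrate batchnorm on a small
# calibration set before simulating non-ideal inference.
# perturb_weights(net); recalibrate_batchnorm(net, calib_loader)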
