Date of Award
Spring 1-1-2025
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Statistics and Data Science
First Advisor
Barron, Andrew
Abstract
In this work, we consider a Bayesian method for training single-hidden-layer neural networks with $\ell_{1}$-controlled weights: we define posterior distributions using different subsets of the training data and combine the resulting posterior means to form our estimators. We consider both a joint Bayesian model for all parameters of the network at once and a greedy Bayes model that trains the neurons one at a time based on the residuals of previous fits. The log-likelihoods of the posterior distributions we define are multimodal and non-concave, so sampling algorithms such as Markov chain Monte Carlo (MCMC) will not mix rapidly when sampling the posteriors directly. By introducing an auxiliary random variable, we produce a mixture distribution that we call a log-concave coupling. With a continuous uniform prior over the $\ell_{1}$ ball, the conditional distributions of this mixture are log-concave, and the mixing distribution itself is log-concave when the number of parameters in the network exceeds the squared number of data points. The mixture distribution can therefore be sampled efficiently to produce samples from our original target density. For a discrete uniform prior over the $\ell_{1}$ ball intersected with a grid of small spacing, we study the performance of our posterior mean estimator in both an arbitrary-sequence regret sense and a statistical risk sense. Let $g$ be a target function and let $\tilde{g}$ be its projection onto the closure of the convex hull of signed neurons scaled by a constant. With neuron weight vectors of dimension $d$ and $N$ data points, we show that an estimator defined by a combination of our posterior means in the joint sampling problem has arbitrary-sequence regret and statistical risk within $O([(\log d)/N]^{1/4})$ of the regret and risk of $\tilde{g}$. For the greedy construction, the additional regret and risk improve to a third-root power.
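The coupling construction described above can be sketched schematically as follows. This is an illustrative outline only, with generic symbols ($w$ for the network weights, $\xi$ for the auxiliary variable, $p_0$ for the prior, $\ell$ for the negative log-likelihood) rather than the dissertation's exact notation:

```latex
% Target posterior over weights w: multimodal, not log-concave in general
p(w \mid \mathrm{data}) \;\propto\; \exp\{-\ell(w)\}\, p_0(w)

% Introduce an auxiliary random variable \xi via a chosen conditional kernel,
% forming the joint (coupled) distribution
p(w, \xi) \;=\; p(w \mid \mathrm{data})\, p(\xi \mid w)

% Sample in two stages: first the mixing (marginal) distribution of \xi,
% then the reverse conditional of w given \xi
p(\xi) \;=\; \int p(w, \xi)\, dw,
\qquad
p(w \mid \xi) \;=\; \frac{p(w, \xi)}{p(\xi)}
```

If the kernel $p(\xi \mid w)$ is chosen so that both the conditional $p(w \mid \xi)$ and the marginal $p(\xi)$ are log-concave, each stage can be sampled efficiently by standard log-concave samplers, and a draw of $\xi \sim p(\xi)$ followed by $w \sim p(w \mid \xi)$ yields an exact draw from the original target posterior.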
Recommended Citation
McDonald, Curtis James, "Computation and Estimation for Neural Networks via Log-Concave Coupling" (2025). Yale Graduate School of Arts and Sciences Dissertations. 1625.
https://elischolar.library.yale.edu/gsas_dissertations/1625