Date of Award
Doctor of Philosophy (PhD)
Statistics and Data Science
Mixtures of distributions provide a flexible model for heterogeneous data, but this versatility is concomitant with computational difficulty. We study the task of generating samples from the "greedy" Gaussian mixture posterior. While it is widely known that Gibbs sampling can be slow to converge, concrete results quantifying this behavior are scarce. In this dissertation, we establish conditions under which the number of steps required by the Gibbs sampler is exponential in the separation of the data clusters. Further, we analyze the efficacy of potential solutions. The simulated tempering algorithm uses an auxiliary temperature variable to flatten the target density (reducing the effective cluster separation). As existing implementations are poorly suited to the unusual properties of the mixture posterior, we adapt simulated tempering by flattening the individual likelihood components (referred to as internal annealing). However, this is no universal solution, and we characterize conditions under which the original cause of slow convergence will persist. An alluring alternative is subsample annealing, which instead flattens the posterior by reducing the size of the observed subsample. Still, this approach is sensitive to the ordering of the data, and we prove that a single poorly chosen datum can be sufficient to prevent rapid convergence.
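For readers unfamiliar with the samplers named in the abstract, the following is a minimal illustrative sketch, not the dissertation's construction. It assumes a two-component Gaussian mixture with equal weights, unit variances, and a N(0, prior_var) prior on each mean; the function name `gibbs_mixture` and the parameters `beta` and `prior_var` are hypothetical. Raising each likelihood component to the power `beta` is one plausible reading of "flattening the individual likelihood components"; the exact definitions of the greedy posterior and of internal annealing may differ.

```python
import numpy as np

def gibbs_mixture(x, n_steps, beta=1.0, prior_var=100.0, seed=0):
    """Gibbs sampler for the means of a two-component Gaussian mixture.

    Assumptions (illustrative only): equal mixing weights, unit variances,
    and independent N(0, prior_var) priors on the two means. Each likelihood
    component is raised to the power beta; beta=1 gives the ordinary sampler.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    mu = rng.normal(0.0, 1.0, size=2)        # arbitrary starting means
    trace = np.empty((n_steps, 2))
    for t in range(n_steps):
        # Step 1: sample each label given the current means.
        logw = -0.5 * beta * (x[:, None] - mu[None, :]) ** 2
        logw -= logw.max(axis=1, keepdims=True)   # stabilise before exp
        p = np.exp(logw)
        p /= p.sum(axis=1, keepdims=True)
        z = (rng.random(x.size) < p[:, 1]).astype(int)
        # Step 2: sample each mean given the labels (conjugate normal update).
        for k in range(2):
            members = x[z == k]
            prec = 1.0 / prior_var + beta * members.size
            mean = beta * members.sum() / prec
            mu[k] = rng.normal(mean, np.sqrt(1.0 / prec))
        trace[t] = mu
    return trace
```

Calling `gibbs_mixture(x, 1000)` on well-separated data and comparing the trace with `gibbs_mixture(x, 1000, beta=0.1)` gives an informal feel for how flattening changes the sampler's movement between label configurations; it does not reproduce any result from the dissertation.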
O'Connell, Dylan Potter, "Sampling from the Greedy Mixture Posterior" (2021). Yale Graduate School of Arts and Sciences Dissertations. 98.