Date of Award
Spring 2024
Document Type
Dissertation
Degree Name
Doctor of Philosophy (PhD)
Department
Statistics and Data Science
First Advisor
Sekhon, Jasjeet
Abstract
Statistical learning has proven highly effective across a wide array of applications. Among its many areas, one of particular interest is the study of decision making in dynamic environments, known as sequential decision making. With the rapid advancement of reinforcement learning and deep learning, practitioners increasingly encounter dynamic environments in real-world settings such as robot control, vehicle navigation, medical science, and finance. This thesis concentrates on statistical estimation, evaluation, and optimization for sequential decision making problems. In such problems, a prevalent statistical model for the dynamic environment is the Markov Decision Process (MDP), characterized by its state space, action space, transition probabilities, and other known or unknown structural elements (formalized schematically after the abstract). These characteristics determine the complexity of the environment, and data are collected through interaction with it. The richness of the data, together with the complexity of the environment, determines the difficulty of a sequential decision making problem. A common objective in this domain is to learn a policy that, when executed, achieves high performance within the dynamic environment. This thesis introduces novel statistical algorithms for accurate policy evaluation and policy learning with provable performance guarantees.

In Chapter 2, we study the offline policy evaluation problem under a linearly structured MDP. We devise a novel statistical algorithm that effectively exploits the variance information in the dataset; the resulting estimator exhibits asymptotically smaller error than standard methods (see the regression sketch below).

Chapter 3 focuses on policy learning in a goal-oriented framework known as the Stochastic Shortest Path problem. Here, we introduce two novel algorithms capable of discovering policies that reach the goal state at sub-linear cost (see the value-iteration sketch below).

Chapter 4 addresses the dynamic bipartite matching problem, which arises in applications such as ride-hailing and food delivery. We propose a novel framework, the Markov Matching Market, to model this dynamic matching problem, and present algorithms that achieve optimal matching with minimal (sub-linear) utility loss (see the matching sketch below).
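For readers skimming this record, here is a minimal schematic of the MDP model alluded to in the abstract, in standard textbook notation (the discounted formulation below is illustrative and not taken from the dissertation itself):

    \[
      \mathcal{M} = (\mathcal{S}, \mathcal{A}, P, r, \gamma),
    \]

where \(\mathcal{S}\) is the state space, \(\mathcal{A}\) the action space, \(P(s' \mid s, a)\) the transition probability, \(r(s, a)\) the reward, and \(\gamma \in (0, 1)\) a discount factor. A policy \(\pi(a \mid s)\) is judged by its value function

    \[
      V^{\pi}(s) = \mathbb{E}\!\left[\, \sum_{t=0}^{\infty} \gamma^{t} r(s_t, a_t) \;\middle|\; s_0 = s,\ a_t \sim \pi(\cdot \mid s_t),\ s_{t+1} \sim P(\cdot \mid s_t, a_t) \right];
    \]

policy evaluation estimates \(V^{\pi}\) for a fixed \(\pi\), while policy learning searches for a \(\pi\) that maximizes it.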
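To make the Chapter 2 idea concrete, the following is a minimal Python sketch of inverse-variance-weighted ridge regression, the generic statistical device behind "utilizing variance information" under linear function approximation. The function name, feature matrix, and variance estimates here are illustrative assumptions, not the dissertation's actual algorithm (which, among other things, must also estimate the variances themselves):

    import numpy as np

    def variance_weighted_ridge(Phi, y, var, lam=1.0):
        """Inverse-variance-weighted ridge regression (illustrative sketch).

        Phi : (n, d) array of state-action features phi(s_i, a_i)
        y   : (n,) regression targets, e.g. r_i + gamma * V_hat(s'_i)
        var : (n,) estimated conditional variances sigma_i^2 (assumed given)
        lam : ridge regularization strength

        Returns theta minimizing
            sum_i (y_i - phi_i @ theta)**2 / var_i + lam * ||theta||^2,
        i.e. down-weighting high-variance transitions.
        """
        w = 1.0 / np.clip(var, 1e-8, None)   # inverse-variance weights
        A = Phi.T @ (w[:, None] * Phi) + lam * np.eye(Phi.shape[1])
        b = Phi.T @ (w * y)
        return np.linalg.solve(A, b)

    # Toy usage with synthetic heteroscedastic data.
    rng = np.random.default_rng(0)
    n, d = 500, 5
    Phi = rng.normal(size=(n, d))
    theta_true = rng.normal(size=d)
    var = rng.uniform(0.1, 2.0, size=n)      # per-sample noise levels
    y = Phi @ theta_true + rng.normal(scale=np.sqrt(var))
    theta_hat = variance_weighted_ridge(Phi, y, var, lam=0.1)
    print(np.linalg.norm(theta_hat - theta_true))

Down-weighting high-variance transitions is the general mechanism by which such estimators can attain smaller asymptotic error than their unweighted counterparts.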
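For orientation on Chapter 3's setting, here is an illustrative tabular value-iteration sketch for a Stochastic Shortest Path instance with known dynamics; the dissertation's algorithms, by contrast, must learn while interacting with unknown P and c, which is what makes sub-linear cost nontrivial:

    import numpy as np

    def ssp_value_iteration(P, c, goal, iters=1000, tol=1e-8):
        """Value iteration for a known SSP instance (illustrative sketch only).

        P    : (S, A, S) transition probabilities P[s, a, s']
        c    : (S, A) positive per-step costs
        goal : index of the absorbing, cost-free goal state

        Returns V, the minimal expected cost-to-goal, and a greedy policy.
        """
        S, A, _ = P.shape
        V = np.zeros(S)
        for _ in range(iters):
            Q = c + P @ V                 # Q[s, a] = c(s, a) + E[V(s')]
            Q[goal, :] = 0.0              # goal is absorbing and cost-free
            V_new = Q.min(axis=1)
            if np.max(np.abs(V_new - V)) < tol:
                V = V_new
                break
            V = V_new
        return V, Q.argmin(axis=1)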
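Finally, for Chapter 4, the per-step decision in a dynamic matching market can be viewed as a max-weight bipartite matching under estimated utilities. Below is a minimal sketch using SciPy's assignment solver, with the utility matrix assumed given; in the Markov Matching Market these utilities must be learned online, which is where the sub-linear utility loss guarantee comes in:

    import numpy as np
    from scipy.optimize import linear_sum_assignment

    # Illustrative one-step matching: utilities[i, j] is the (estimated)
    # utility of matching demand-side agent i with supply-side agent j.
    rng = np.random.default_rng(1)
    utilities = rng.uniform(size=(4, 6))

    rows, cols = linear_sum_assignment(utilities, maximize=True)
    total = utilities[rows, cols].sum()
    print(list(zip(rows.tolist(), cols.tolist())), total)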
Recommended Citation
Min, Yifei, "Statistical Theory and Algorithms for Sequential Decision Making" (2024). Yale Graduate School of Arts and Sciences Dissertations. 1401.
https://elischolar.library.yale.edu/gsas_dissertations/1401