Performance Analysis in Large-Scale Stochastic Dynamic Programs

This dissertation studies approximations to large-scale stochastic dynamic programs, with an emphasis on applications in revenue management and operations.

Many sequential decision problems from a wide variety of fields can be formulated as stochastic dynamic programs, with optimal policies characterized by the corresponding Bellman equations. However, many such formulations suffer from the so-called “curse of dimensionality”: the number of possible states grows exponentially with the size of the problem, making it impossible to find an optimal solution when the problem is large. This forces us to design and analyze suboptimal heuristic policies. In this dissertation, we study two duality techniques --- information relaxations and Lagrangian relaxations --- for analyzing the suboptimality of heuristic policies in particular applications. These performance bounds in turn imply asymptotic optimality of the heuristic policies in specific “large” regimes where the “curse of dimensionality” is especially relevant.

The information relaxation duality approach relaxes non-anticipativity constraints by considering a decision maker who has access to the outcomes of future uncertainties before making decisions, and adds a penalty that punishes violations of the non-anticipativity constraints. With a “perfect” information relaxation, the decision maker has access to all future uncertainties, and the problem then reduces to solving a deterministic problem for each outcome. As an application, we consider the classic problem of non-preemptively scheduling a set of jobs on machines with stochastic job processing times, with the objective of minimizing the weighted sum of expected completion times. Scheduling problems have a deep and well-developed literature in both operations research and computer science, and many varieties of scheduling problems arise in a broad range of applications. Over the past several decades, research in the stochastic scheduling literature has focused on the case when processing times are identical across machines, and far less is known about the case with specialized (or “unrelated”) machines --- i.e., each job’s processing distribution may vary across machines. Using a perfect information relaxation duality approach in which all processing times are revealed before scheduling jobs, together with a penalty that appropriately compensates for this additional information, we explicitly characterize the performance loss of a simple static routing policy relative to an optimal adaptive, non-anticipative scheduling policy. Our result implies that a static routing policy is asymptotically optimal in the regime of many jobs relative to the number of machines.
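The mechanics of a perfect information relaxation with zero penalty can be illustrated with a toy single-machine example (not taken from the dissertation; all function names are hypothetical). Revealing all processing times lets us solve each scenario as a deterministic problem, here via the WSPT rule, and averaging those hindsight-optimal costs yields a lower bound on the cost of any non-anticipative policy:

```python
import random

def wspt_cost(times, weights):
    """Weighted sum of completion times under the WSPT order
    (jobs sorted by time/weight), which is optimal for a single
    machine once the processing times are known."""
    order = sorted(range(len(times)), key=lambda j: times[j] / weights[j])
    t, total = 0.0, 0.0
    for j in order:
        t += times[j]
        total += weights[j] * t
    return total

def perfect_info_bound(sample_time, weights, n_samples=4000, seed=1):
    """Monte Carlo estimate of the perfect-information lower bound
    with zero penalty: reveal all processing times, solve each
    scenario deterministically, and average the optimal costs."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_samples):
        times = [sample_time(j, rng) for j in range(len(weights))]
        total += wspt_cost(times, weights)
    return total / n_samples

def static_cost(mean_times, weights):
    """Expected cost of the best static order, which sequences
    jobs by WSPT applied to the mean processing times."""
    order = sorted(range(len(weights)), key=lambda j: mean_times[j] / weights[j])
    t, total = 0.0, 0.0
    for j in order:
        t += mean_times[j]
        total += weights[j] * t
    return total
```

The gap between `static_cost` and `perfect_info_bound` brackets the suboptimality of the static policy; the dissertation's contribution is a penalty that tightens this bound enough to prove asymptotic optimality, which the zero-penalty sketch above does not attempt.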

The second part of the dissertation focuses on Lagrangian relaxations, which decompose a problem into simpler subproblems by moving “complicated” constraints to the objective using Lagrangian dual variables. As an application, we study the problem of dynamic pricing of resources that relocate over a network of locations. Customers with private willingness-to-pay sequentially request to relocate a resource from one location to another, and a revenue-maximizing service provider sets a price for each request. This problem can be formulated as an infinite horizon stochastic dynamic program, but is quite difficult to solve, as optimal pricing policies may depend on the locations of all resources in the network. We first focus on networks with a hub-and-spoke structure, and we develop a dynamic pricing policy and a performance bound based on a Lagrangian relaxation of the capacity constraint requiring the number of resources at the hub to be nonnegative. This relaxation decomposes the problem over spokes and is thus far easier to solve than the original problem. We analyze the performance of the Lagrangian-based policy in a supply-constrained large network regime in which the number of spokes (n) and the number of resources grow at the same rate. We show that the Lagrangian policy loses no more than O(\sqrt{\ln n / n}) in performance compared to an optimal policy, thus implying asymptotic optimality as n grows large. Using the same methodology, we also show that no static policy is asymptotically optimal in the large network regime. Finally, we extend the approach to general networks (i.e., multiple, interconnected hubs and spoke-to-spoke connections) and to incorporate relocation times, and we observe strong empirical performance of the policies and bounds in extensive numerical examples, including examples based on data from the ride-hailing company RideAustin.
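The core move in a Lagrangian relaxation is visible even in a tiny static example (a sketch for illustration only, unrelated to the dissertation's network model; all names are hypothetical). Dualizing the coupling constraint x1 + x2 <= C splits the problem into two independent maximizations, and for any nonnegative multiplier the resulting value upper-bounds the true optimum:

```python
def lagrangian_bound(values1, values2, capacity, lam):
    """Dualize the coupling constraint x1 + x2 <= capacity with
    multiplier lam >= 0. The problem decomposes into two independent
    one-dimensional maximizations, and the result upper-bounds the
    true optimum for every lam (weak duality)."""
    best1 = max(v - lam * x for x, v in enumerate(values1))
    best2 = max(v - lam * x for x, v in enumerate(values2))
    return best1 + best2 + lam * capacity

def exact_optimum(values1, values2, capacity):
    """Brute-force optimum of the coupled problem, for comparison."""
    return max(values1[x1] + values2[x2]
               for x1 in range(len(values1))
               for x2 in range(len(values2))
               if x1 + x2 <= capacity)
```

Minimizing the bound over `lam` tightens it; in the dissertation, the same decomposition idea (applied to the hub capacity constraint) both yields a performance bound and suggests the Lagrangian-based pricing policy.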





Chen, Chen (2020). Performance Analysis in Large-Scale Stochastic Dynamic Programs. Dissertation, Duke University. Retrieved from


Duke student scholarship is made available to the public using a Creative Commons Attribution / Non-commercial / No derivative (CC-BY-NC-ND) license.