Data-driven Decision Making with Dynamic Learning under Uncertainty: Theory and Applications

Digital transformation is changing the landscape of business and sparking waves of innovation, calling for advanced big data analytics and artificial intelligence techniques. To survive intensified and rapidly changing market conditions in the big data era, it is crucial for companies to hone their competitive advantages by harnessing the power of data. This thesis provides data-driven solutions to facilitate informed decision making under various forms of uncertainty, making contributions to both theory and applications.

Chapter 1 presents a study motivated by a real-life data set from a leading Chinese supermarket chain. The grocery retailer sells a perishable product, making joint pricing and inventory ordering decisions over a finite time horizon with lost sales. However, she does not have perfect information on (1) the demand-price relationship, (2) the demand noise distribution, (3) the inventory perishability rate, and (4) how the demand-price relationship changes over time. Moreover, the demand noise distribution is nonparametric for some products but parametric for others. To help the retailer tackle these challenges, we design two versions of data-driven pricing and ordering (DDPO) policies, for the settings of nonparametric and parametric noise distributions, respectively. Measuring performance by regret, i.e., the profit loss relative to a clairvoyant policy with perfect information, we show that both versions of our DDPO policies achieve the best possible rates of regret in their respective settings. Through a case study on the real-life data, we also demonstrate that both of our policies significantly outperform the historical decisions of the supermarket, establishing the practical value of our approach. Finally, we extend our model and policy to account for age-dependent product perishability and demand censoring.
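
As a rough illustration of the regret measure described above, the following Python sketch computes cumulative regret for a sequence of prices against a clairvoyant benchmark. It is not the thesis's model: it assumes a toy setting with known linear expected demand a - b * price, a constant unit cost, and no inventory, perishability, or noise. The function names and the parameters a, b, and unit_cost are hypothetical and chosen only for this example.

```python
from typing import Sequence


def expected_profit(price: float, a: float, b: float, unit_cost: float = 0.0) -> float:
    """Expected one-period profit under the ASSUMED linear demand a - b * price."""
    demand = max(a - b * price, 0.0)  # demand cannot be negative
    return (price - unit_cost) * demand


def clairvoyant_price(a: float, b: float, unit_cost: float = 0.0) -> float:
    """Price a decision maker would charge if a and b were known.

    Maximizing (p - c) * (a - b * p) over p gives p = (a + b * c) / (2 * b).
    """
    return (a + b * unit_cost) / (2.0 * b)


def cumulative_regret(
    prices: Sequence[float], a: float, b: float, unit_cost: float = 0.0
) -> float:
    """Total expected profit lost by charging `prices` instead of the clairvoyant price."""
    best = expected_profit(clairvoyant_price(a, b, unit_cost), a, b, unit_cost)
    return sum(best - expected_profit(p, a, b, unit_cost) for p in prices)


if __name__ == "__main__":
    # A hypothetical learning policy whose prices approach the clairvoyant price of 5.
    print(cumulative_regret([3.0, 4.0, 5.0], a=10.0, b=1.0))
```

In this toy setting, a policy that learns well is one whose cumulative regret grows slowly with the number of periods, which is the sense in which the abstract speaks of rates of regret.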

Chapter 2 discusses work inspired by a real-life smart meter data set from a major U.S. electric utility company. The company serves retail electricity customers over a finite time horizon. Besides granular data on customers' consumption, the company has access to high-dimensional features on customer characteristics and exogenous factors. What is unique in this context is that these features exhibit three types of heterogeneity: over time, across customers, or both. They induce an underlying cluster structure and influence consumption differently in each cluster. The company knows neither the underlying cluster structure nor the corresponding consumption models. To tackle this challenge, we design a novel data-driven policy of joint spectral clustering and feature-based dynamic pricing that efficiently learns the underlying cluster structure and the consumption behavior in each cluster, and maximizes profits on the fly. Measuring performance by average regret, i.e., the profit loss per customer per period relative to a clairvoyant policy with perfect information, we derive distinct theoretical performance guarantees by showing that our policy achieves the best possible rate of regret. Our case study based on the real-life smart meter data indicates that our policy increases the company's profits by 146% over a three-month period relative to the company's policy. Our policy's performance is also robust to various forms of model misspecification. Finally, we extend our model and method to allow for temporal shifts in feature means, general cost functions, and the potential effect of strategic customer behavior on consumption.

Chapter 3 investigates an image cropping problem faced by a large Chinese digital platform. The platform aims to crop and display a large number of images in an automated fashion to maximize customer conversions, but it does not know how cropped images influence conversions, a relationship referred to as the reward function. What the platform knows is that good cropping should capture salient objects and texts, collectively referred to as salient features, as much as possible. Due to the high dimensionality of digital images and the unknown reward function, finding the optimal cropping for a given image is a highly unstructured learning problem. To overcome this challenge, we leverage advanced deep learning techniques to design a neural network policy with two types of neural networks, one for detecting salient features and the other for learning the reward function. We then show that our policy achieves the best possible theoretical performance guarantee by deriving matching upper and lower bounds on regret. To the best of our knowledge, these results are the first of their kind in deep learning applications in revenue management. Through case studies on the real-life data set and a field experiment, we demonstrate that our policy achieves a statistically significant improvement in conversions over the platform's incumbent policy, translating into an annual revenue increase of 2.85 million U.S. dollars. Moreover, our neural network policy significantly outperforms traditional machine learning methods and exhibits good performance even if the reward function is misspecified.







Li, Yuexing (2022). Data-driven Decision Making with Dynamic Learning under Uncertainty: Theory and Applications. Dissertation, Duke University. Retrieved from


Duke's student scholarship is made available to the public using a Creative Commons Attribution / Non-commercial / No derivative (CC-BY-NC-ND) license.