Essays on Adaptive Methods for Inference and Prediction under Dependence

dc.contributor.advisor

Belloni, Alexandre Nogueira

dc.contributor.advisor

Arlotto, Alessandro

dc.contributor.author

Fang, Fei

dc.date.accessioned

2022-09-21T13:55:09Z

dc.date.issued

2022

dc.department

Business Administration

dc.description.abstract

In real-world applications, observations are arguably dependent when they are collected from a system of interconnected units or from a system evolving along a time continuum. Ignoring such dependence, or adjusting for it inappropriately, reduces the validity of inference and prediction. Furthermore, the dependence structure can be complex. For example, in social networks, the interference a unit receives from other units is heterogeneous; units evolving along a time continuum can be influenced by their previous history and decisions, and this dependence does not follow a regular structure. Methodologies should account for this complexity. In this thesis, we develop adaptive methods and conduct empirical studies on inference and prediction problems with dependent data.

Chapter 2 presents our contribution to adaptive methods for inference. We consider a causal inference problem in the presence of network interference. Our focus is on observational studies where the interference on a unit depends on how the treatment is assigned to its neighboring units according to a known (interference) network. However, the radius (and intensity) of the interference is unknown and can depend on the relevant subnetwork. We study estimators of interference that build upon a Lepski-like procedure that searches over the possible relevant radii of patterns. In contrast to the literature, our procedure aims to approximate the relevant network interference patterns (e.g., exposure mappings). We establish oracle inequalities and corresponding adaptive rates for the estimation of the interference function. Such estimates lead to two different estimators ($\hat{\tau}^{OR}$ and $\hat{\tau}^{DR}$) for the average direct treatment effect on the treated. We derive the adaptive rate in the oracle inequality for $\hat{\tau}^{OR}$ from that of the interference estimates. By leveraging the conditional independence of the treatments, we prove the asymptotic normality of $\hat{\tau}^{DR}$. We address several challenges arising from the data-driven creation of the patterns and the network dependence. We also present theoretical examples and numerical simulations that illustrate the performance of the proposed estimators.
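As a rough illustration only, and not the procedure developed in the thesis, the generic Lepski idea of searching over candidate radii can be sketched as follows. The sketch assumes a hypothetical ordered list of scalar estimates (one per candidate radius, with bias shrinking as the radius grows) and hypothetical error bounds that grow with the radius; the function name `lepski_select` and the constant `c` are illustrative assumptions.

```python
def lepski_select(estimates, widths, c=1.0):
    """Generic Lepski-type selection over ordered candidates (illustrative).

    estimates[k]: estimate built with the k-th candidate radius (assumed input)
    widths[k]:    bound on the stochastic error of estimates[k] (assumed input)
    c:            hypothetical tuning constant for the comparison threshold

    Returns the index of the smallest candidate whose estimate is within the
    threshold of every estimate built with a larger candidate.
    """
    n = len(estimates)
    for k in range(n):
        agrees_with_larger = all(
            abs(estimates[k] - estimates[j]) <= c * (widths[k] + widths[j])
            for j in range(k + 1, n)
        )
        if agrees_with_larger:
            return k
    return n - 1  # fallback; the largest candidate always passes the check


# Made-up numbers: the first estimate is visibly biased, the rest agree.
chosen = lepski_select([0.2, 0.9, 1.0, 1.02], [0.05, 0.08, 0.12, 0.2])
```

The rule favors the smallest radius that is statistically indistinguishable from all larger ones, which is the usual bias-variance rationale behind Lepski-type selection.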

Chapter 3 presents our contribution to adaptive methods for prediction. We consider the simultaneous learnability of a continuum of quantile regression trees in online learning settings by investigating the uniform regret of their sequential predictions. We show the following:

(i) The minimax regret, taken uniformly across all quantiles in $[\alpha,1-\alpha]$ where $\alpha \in (0,1/2)$, converges at a rate of order $O\left( \log T/\sqrt{T} \right)$, where $T$ is the total number of time periods. Therefore, there exists an algorithm whose regret bound for a continuum of quantiles differs from that for a single quantile by at most a logarithmic factor.

(ii) Given any data distribution, an exponential weight-based algorithm can be explicitly constructed, and it attains a regret bound at the $O\left( \log^{\frac{1}{2}} T/\sqrt{T} \right)$ rate. At each time, this algorithm simultaneously selects the quantile regression tree functions for predicting different quantiles based on an identical set of data.
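For intuition, a minimal sketch of a generic exponentially weighted forecaster under the quantile (pinball) loss is given below. It is not the algorithm constructed in the thesis: it assumes a single quantile level `q`, a finite set of hypothetical experts standing in for candidate tree predictors, and a fixed learning rate `eta`, whereas the thesis treats a continuum of quantiles. All names are illustrative.

```python
import math


def pinball_loss(y, pred, q):
    """Quantile (pinball) loss of prediction `pred` for outcome `y` at level q."""
    diff = y - pred
    return max(q * diff, (q - 1) * diff)


def exp_weights_quantile(expert_preds, outcomes, q, eta):
    """Exponentially weighted average forecaster (illustrative sketch).

    expert_preds[t][i]: prediction of hypothetical expert i at time t
    outcomes[t]:        realized outcome at time t
    Returns the sequence of forecasts and the final log-weights.
    """
    n_experts = len(expert_preds[0])
    log_w = [0.0] * n_experts  # uniform prior over experts
    forecasts = []
    for preds, y in zip(expert_preds, outcomes):
        # Normalize in log space for numerical stability.
        m = max(log_w)
        w = [math.exp(lw - m) for lw in log_w]
        total = sum(w)
        forecasts.append(sum(wi * p for wi, p in zip(w, preds)) / total)
        # Penalize each expert by its own pinball loss on the revealed outcome.
        for i, p in enumerate(preds):
            log_w[i] -= eta * pinball_loss(y, p, q)
    return forecasts, log_w


# Two constant experts (0.0 and 1.0); the outcome is always 1.0.
T = 50
forecasts, log_w = exp_weights_quantile([[0.0, 1.0]] * T, [1.0] * T, q=0.5, eta=1.0)
```

In this toy run the weight on the expert predicting 0.0 decays geometrically, so the forecasts move from 0.5 toward 1.0.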

Chapter 4 contains our contribution to empirical studies of real-world problems under dependence. We investigate the impact of ballot design on election outcomes. More specifically, we measure the causal effect of flipping the party order of the candidates running for non-partisan offices. Using data collected from the North Carolina State Board of Elections from 2008 to 2012, we find results that suggest a flipping effect that is heterogeneous across vote shares of major partisan contests and downward on average. Adopting the causal assumptions of the clustered randomized experiment, we use random coefficient models to estimate the flipping effects; these models can account for the dependence among units grouped by contest.

dc.identifier.uri

https://hdl.handle.net/10161/25830

dc.subject

Business administration

dc.subject

Statistics

dc.title

Essays on Adaptive Methods for Inference and Prediction under Dependence

dc.type

Dissertation

duke.embargo.months

23.86849315068493

duke.embargo.release

2024-09-16T00:00:00Z

Files

Original bundle

Name:
Fang_duke_0066D_16960.pdf
Size:
2.72 MB
Format:
Adobe Portable Document Format
