# Essays on Adaptive Methods for Inference and Prediction under Dependence

## Abstract

In real-world applications, observations are often dependent, for example when they are collected from a system of interconnected units or from a system evolving over time. Ignoring this dependence, or adjusting for it inappropriately, undermines the validity of inference and prediction. Moreover, the dependence structure can be complex. In social networks, the interference that a unit receives from other units is heterogeneous; for units evolving over time, outcomes may be influenced by past history and decisions in ways that do not follow a regular structure. Methodology should account for this complexity. In this thesis, we develop adaptive methods and conduct empirical studies on inference and prediction problems with dependent data.

Chapter 2 presents our contribution to adaptive methods for inference. We consider a causal inference problem in the presence of network interference. Our focus is on observational studies where the interference experienced by a unit depends on how the treatment is assigned to its neighboring units according to a known (interference) network. However, the radius (and intensity) of the interference is unknown and can depend on the relevant subnetwork. We study estimators of the interference that build upon a Lepski-like procedure, which searches over the possible relevant radii of patterns. In contrast to the literature, our procedure aims to approximate the relevant network interference patterns (e.g., exposure mappings). We establish oracle inequalities and corresponding adaptive rates for the estimation of the interference function. These estimates lead to two different estimators ($\hat{\tau}^{OR}$ and $\hat{\tau}^{DR}$) of the average direct treatment effect on the treated. We establish an adaptive rate for $\hat{\tau}^{OR}$ based on that of the interference estimates. By leveraging the conditional independence of the treatments, we prove asymptotic normality for $\hat{\tau}^{DR}$. We address several challenges arising from the data-driven creation of the patterns and from the network dependence. We also present theoretical examples and numerical simulations that illustrate the performance of the proposed estimators.
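To illustrate the general idea behind a Lepski-type selection rule, the following is a minimal sketch for a scalar estimate, not the procedure developed in the thesis. The function name `lepski_select`, the inputs `estimates` and `widths` (one estimate and one error width per candidate radius, ordered from smallest to largest radius), and the constant `c` are all hypothetical and introduced only for illustration. The rule picks the smallest radius whose estimate agrees with every larger-radius estimate up to the combined error widths.

```python
def lepski_select(estimates, widths, c=1.0):
    """Generic Lepski-type rule (illustrative assumption, not the thesis's method).

    estimates[r]: estimate computed with the r-th candidate radius
    widths[r]:    error width (e.g., a standard-error bound) for that estimate
    Candidates are assumed ordered from smallest to largest radius.
    Returns the index of the selected radius.
    """
    if len(estimates) == 0 or len(estimates) != len(widths):
        raise ValueError("estimates and widths must be non-empty and of equal length")
    n = len(estimates)
    for r in range(n):
        # Accept r if it is consistent with all estimates at larger radii.
        consistent = all(
            abs(estimates[r] - estimates[s]) <= c * (widths[r] + widths[s])
            for s in range(r + 1, n)
        )
        if consistent:
            return r  # the largest radius is always accepted (empty comparison)


# Hypothetical inputs: radius 0 is clearly biased, radii 1 to 3 agree.
chosen = lepski_select([0.0, 1.0, 1.05, 1.02], [0.05, 0.1, 0.2, 0.4])
```

The design choice here is the usual one for such rules: a smaller radius gives a less variable but possibly biased estimate, so the smallest radius that is statistically indistinguishable from all larger ones is preferred.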

Chapter 3 presents our contribution to adaptive methods for prediction. We consider the simultaneous learnability of a continuum of quantile regression trees in online learning settings by investigating the uniform regret of their sequential predictions. We show the following:

(i) The minimax regret, taken uniformly across all quantiles in $[\alpha,1-\alpha]$ with $\alpha \in (0,1/2)$, converges at a rate of order $O\left( \log T/\sqrt{T} \right)$, where $T$ is the total number of time periods. Therefore, there exists an algorithm whose regret bound for a continuum of quantiles is larger than that for a single quantile by at most a logarithmic factor.

(ii) Given any data distribution, an exponential weight-based algorithm can be explicitly constructed, and it attains a regret bound of order $O\left( \log^{\frac{1}{2}} T/\sqrt{T} \right)$. At each time, this algorithm simultaneously selects the quantile regression tree functions for predicting different quantiles based on an identical set of data.
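As a rough illustration of how an exponential weight-based forecaster can serve several quantile levels at once from the same data, here is a minimal sketch. It is not the algorithm constructed in the thesis: it assumes a finite set of `K` experts standing in for quantile regression trees, a finite grid of quantile levels `taus` instead of a continuum, the standard pinball loss, and a fixed learning rate `eta`. All function and parameter names are hypothetical.

```python
import numpy as np


def pinball_loss(y, pred, tau):
    """Quantile (pinball) loss of prediction `pred` for outcome `y` at level `tau`."""
    diff = y - pred
    return np.maximum(tau * diff, (tau - 1.0) * diff)


def exp_weights_quantiles(expert_preds, y, taus, eta):
    """Run one exponential-weights forecaster per quantile level (illustrative).

    expert_preds: array (T, K, Q), prediction of expert k for quantile q at time t
    y:            array (T,), realized outcomes
    taus:         array (Q,), quantile levels
    eta:          learning rate (hypothetical tuning choice)
    Returns an array (T, Q) of weighted-average predictions.
    """
    T, K, Q = expert_preds.shape
    log_w = np.zeros((K, Q))  # log-weights, starting from a uniform prior
    out = np.empty((T, Q))
    for t in range(T):
        # Normalize weights per quantile level (subtract max for stability).
        w = np.exp(log_w - log_w.max(axis=0))
        w /= w.sum(axis=0)
        out[t] = (w * expert_preds[t]).sum(axis=0)
        # Update every level from the same observation y[t].
        log_w -= eta * pinball_loss(y[t], expert_preds[t], taus)
    return out


# Hypothetical data: expert 0 always predicts the realized value 1.0.
T, taus = 200, np.array([0.25, 0.75])
preds = np.empty((T, 3, 2))
preds[:, 0, :], preds[:, 1, :], preds[:, 2, :] = 1.0, 0.0, 5.0
agg = exp_weights_quantiles(preds, np.ones(T), taus, eta=1.0)
```

In this toy setup the weights concentrate on the expert with zero loss, so the aggregated predictions for both quantile levels move toward 1.0 as `t` grows.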

Chapter 4 presents our contribution to empirical studies of real-world problems under dependence. We investigate the impact of ballot design on election outcomes. More specifically, we measure the causal effect of flipping the party order of the candidates running for non-partisan offices. Using data collected from the North Carolina State Board of Elections from 2008 to 2012, we find a flipping effect that is heterogeneous across vote shares of major partisan contests and downward on average. Adopting the causal assumptions of a clustered randomized experiment, we estimate the flipping effects with random coefficient models, which account for the dependence among units grouped by contest.

## Citation

Fang, Fei (2022). *Essays on Adaptive Methods for Inference and Prediction under Dependence*. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/25830.

## Collections

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.