Three Essays on High-Frequency and High-Dimensional Financial Data Analysis
In recent decades, financial market data has become available at ever higher frequencies and in ever higher dimensions. This rapidly growing amount of financial data has created many research opportunities and challenges. In this dissertation, I address several important issues in the areas of asset pricing, financial econometrics, and computational statistics using large-scale financial data techniques. In asset pricing (Chapter 2), I investigate the relationship between the cross-section of expected stock returns and the associated market risks. In financial econometrics (Chapter 3), I uncover the sources of extreme dependence risks between assets. In computational statistics (Chapter 4), I design novel algorithms for efficiently estimating large-scale covariance matrices.
In Chapter 2, using a large novel high-frequency dataset, I investigate how individual stock returns respond to two different types of market movements: continuous and discontinuous (jump) movements. I also explore whether the different systematic risks associated with those two distinct movements are priced in the cross-section of expected stock returns. I show that the cross-section of expected stock returns reflects a risk premium for the systematic discontinuous risk but not for the systematic continuous risk. An investment strategy that goes long stocks in the highest discontinuous beta decile and shorts stocks in the lowest discontinuous beta decile produces average excess returns of 17% per annum. I estimate that the risk premium for the systematic discontinuous risk is approximately 3% per annum after controlling for the usual firm characteristic variables, including size, book-to-market ratio, momentum, idiosyncratic volatility, coskewness, cokurtosis, realized skewness, realized kurtosis, maximum daily return, and illiquidity.
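As an illustration only, the distinction between continuous and discontinuous betas can be sketched with a simple truncation rule: intraday intervals in which the absolute market return exceeds a threshold are treated as jump intervals, and a separate no-intercept regression slope is computed on each subset. The function name, the threshold constant `k`, the robust scale estimate, and the simulated data below are all assumptions made for this sketch; it is a simplified stand-in, not the estimator used in the dissertation.

```python
import numpy as np

def continuous_and_jump_betas(r_stock, r_market, k=4.0):
    """Split intraday intervals by a market-return threshold and fit a beta on each part.

    k is a hypothetical tuning constant: intervals where the absolute market
    return exceeds k robust standard deviations are treated as jump intervals.
    """
    r_stock = np.asarray(r_stock, dtype=float)
    r_market = np.asarray(r_market, dtype=float)

    # Robust scale of market returns (median absolute deviation, rescaled to
    # match the standard deviation under normality) so jumps do not inflate it.
    mad = np.median(np.abs(r_market - np.median(r_market)))
    is_jump = np.abs(r_market) > k * 1.4826 * mad

    def beta(mask):
        # No-intercept regression slope of stock returns on market returns.
        denom = np.sum(r_market[mask] ** 2)
        if denom == 0.0:
            return float("nan")
        return float(np.sum(r_stock[mask] * r_market[mask]) / denom)

    return beta(~is_jump), beta(is_jump)

# Simulated data (assumed values): the stock loads 0.8 on diffusive market
# moves and 1.5 on market jumps.
rng = np.random.default_rng(0)
n = 20_000
diffusive = rng.normal(0.0, 0.001, n)
jumps = np.zeros(n)
jump_times = rng.choice(n, size=20, replace=False)
jumps[jump_times] = rng.choice([-0.02, 0.02], size=20)
r_market = diffusive + jumps
r_stock = 0.8 * diffusive + 1.5 * jumps + rng.normal(0.0, 0.001, n)

beta_c, beta_d = continuous_and_jump_betas(r_stock, r_market)
print(f"continuous beta: {beta_c:.3f}, discontinuous beta: {beta_d:.3f}")
```

In this simulated setting the two estimates should land near the assumed loadings, which shows how a single stock can carry different sensitivities to the two kinds of market movement.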
In Chapter 3, co-authored with Professor Tim Bollerslev and Professor Viktor Todorov, we provide a new framework for estimating the systematic and idiosyncratic jump tail risks in financial asset prices. Our estimates are based on in-fill asymptotics for directly identifying the jumps, together with Extreme Value Theory (EVT) approximations and method-of-moments estimation for assessing the tail decay parameters and the tail dependencies. Implementing these procedures on a panel of intraday prices for a large cross-section of individual stocks and the S&P 500 market portfolio, we find that the distributions of the systematic and idiosyncratic jumps are both generally heavy-tailed and close to symmetric. We also show that the jump tail dependencies deduced from the high-frequency data, together with the day-to-day variation in the diffusive volatility, account for the "extreme" joint dependencies observed at the daily level.
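To give a concrete sense of what a tail decay parameter measures, the sketch below applies the Hill estimator, a standard EVT tool, to a simulated heavy-tailed sample. This is a swapped-in textbook technique chosen for brevity, not the method-of-moments procedure described above; the function name, the number of upper order statistics `k`, and the Pareto test sample are assumptions of this example.

```python
import numpy as np

def hill_tail_index(x, k):
    """Hill estimate of the tail index from the k largest observations.

    x should contain positive magnitudes (for example, absolute jump sizes);
    k is the number of upper order statistics used and must be chosen by the user.
    """
    x = np.sort(np.asarray(x, dtype=float))
    if not 0 < k < x.size:
        raise ValueError("k must satisfy 0 < k < len(x)")
    threshold = x[-k - 1]          # the (k+1)-th largest observation
    if threshold <= 0:
        raise ValueError("the threshold observation must be positive")
    top = x[-k:]                   # the k largest observations
    # The mean log-exceedance over the threshold estimates 1 / tail index.
    return 1.0 / np.mean(np.log(top / threshold))

# Simulated check on an exact Pareto sample with an assumed tail index of 3.
rng = np.random.default_rng(1)
sample = rng.pareto(3.0, 50_000) + 1.0
alpha_hat = hill_tail_index(sample, k=2_000)
print(f"estimated tail index: {alpha_hat:.2f}")
```

A smaller estimated index corresponds to a heavier tail; applying the same function separately to positive and negative jumps is one simple way to compare the two tails for symmetry.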
When estimating large covariance matrices, a major challenge is that the number of observations is often only comparable to, or even smaller than, the number of parameters. Therefore, in Chapter 4, co-authored with Professor Hao Wang, we induce sparsity via graphical models in order to produce stable and robust covariance matrix estimates. We propose a new algorithm for Bayesian model determination in Gaussian graphical models under G-Wishart prior distributions. We first review recent developments in sampling from G-Wishart distributions for given graphs, with a particular interest in the efficiency of the block Gibbs samplers and other competing methods. We generalize the maximum clique block Gibbs samplers to a class of flexible block Gibbs samplers and prove their convergence. This class of block Gibbs samplers substantially outperforms its competitors along a variety of dimensions. We next develop the theory and computational details of a novel Markov chain Monte Carlo sampling scheme for Gaussian graphical model determination. Our method relies on the partial analytic structure of the G-Wishart distributions integrated with the exchange algorithm. Unlike existing methods, the new method requires neither proposal tuning nor evaluation of normalizing constants of the G-Wishart distributions.
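For readers unfamiliar with block Gibbs sampling on a fixed graph, the following is a minimal sketch of a basic clique-wise sampler for the precision matrix. It assumes the G-Wishart density is proportional to |Ω|^((δ-2)/2) exp(-tr(DΩ)/2) on matrices whose zeros match the missing edges, assumes an integer δ ≥ 1 so that Wishart draws can be built from Gaussian vectors, and uses hypothetical names throughout. It illustrates only the basic clique update, not the generalized class of samplers or the model determination scheme developed in Chapter 4.

```python
import numpy as np

def block_gibbs_gwishart(cliques, delta, D, n_sweeps, rng):
    """Clique-wise block Gibbs sweeps for a precision matrix on a fixed graph.

    cliques: lists of vertex indices, each a complete subgraph, together
             covering every vertex and edge (assumed input format).
    delta:   integer degrees-of-freedom parameter (assumed integer, >= 1).
    D:       symmetric positive definite rate matrix.
    """
    p = D.shape[0]
    omega = np.eye(p)  # the identity respects every graph's zero pattern
    for _ in range(n_sweeps):
        for clique in cliques:
            c = np.asarray(clique)
            rest = np.setdiff1d(np.arange(p), c)
            k = c.size
            # Wishart draw with df = delta + k - 1 and scale inv(D_cc),
            # built as a sum of outer products of Gaussian vectors.
            scale = np.linalg.inv(D[np.ix_(c, c)])
            z = rng.standard_normal((delta + k - 1, k)) @ np.linalg.cholesky(scale).T
            block = z.T @ z
            if rest.size:
                # The draw is the Schur complement of the clique block, so add
                # back the part explained by the remaining vertices.
                b = omega[np.ix_(c, rest)]
                block = block + b @ np.linalg.solve(omega[np.ix_(rest, rest)], b.T)
            omega[np.ix_(c, c)] = (block + block.T) / 2.0  # enforce symmetry
    return omega

# Four-cycle 0-1-2-3-0 (not decomposable); its cliques are its edges.
rng = np.random.default_rng(2)
edges = [[0, 1], [1, 2], [2, 3], [0, 3]]
omega = block_gibbs_gwishart(edges, delta=3, D=np.eye(4), n_sweeps=50, rng=rng)
print(np.round(omega, 3))
```

Because each update changes only the entries inside one clique, the zeros for the missing edges (0, 2) and (1, 3) are preserved exactly, and each update keeps the matrix positive definite.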
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Rights for Collection: Duke Dissertations