# Stochastic expansions using continuous dictionaries: Lévy adaptive regression kernels

## Date

2011-08-01

## Abstract

This article describes a new class of prior distributions for nonparametric function estimation. The unknown function is modeled as a limit of weighted sums of kernels or generator functions indexed by continuous parameters that control local and global features such as their translation, dilation, modulation and shape. Lévy random fields and their stochastic integrals are employed to induce prior distributions for the unknown functions or, equivalently, for the number of kernels and for the parameters governing their features. Scaling, shape, and other features of the generating functions are location-specific to allow quite different function properties in different parts of the space, as with wavelet bases and other methods employing overcomplete dictionaries. We provide conditions under which the stochastic expansions converge in specified Besov or Sobolev norms. Under a Gaussian error model, this may be viewed as a sparse regression problem, with regularization induced via the Lévy random field prior distribution. Posterior inference for the unknown functions is based on a reversible jump Markov chain Monte Carlo algorithm. We compare the Lévy Adaptive Regression Kernel (LARK) method to wavelet-based methods using some of the standard test functions, and illustrate its flexibility and adaptability in nonstationary applications. © Institute of Mathematical Statistics, 2011.
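As a rough illustration of the "weighted sums of kernels" idea in the abstract, the sketch below draws one random function from a finite kernel expansion whose number of terms is itself random. It is not the authors' LARK implementation. The Gaussian kernel, the Poisson-distributed number of kernels, and the normal, uniform, and gamma distributions for the weights, locations, and scales are all assumptions chosen for brevity, and the function names are hypothetical. The Lévy random field priors described in the article are more general than this compound-Poisson-style special case.

```python
import math
import random


def gaussian_kernel(x, center, scale):
    """Gaussian bump at `center` with width `scale` (an illustrative kernel choice)."""
    return math.exp(-0.5 * ((x - center) / scale) ** 2)


def sample_poisson(rate, rng):
    """Draw a Poisson(rate) count by counting unit-rate arrivals in [0, rate)."""
    count, total = 0, rng.expovariate(1.0)
    while total < rate:
        count += 1
        total += rng.expovariate(1.0)
    return count


def sample_kernel_expansion(grid, rate=10.0, seed=None):
    """Draw f(x) = sum_j w_j * g(x; c_j, s_j) with a random number of kernels.

    Returns the function values on `grid` and the list of (w, c, s) triples.
    All distributional choices are assumptions made for this sketch.
    """
    rng = random.Random(seed)
    lo, hi = min(grid), max(grid)
    n_kernels = sample_poisson(rate, rng)  # random dictionary size
    kernels = [
        (
            rng.gauss(0.0, 1.0),           # weight w_j
            rng.uniform(lo, hi),           # translation c_j
            rng.gammavariate(2.0, 0.05),   # dilation s_j, location-specific
        )
        for _ in range(n_kernels)
    ]
    values = [
        math.fsum(w * gaussian_kernel(x, c, s) for w, c, s in kernels)
        for x in grid
    ]
    return values, kernels


grid = [i / 199 for i in range(200)]
values, kernels = sample_kernel_expansion(grid, rate=10.0, seed=0)
print(f"{len(kernels)} kernels, f ranges over [{min(values):.3f}, {max(values):.3f}]")
```

Because each kernel carries its own scale, a single draw can be smooth in one region and sharply peaked in another, which is the kind of location-specific adaptivity the abstract describes. Posterior inference over the number of kernels and their parameters (done in the article with reversible jump MCMC) is not attempted here.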

## Publication Info

Wolpert, RL, MA Clyde and C Tu (2011). Stochastic expansions using continuous dictionaries: Lévy adaptive regression kernels. *Annals of Statistics*, 39(4), pp. 1916–1962. doi:10.1214/11-AOS889. Retrieved from https://hdl.handle.net/10161/8885.

This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.

## Collections

### Scholars@Duke

#### Robert L. Wolpert

I'm a stochastic modeler: I build computer-resident mathematical models for complex systems, and invent and program numerical algorithms for making inference from the models. Usually this involves predicting things that haven't been measured (yet). Always it involves managing uncertainty and making good decisions when some of the information we'd need to be fully comfortable in our decision-making is unknown.

Originally trained as a mathematician specializing in probability theory and stochastic processes, I was drawn to statistics by the interplay between theoretical and applied research, with new applications suggesting what statistical areas need theoretical development, and advances in theory and methodology suggesting what applications were becoming practical and so interesting. Through all of my statistical interests (theoretical, applied, and methodological) runs the unifying theme of the **Likelihood Principle**, a constant aid in the search for sensible methods of inference in complex statistical problems where commonly-used methods seem unsuitable.

Three specific examples of such areas are:

- Computer modeling, the construction and analysis of fast small Bayesian statistical emulators for big slow simulation models;
- Meta-analysis, of how we can synthesize evidence of different sorts about a statistical problem; and
- Nonparametric Bayesian analysis, for applications in which common parametric families of distributions seem unsuitable.

Many of the methods in common use in each of these areas are hard or impossible to justify, and can lead to very odd inferences that seem to misrepresent the statistical evidence. Many of the newer approaches abandon the "iid" paradigm in order to reflect patterns of regional variation, and abandon familiar (e.g. Gaussian) distributions in order to reflect the heavier tails observed in realistic data, and nearly all of them depend on recent advances in the power of computer hardware and algorithms, leading to three other areas of interest:

- Spatial Statistics,
- Statistical Extremes, and
- Statistical computation.

I have a special interest in developing statistical methods for application to problems in Environmental Science, where traditional methods often fail. Recent examples include developing new and better ways to estimate the mortality to birds and bats from encounters with wind turbines; the development of nonexchangeable hierarchical Bayesian models for synthesizing evidence about the health effects of environmental pollutants; and the use of high-dimensional Bayesian models to reflect uncertainty in mechanistic environmental simulation models.

My current research involves modelling and Bayesian inference of dependent time series and (continuous-time) stochastic processes with jumps (examples include work loads on networks of digital devices; peak heights in mass spectrometry experiments; or multiple pollutant levels at spatially and temporally distributed sites), problems arising in astrophysics (Gamma ray bursts) and high-energy physics (heavy ion collisions), and the statistical modelling of risk from, e.g., volcanic eruption.

#### Merlise Clyde

Model uncertainty and choice in prediction and variable selection problems for linear, generalized linear, and multivariate models. Bayesian Model Averaging. Prior distributions for model selection and model averaging. Wavelets and adaptive kernel non-parametric function estimation. Spatial statistics. Experimental design for nonlinear models. Applications in proteomics, bioinformatics, astro-statistics, air pollution and health effects, and environmental sciences.

Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.