Stochastic expansions using continuous dictionaries: Lévy adaptive regression kernels

This article describes a new class of prior distributions for nonparametric function estimation. The unknown function is modeled as a limit of weighted sums of kernels or generator functions indexed by continuous parameters that control local and global features such as their translation, dilation, modulation and shape. Lévy random fields and their stochastic integrals are employed to induce prior distributions for the unknown functions or, equivalently, for the number of kernels and for the parameters governing their features. Scaling, shape, and other features of the generating functions are location-specific to allow quite different function properties in different parts of the space, as with wavelet bases and other methods employing overcomplete dictionaries. We provide conditions under which the stochastic expansions converge in specified Besov or Sobolev norms. Under a Gaussian error model, this may be viewed as a sparse regression problem, with regularization induced via the Lévy random field prior distribution. Posterior inference for the unknown functions is based on a reversible jump Markov chain Monte Carlo algorithm. We compare the Lévy Adaptive Regression Kernel (LARK) method to wavelet-based methods using some of the standard test functions, and illustrate its flexibility and adaptability in nonstationary applications. © Institute of Mathematical Statistics, 2011.
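As a rough illustration of the "weighted sum of kernels indexed by continuous parameters" idea in the abstract, the sketch below draws one random function from a much-simplified prior of this general type. Everything specific in it is an assumption made for illustration and is not taken from the article: the number of kernels is Poisson with a finite mean (a compound Poisson special case, not a general Lévy random field), the kernels are Gaussian bumps, locations are uniform on [0, 1], scales are log-normal, and weights are standard normal. The function names (`gaussian_kernel`, `poisson_draw`, `draw_prior_function`) are hypothetical.

```python
import math
import random


def gaussian_kernel(x, center, scale):
    """Gaussian bump with the given location and scale (an assumed kernel choice)."""
    return math.exp(-0.5 * ((x - center) / scale) ** 2)


def poisson_draw(rng, mean):
    """Draw a Poisson count by Knuth's multiplication method (fine for small means)."""
    limit = math.exp(-mean)
    count = 0
    product = rng.random()
    while product > limit:
        count += 1
        product *= rng.random()
    return count


def draw_prior_function(rng, intensity=5.0, domain=(0.0, 1.0)):
    """Draw a random function f(x) = sum_j w_j * k(x; c_j, s_j).

    The number of terms, and each term's weight, location, and scale,
    are all random. Each kernel has its own scale, so smoothness can
    differ across the domain.
    """
    n_kernels = poisson_draw(rng, intensity)
    terms = []
    for _ in range(n_kernels):
        center = rng.uniform(*domain)         # location (translation)
        scale = rng.lognormvariate(-2.0, 0.5)  # location-specific dilation
        weight = rng.gauss(0.0, 1.0)           # kernel coefficient
        terms.append((weight, center, scale))

    def f(x):
        return sum(w * gaussian_kernel(x, c, s) for w, c, s in terms)

    return f, terms


rng = random.Random(0)
f, terms = draw_prior_function(rng)
grid = [i / 10 for i in range(11)]
values = [f(x) for x in grid]
```

Here `terms` holds the (weight, location, scale) triples and `values` holds the sampled function on an 11-point grid. A posterior sampler of the reversible jump kind mentioned in the abstract would add, delete, and perturb such triples; that step is not sketched here.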








Publication Info

Wolpert, RL, MA Clyde and C Tu (2011). Stochastic expansions using continuous dictionaries: Lévy adaptive regression kernels. Annals of Statistics, 39(4). pp. 1916–1962. 10.1214/11-AOS889 Retrieved from

This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.



Robert L. Wolpert

Professor Emeritus of Statistical Science

I'm a stochastic modeler: I build computer-resident mathematical models
for complex systems, and invent and program numerical algorithms for making
inference from the models. Usually this involves predicting things that
haven't been measured (yet). Always it involves managing uncertainty and
making good decisions when some of the information we'd need to be fully
comfortable in our decision-making is unknown.

Originally trained as a mathematician specializing in probability theory and
stochastic processes, I was drawn to statistics by the interplay between
theoretical and applied research, with new applications suggesting what
statistical areas need theoretical development, and advances in theory and
methodology suggesting what applications were becoming practical and so
interesting. Through all of my statistical interests (theoretical, applied,
and methodological) runs the unifying theme of the Likelihood Principle,
a constant aid in the search for sensible methods of inference in complex
statistical problems where commonly-used methods seem unsuitable.
Three specific examples of such areas are:

  • Computer modeling: the construction and analysis of fast, small Bayesian
    statistical emulators for big, slow simulation models;
  • Meta-analysis: how we can synthesize evidence of different sorts about
    a statistical problem; and
  • Nonparametric Bayesian analysis: for applications in which common
    parametric families of distributions seem unsuitable.

Many of the methods in common use in each of these areas are hard or
impossible to justify, and can lead to very odd inferences that seem to
misrepresent the statistical evidence. Many of the newer approaches
abandon the "iid" paradigm in order to reflect patterns of regional
variation, and abandon familiar (e.g. Gaussian) distributions in order to
reflect the heavier tails observed in realistic data, and nearly all of
them depend on recent advances in the power of computer hardware and
algorithms, leading to three other areas of interest:

  • Spatial Statistics,
  • Statistical Extremes, and
  • Statistical Computation.

I have a special interest in developing statistical methods for application
to problems in Environmental Science, where traditional methods often fail.
Recent examples include developing new and better ways to estimate the
mortality to birds and bats from encounters with wind turbines; the
development of nonexchangeable hierarchical Bayesian models for
synthesizing evidence about the health effects of environmental pollutants;
and the use of high-dimensional Bayesian models to reflect uncertainty in
mechanistic environmental simulation models.
My current research involves modelling and Bayesian inference of dependent
time series and (continuous-time) stochastic processes with jumps (examples
include work loads on networks of digital devices; peak heights in mass
spectrometry experiments; or multiple pollutant levels at spatially and
temporally distributed sites), problems arising in astrophysics (Gamma ray
bursts) and high-energy physics (heavy ion collisions), and the statistical
modelling of risk from, e.g., volcanic eruption.


Merlise Clyde

Professor of Statistical Science

Model uncertainty and choice in prediction and variable selection problems for linear, generalized linear, and multivariate models. Bayesian model averaging. Prior distributions for model selection and model averaging. Wavelets and adaptive kernel nonparametric function estimation. Spatial statistics. Experimental design for nonlinear models. Applications in proteomics, bioinformatics, astrostatistics, air pollution and health effects, and environmental sciences.

Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.