Efficient Gaussian process regression for large datasets.

Date

2013-03

Repository Usage Stats

136 views, 723 downloads

Abstract

Gaussian processes are widely used in nonparametric regression, classification and spatiotemporal modelling, facilitated in part by a rich literature on their theoretical properties. However, one of their practical limitations is expensive computation, typically on the order of n^3 where n is the number of data points, in performing the necessary matrix inversions. For large datasets, storage and processing also lead to computational bottlenecks, and numerical stability of the estimates and predicted values degrades with increasing n. Various methods have been proposed to address these problems, including predictive processes in spatial data analysis and the subset-of-regressors technique in machine learning. The idea underlying these approaches is to use a subset of the data, but this raises questions concerning sensitivity to the choice of subset and limitations in estimating fine-scale structure in regions that are not well covered by the subset. Motivated by the literature on compressive sensing, we propose an alternative approach that involves linear projection of all the data points onto a lower-dimensional subspace. We demonstrate the superiority of this approach from a theoretical perspective and through simulated and real data examples.
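To make the projection idea concrete, here is a minimal sketch in Python. It is not taken from the article: it assumes one common way to formalise such a projection, namely replacing the n x n covariance matrix K by the low-rank surrogate Q = K Phi^T (Phi K Phi^T)^{-1} Phi K for a random m x n matrix Phi, with m much smaller than n. The kernel, the Gaussian choice of Phi, the noise variance, the jitter term, and all function and variable names are illustrative assumptions. The only linear system solved is m x m, which is where the saving over the order-n^3 cost comes from; the sketch still forms K in full, so it does not address storage.

```python
import numpy as np


def sq_exp_kernel(x1, x2, lengthscale):
    """Squared-exponential covariance between two 1-D input arrays (assumed kernel)."""
    diff = x1[:, None] - x2[None, :]
    return np.exp(-0.5 * (diff / lengthscale) ** 2)


def projected_gp_mean(K, y, Phi, noise_var, jitter=1e-8):
    """Posterior mean at the training inputs under the low-rank surrogate
    Q = K Phi^T (Phi K Phi^T)^{-1} Phi K, i.e. Q (Q + noise_var * I)^{-1} y.

    Uses the identity Q (Q + s I)^{-1} y = B (s M + B^T B)^{-1} B^T y with
    B = K Phi^T and M = Phi K Phi^T, so only an m x m system is solved.
    """
    B = K @ Phi.T                                   # n x m
    M = Phi @ B + jitter * np.eye(Phi.shape[0])     # m x m, jitter for stability
    A = noise_var * M + B.T @ B                     # m x m
    return B @ np.linalg.solve(A, B.T @ y)


# Illustrative data: noisy observations of a sine curve (hypothetical example).
rng = np.random.default_rng(0)
n, m = 100, 8
x = np.linspace(0.0, 1.0, n)
y = np.sin(2 * np.pi * x) + 0.1 * rng.standard_normal(n)

K = sq_exp_kernel(x, x, lengthscale=0.05)
Phi = rng.standard_normal((m, n)) / np.sqrt(n)      # random projection of all n points
noise_var = 0.01

mean = projected_gp_mean(K, y, Phi, noise_var)
```

Under the same assumptions, choosing Phi as m rows of the identity matrix reduces Q to K[:, S] K[S, S]^{-1} K[S, :] for the selected index set S, which is the subset-based form the abstract contrasts against. A dense Phi instead mixes information from every data point.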


Published Version (Please cite this version)

10.1093/biomet/ass068

Publication Info

Banerjee, Anjishnu, David B Dunson and Surya T Tokdar (2013). Efficient Gaussian process regression for large datasets. Biometrika, 100(1), pp. 75–89. doi:10.1093/biomet/ass068. Retrieved from https://hdl.handle.net/10161/15591.

This citation is constructed from limited available data and may be imprecise. To cite this article, please review and use the official citation provided by the journal.

Scholars@Duke

Surya Tapas Tokdar

Professor of Statistical Science

Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.