Duke University Libraries

    Efficient Gaussian process regression for large datasets.

    Date
    2013-03
    Authors
    Banerjee, A
    Dunson, David B
    Tokdar, ST
    Abstract
    Gaussian processes are widely used in nonparametric regression, classification and spatiotemporal modelling, facilitated in part by a rich literature on their theoretical properties. However, one of their practical limitations is expensive computation, typically on the order of n^3 where n is the number of data points, in performing the necessary matrix inversions. For large datasets, storage and processing also lead to computational bottlenecks, and numerical stability of the estimates and predicted values degrades with increasing n. Various methods have been proposed to address these problems, including predictive processes in spatial data analysis and the subset-of-regressors technique in machine learning. The idea underlying these approaches is to use a subset of the data, but this raises questions concerning sensitivity to the choice of subset and limitations in estimating fine-scale structure in regions that are not well covered by the subset. Motivated by the literature on compressive sensing, we propose an alternative approach that involves linear projection of all the data points onto a lower-dimensional subspace. We demonstrate the superiority of this approach from a theoretical perspective and through simulated and real data examples.
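The projection idea in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: it assumes a squared-exponential kernel, a Gaussian random projection matrix `phi` of size m x n, and a "compressed observations" model in which only `phi @ y` is conditioned on, so the linear solve is m x m instead of n x n. The function names, the kernel, and all parameter values are hypothetical choices made for illustration. The sketch still forms the full n x n kernel matrix, so it shows only the reduced solve, not the storage savings.

```python
import numpy as np


def rbf_kernel(A, B, lengthscale=1.0):
    """Squared-exponential kernel matrix between the rows of A and B."""
    sq_dists = (
        np.sum(A**2, axis=1)[:, None]
        + np.sum(B**2, axis=1)[None, :]
        - 2.0 * A @ B.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-0.5 * np.maximum(sq_dists, 0.0) / lengthscale**2)


def full_gp_mean(X, y, X_star, noise_var=0.1, lengthscale=1.0):
    """Standard GP posterior mean; requires an n x n linear solve."""
    K = rbf_kernel(X, X, lengthscale)
    K_star = rbf_kernel(X_star, X, lengthscale)
    return K_star @ np.linalg.solve(K + noise_var * np.eye(len(X)), y)


def projected_gp_mean(X, y, X_star, phi, noise_var=0.1, lengthscale=1.0):
    """Posterior mean given only the projected data phi @ y.

    Assumed model (illustrative): phi @ y = phi @ f + phi @ eps,
    so the linear system to solve is m x m, where m = phi.shape[0].
    """
    K = rbf_kernel(X, X, lengthscale)
    K_star = rbf_kernel(X_star, X, lengthscale)
    # Covariance of the projected observations (m x m).
    A = phi @ (K + noise_var * np.eye(len(X))) @ phi.T
    return K_star @ phi.T @ np.linalg.solve(A, phi @ y)


# Small synthetic demonstration (data and sizes are made up).
rng = np.random.default_rng(0)
n, m = 200, 40
X = rng.uniform(-3.0, 3.0, size=(n, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)
X_star = np.linspace(-3.0, 3.0, 5)[:, None]

# Every data point contributes to each of the m projected coordinates.
phi = rng.standard_normal((m, n)) / np.sqrt(m)

print("full GP mean:     ", full_gp_mean(X, y, X_star))
print("projected GP mean:", projected_gp_mean(X, y, X_star, phi))
```

Unlike a subset-based approximation, each row of `phi` mixes all n observations, which is the contrast the abstract draws. When `phi` is the n x n identity, `projected_gp_mean` reduces to `full_gp_mean`.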
    Type
    Journal article
    Subject
    Bayesian regression
    Compressive sensing
    Dimensionality reduction
    Gaussian process
    Random projection
    Permalink
    http://hdl.handle.net/10161/15591
    Published Version (Please cite this version)
    10.1093/biomet/ass068
    Publication Info
    Banerjee, A; Dunson, David B; & Tokdar, ST (2013). Efficient Gaussian process regression for large datasets. Biometrika, 100(1). pp. 75-89. 10.1093/biomet/ass068. Retrieved from http://hdl.handle.net/10161/15591.
    This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.
    Collections
    • Scholarly Articles

    Scholars@Duke

    David B. Dunson

    Arts and Sciences Professor of Statistical Science
    Development of novel approaches for representing and analyzing complex data. A particular focus is on methods that incorporate geometric structure (both known and unknown) and on probabilistic approaches to characterize uncertainty. Another major interest is in scalable algorithms and in developing approaches with provable guarantees. This fundamental work is directly motivated by applications in biomedical research, network data analysis, neuroscience, genomics, ecol
    Open Access

    Articles written by Duke faculty are made available through the campus open access policy. For more information see: Duke Open Access Policy

    Rights for Collection: Scholarly Articles