Ellipsoid fitting with the Cayley transform.

Date

2024-01

Abstract

We introduce Cayley transform ellipsoid fitting (CTEF), an algorithm that uses the Cayley transform to fit ellipsoids to noisy data in any dimension. Unlike many ellipsoid fitting methods, CTEF is ellipsoid specific, meaning it always returns elliptic solutions, and can fit arbitrary ellipsoids. It also significantly outperforms other fitting methods when data are not uniformly distributed over the surface of an ellipsoid. Inspired by growing calls for interpretable and reproducible methods in machine learning, we apply CTEF to dimension reduction, data visualization, and clustering in the context of cell cycle and circadian rhythm data and several classical toy examples. Since CTEF captures global curvature, it extracts nonlinear features in data that other machine learning methods fail to identify. For example, on the clustering examples CTEF outperforms 10 popular algorithms.
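The abstract does not spell out the construction, so the following is only an illustrative sketch, not the authors' implementation. It assumes the standard Cayley transform, which maps a skew-symmetric matrix S to the rotation (I - S)(I + S)^{-1}, and a hypothetical parametrization of an ellipsoid by a rotation plus positive semi-axis lengths. The function names `cayley` and `ellipsoid_matrix`, and the use of log semi-axes, are assumptions made for illustration. Under these assumptions, every choice of unconstrained parameters yields a symmetric positive definite matrix, which is one way a fit can be guaranteed to be an ellipsoid.

```python
import numpy as np

def cayley(S):
    """Standard Cayley transform: skew-symmetric S -> rotation (I - S)(I + S)^{-1}."""
    I = np.eye(S.shape[0])
    # I + S is always invertible when S is skew-symmetric.
    return (I - S) @ np.linalg.inv(I + S)

def ellipsoid_matrix(skew_params, log_axes):
    """Hypothetical parametrization: A = R^T diag(1 / a^2) R with R = cayley(S).

    The ellipsoid centered at c is then {x : (x - c)^T A (x - c) = 1}.
    """
    n = len(log_axes)
    S = np.zeros((n, n))
    S[np.triu_indices(n, k=1)] = skew_params  # fill the strict upper triangle
    S = S - S.T                               # make S skew-symmetric
    R = cayley(S)
    axes = np.exp(log_axes)                   # semi-axis lengths, always positive
    return R.T @ np.diag(1.0 / axes**2) @ R, R

# Any unconstrained real parameters give a valid ellipsoid (here in 3 dimensions).
rng = np.random.default_rng(0)
n = 3
skew_params = rng.normal(size=n * (n - 1) // 2)
log_axes = rng.normal(size=n)
A, R = ellipsoid_matrix(skew_params, log_axes)

print("R orthogonal:", np.allclose(R @ R.T, np.eye(n)))
print("A positive definite:", bool(np.all(np.linalg.eigvalsh(A) > 0)))
```

Because the parameters are unconstrained real numbers, a parametrization of this kind can be handed to a generic optimizer without extra constraints to keep the solution elliptic.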

Subjects

Clustering, data visualization, dimension reduction, ellipsoid fitting, nonlinear data, optimization

Citation

Published Version (Please cite this version)

10.1109/tsp.2023.3332560

Publication Info

Melikechi, Omar, and David B. Dunson (2024). Ellipsoid fitting with the Cayley transform. IEEE Transactions on Signal Processing: A Publication of the IEEE Signal Processing Society, 72, pp. 70–83. doi:10.1109/tsp.2023.3332560. Retrieved from https://hdl.handle.net/10161/33535.

This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.

Scholars@Duke

Omar Melikechi

Assistant Professor of Statistical Science

David B. Dunson

Arts and Sciences Distinguished Professor of Statistical Science

My research focuses on developing new tools for probabilistic learning from complex data. Methods development is directly motivated by challenging applications in ecology/biodiversity, neuroscience, environmental health, criminal justice/fairness, and more. We seek to develop new modeling frameworks, algorithms, and corresponding code that can be used routinely by scientists and decision makers. We are also interested in new inference frameworks and in studying the theoretical properties of the methods we develop.

Some highlight application areas:
(1) Modeling of biological communities and biodiversity - we are considering global data on fungi, insects, birds and animals, including DNA sequences, images, audio, etc. The data contain large numbers of species unknown to science, and we would like to learn about these new species, community network structure, and the impact of environmental change and climate.

(2) Brain connectomics - based on high resolution imaging data of the human brain, we are seeking to develop new statistical and machine learning models for relating brain networks to human traits and diseases.

(3) Environmental health & mixtures - we are building tools for relating chemical and other exposures (air pollution, etc.) to human health outcomes, accounting for spatial dependence in both exposures and disease. This includes an emphasis on infectious disease modeling, such as COVID-19.

Some statistical areas that play a prominent role in our methods development include models for low-dimensional structure in data (latent factors, clustering, geometric and manifold learning), flexible/nonparametric models (neural networks, Gaussian/spatial processes, other stochastic processes), Bayesian inference frameworks, efficient sampling and analytic approximation algorithms, and models for "object data" (trees, networks, images, spatial processes, etc.).





Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.