Cover Song Identification with Timbral Shape Sequences

We introduce a novel low-level feature for identifying cover songs that quantifies the relative changes in the smoothed frequency spectrum of a song. Our key insight is that a sliding window representation of a chunk of audio can be viewed as a time-ordered point cloud in high dimensions. For corresponding chunks of audio in different versions of the same song, these point clouds are approximately rotated, translated, and scaled copies of each other. If we treat MFCC embeddings as point clouds and cast the problem as a comparison of relative shape sequences, we are able to correctly identify 42 of the 80 cover songs in the "Covers 80" dataset. By contrast, all other work to date on cover songs relies exclusively on matching note sequences from chroma-derived features.
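The point-cloud view described above can be illustrated with a short Python sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`sliding_window_cloud`, `normalized_distance_matrix`, `shape_distance`) are hypothetical, the input is assumed to be an array of per-frame features such as MFCCs, and comparing normalized pairwise distance matrices is just one simple way to get a descriptor that ignores rotation, translation, and scale.

```python
import numpy as np

def sliding_window_cloud(features, window):
    """Stack `window` consecutive feature frames into one point per time step.

    `features` has shape (n_frames, n_coeffs); the result has shape
    (n_frames - window + 1, window * n_coeffs) and is ordered in time.
    """
    features = np.asarray(features, dtype=float)
    n_points = features.shape[0] - window + 1
    if n_points < 1:
        raise ValueError("window is longer than the feature sequence")
    return np.stack([features[i:i + window].ravel() for i in range(n_points)])

def normalized_distance_matrix(cloud):
    """Pairwise Euclidean distances after centering and rescaling the cloud.

    Centering removes translation, dividing by the overall norm removes
    scale, and pairwise distances are unchanged by rotation.
    """
    centered = cloud - cloud.mean(axis=0)
    scale = np.linalg.norm(centered)
    if scale > 0:
        centered = centered / scale
    diff = centered[:, None, :] - centered[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def shape_distance(cloud_a, cloud_b):
    """Illustrative dissimilarity between two clouds with equal point counts."""
    return np.linalg.norm(
        normalized_distance_matrix(cloud_a) - normalized_distance_matrix(cloud_b)
    )
```

Under these assumptions, a cloud and a rotated, translated, and scaled copy of it have a `shape_distance` of approximately zero, while two unrelated clouds generally do not. Matching whole songs would additionally require aligning sequences of such chunks, which this sketch does not attempt.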







Paul L Bendich

Research Professor of Mathematics

I am a mathematician whose main research focus lies in adapting theory from ostensibly pure areas of mathematics, such as topology, geometry, and abstract algebra, into tools that can be broadly used in many data-centered applications.

My initial training was in a recently emerging field called topological data analysis (TDA). I have been responsible for several essential and widely used elements of its theoretical toolkit, with a particular focus on building TDA methodology for use on stratified spaces. Some of this work involves the creation of efficient algorithms, but much of it centers on theorem-proof mathematics, using proof techniques not only from algebraic topology, but also from computational geometry, probability, and abstract algebra.

Recently, I have done foundational work on TDA applications in several areas, including neuroscience, multi-target tracking, multi-modal data fusion, and a probabilistic theory of database merging. I am also becoming involved in efforts to integrate TDA within deep learning theory and practice.

I typically teach courses that connect mathematical principles to machine learning, including upper-level undergraduate courses in topological data analysis and more general high-dimensional data analysis, as well as a sophomore-level course (joint between Pratt and Math) that serves as a broad introduction to machine learning and data analysis concepts.

Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.