Applying machine learning to investigate long-term insect-plant interactions preserved on digitized herbarium specimens.
Abstract
Premise: Despite the economic significance of insect damage to plants (i.e., herbivory),
long-term data documenting changes in herbivory are limited. Millions of pressed plant
specimens are now available online and can be used to collect big data on plant-insect
interactions during the Anthropocene. Methods: We initiated development of machine
learning methods to automate extraction of herbivory data from herbarium specimens
by training an insect damage detector and a damage type classifier on two distantly
related plant species (Quercus bicolor and Onoclea sensibilis). We experimented with
(1) classifying six types of herbivory and two control categories of undamaged leaf,
and (2) detecting two of the damage categories for which several hundred annotations
were available. Results: Damage detection results were mixed, with a mean average precision
of 45% in the simultaneous detection and classification of two types of damage. However,
damage classification on hand-drawn boxes identified the correct type of herbivory
81.5% of the time in eight categories. The damage classifier was accurate for categories
with 100 or more test samples. Discussion: These tools are a promising first step for
the automation of herbivory data collection. We describe ongoing efforts to increase
the accuracy of these models, allowing researchers to extract similar data and apply
them to biological hypotheses.
Type
Journal article
Permalink
https://hdl.handle.net/10161/21731
Published Version (Please cite this version)
10.1002/aps3.11369
Publication Info
Meineke, Emily K; Tomasi, Carlo; Yuan, Song; & Pryer, Kathleen M (2020). Applying machine learning to investigate long-term insect-plant interactions preserved on digitized herbarium specimens. Applications in Plant Sciences, 8(6). pp. e11369. 10.1002/aps3.11369. Retrieved from https://hdl.handle.net/10161/21731.
This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.
Scholars@Duke
Kathleen M. Pryer
Professor of Biology
Carlo Tomasi
Iris Einheuser Distinguished Professor
Tomasi's research is at the intersection of computer vision, machine learning, and
applied mathematics. Tomasi's current projects include image motion analysis (funded
by NSF), satellite image interpretation (funded by IARPA), computer-assisted diagnosis,
and object recognition (funded by Amazon). He is an ACM Fellow and has won the IEEE
Computer Society Helmholtz Prize twice.
Alphabetical list of authors with Scholars@Duke profiles.
