Applying machine learning to investigate long-term insect-plant interactions preserved on digitized herbarium specimens.


Premise: Despite the economic significance of insect damage to plants (i.e., herbivory), long-term data documenting changes in herbivory are limited. Millions of pressed plant specimens are now available online and can be used to collect big data on plant-insect interactions during the Anthropocene.

Methods: We initiated development of machine learning methods to automate extraction of herbivory data from herbarium specimens by training an insect damage detector and a damage type classifier on two distantly related plant species (Quercus bicolor and Onoclea sensibilis). We experimented with (1) classifying six types of herbivory and two control categories of undamaged leaf, and (2) detecting two of the damage categories for which several hundred annotations were available.

Results: Damage detection results were mixed, with a mean average precision of 45% in the simultaneous detection and classification of two types of damage. However, damage classification on hand-drawn boxes identified the correct type of herbivory 81.5% of the time in eight categories. The damage classifier was accurate for categories with 100 or more test samples.

Discussion: These tools are a promising first step for the automation of herbivory data collection. We describe ongoing efforts to increase the accuracy of these models, allowing researchers to extract similar data and apply them to biological hypotheses.
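The two kinds of figures reported in the abstract rest on simple building blocks. Detection scores such as mean average precision are typically built on the overlap between a predicted box and an annotated box (intersection over union), while the classification figure is the fraction of boxes assigned the correct category. The sketch below illustrates both under stated assumptions: the box format `(x_min, y_min, x_max, y_max)`, the function names, and the category labels `type_a`, `type_b`, and `undamaged` are hypothetical and are not taken from the article, and this is not the authors' evaluation code.

```python
from collections import Counter


def iou(box_a, box_b):
    """Intersection over union of two boxes given as (x_min, y_min, x_max, y_max).

    The box format is an assumption made for this sketch.
    """
    # Corners of the overlapping rectangle, if any.
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def overall_accuracy(true_labels, predicted_labels):
    """Fraction of samples whose predicted category equals the true one."""
    matches = sum(t == p for t, p in zip(true_labels, predicted_labels))
    return matches / len(true_labels)


def per_category_accuracy(true_labels, predicted_labels):
    """Accuracy computed separately for each true category."""
    totals = Counter(true_labels)
    correct = Counter(
        t for t, p in zip(true_labels, predicted_labels) if t == p
    )
    return {label: correct[label] / totals[label] for label in totals}


if __name__ == "__main__":
    # Hypothetical labels, for illustration only.
    truth = ["type_a", "type_a", "type_b", "undamaged"]
    preds = ["type_a", "type_b", "type_b", "undamaged"]
    print(overall_accuracy(truth, preds))
    print(per_category_accuracy(truth, preds))
    print(iou((0, 0, 2, 2), (1, 1, 3, 3)))
```

Reporting accuracy per category alongside the overall figure makes it easier to see when a category has too few test samples for its estimate to be reliable, which is in line with the abstract's remark about categories with 100 or more test samples.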






Publication Info

Meineke, Emily K, Carlo Tomasi, Song Yuan, and Kathleen M Pryer (2020). Applying machine learning to investigate long-term insect-plant interactions preserved on digitized herbarium specimens. Applications in Plant Sciences, 8(6), e11369. doi:10.1002/aps3.11369 Retrieved from

This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.



Carlo Tomasi

Iris Einheuser Distinguished Professor

Tomasi's research is at the intersection of computer vision, machine learning, and applied mathematics. Tomasi's current projects include image motion analysis (funded by NSF), satellite image interpretation (funded by IARPA), computer-assisted diagnosis, and object recognition (funded by Amazon). He is an ACM Fellow and has won the IEEE Computer Society Helmholtz Prize twice.


Kathleen M. Pryer

Professor of Biology

Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.