Duke University Libraries
DukeSpace Scholarship by Duke Authors

Moving the mountain: analysis of the effort required to transform comparative anatomy into computable anatomy.

Date
2015-01
Authors
Dahdul, Wasila
Dececchi, T Alexander
Ibrahim, Nizar
Lapp, Hilmar
Mabee, Paula
Abstract
The diverse phenotypes of living organisms have been described for centuries, and though they may be digitized, they are not readily available in a computable form. Using over 100 morphological studies, the Phenoscape project has demonstrated that by annotating characters with community ontology terms, links between novel species anatomy and the genes that may underlie them can be made. But given the enormity of the legacy literature, how can this largely unexploited wealth of descriptive data be rendered amenable to large-scale computation? To identify the bottlenecks, we quantified the time involved in the major aspects of phenotype curation as we annotated characters from the vertebrate phylogenetic systematics literature. This involves attaching fully computable logical expressions consisting of ontology terms to the descriptions in character-by-taxon matrices. The workflow consists of: (i) data preparation, (ii) phenotype annotation, (iii) ontology development and (iv) curation team discussions and software development feedback. Our results showed that the completion of this work required two person-years by a team of two post-docs, a lead data curator, and students. Manual data preparation required close to 13% of the effort. This part in particular could be reduced substantially with better community data practices, such as depositing fully populated matrices in public repositories. Phenotype annotation required ∼40% of the effort. We are working to make this more efficient with Natural Language Processing tools. Ontology development (40%), however, remains a highly manual task requiring domain (anatomical) expertise and use of specialized software. The large overhead required for data preparation and ontology development contributed to a low annotation rate of approximately two characters per hour, compared with 14 characters per hour when activity was restricted to character annotation. 
Unlocking the potential of the vast stores of morphological descriptions requires better tools for efficiently processing natural language, and better community practices towards a born-digital morphology. Database URL: http://kb.phenoscape.org
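The effort figures in the abstract can be turned into person-years with a few lines of arithmetic. The sketch below is illustrative only: it uses the approximate percentages quoted above, and it assumes that the share not accounted for by the three reported activities went to the fourth workflow step (curation team discussions and software development feedback), which the abstract does not state explicitly. The names `effort_breakdown`, `REPORTED_SHARES` and `TOTAL_PERSON_YEARS` are hypothetical.

```python
# Effort shares as reported in the abstract (approximate figures).
TOTAL_PERSON_YEARS = 2.0
REPORTED_SHARES = {
    "data preparation": 0.13,
    "phenotype annotation": 0.40,
    "ontology development": 0.40,
}


def effort_breakdown(total, shares):
    """Return person-years per activity, plus the unreported remainder."""
    breakdown = {name: round(total * share, 2) for name, share in shares.items()}
    # Assumption: the leftover share belongs to the fourth workflow step;
    # the abstract gives no explicit percentage for it.
    remainder = round(total * (1.0 - sum(shares.values())), 2)
    breakdown["discussions and software feedback (inferred)"] = remainder
    return breakdown


breakdown = effort_breakdown(TOTAL_PERSON_YEARS, REPORTED_SHARES)
for name, years in breakdown.items():
    print(f"{name}: {years} person-years")

# Ratio of the annotation-only rate to the overall rate (characters per hour).
rate_ratio = 14 / 2
print(f"annotation-only rate is {rate_ratio:.0f}x the overall rate")
```

Because the published percentages are rounded ("close to 13%", "∼40%"), the resulting person-year values should be read as rough estimates rather than exact measurements.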
Type
Journal article
Subject
Animals
Humans
Anatomy, Comparative
Natural Language Processing
Databases, Factual
Data Mining
Biological Ontologies
Data Curation
Permalink
https://hdl.handle.net/10161/26581
Published Version (Please cite this version)
10.1093/database/bav040
Publication Info
Dahdul, Wasila; Dececchi, T Alexander; Ibrahim, Nizar; Lapp, Hilmar & Mabee, Paula (2015). Moving the mountain: analysis of the effort required to transform comparative anatomy into computable anatomy. Database: The Journal of Biological Databases and Curation, 2015, bav040. doi:10.1093/database/bav040. Retrieved from https://hdl.handle.net/10161/26581.
This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.
Collections
  • Scholarly Articles

Scholars@Duke

Hilmar Lapp

Dir, IT
Open Access

Articles written by Duke faculty are made available through the campus open access policy. For more information see: Duke Open Access Policy

Rights for Collection: Scholarly Articles


Works are deposited here by their authors and represent the authors' research and opinions, not those of Duke University. Some materials and descriptions may include offensive content.
