NeXML: rich, extensible, and verifiable representation of comparative data and metadata.

dc.contributor.author

Vos, Rutger A

dc.contributor.author

Balhoff, James P

dc.contributor.author

Caravas, Jason A

dc.contributor.author

Holder, Mark T

dc.contributor.author

Lapp, Hilmar

dc.contributor.author

Maddison, Wayne P

dc.contributor.author

Midford, Peter E

dc.contributor.author

Priyam, Anurag

dc.contributor.author

Sukumaran, Jeet

dc.contributor.author

Xia, Xuhua

dc.contributor.author

Stoltzfus, Arlin

dc.date.accessioned

2023-02-07T20:33:42Z

dc.date.available

2023-02-07T20:33:42Z

dc.date.issued

2012-07

dc.date.updated

2023-02-07T20:33:39Z

dc.description.abstract

In scientific research, integration and synthesis require a common understanding of where data come from, how much they can be trusted, and what they may be used for. To make such an understanding computer-accessible requires standards for exchanging richly annotated data. The challenges of conveying reusable data are particularly acute in regard to evolutionary comparative analysis, which comprises an ever-expanding list of data types, methods, research aims, and subdisciplines. To facilitate interoperability in evolutionary comparative analysis, we present NeXML, an XML standard (inspired by the current standard, NEXUS) that supports exchange of richly annotated comparative data. NeXML defines syntax for operational taxonomic units, character-state matrices, and phylogenetic trees and networks. Documents can be validated unambiguously. Importantly, any data element can be annotated, to an arbitrary degree of richness, using a system that is both flexible and rigorous. We describe how the use of NeXML by the TreeBASE and Phenoscape projects satisfies user needs that cannot be satisfied with other available file formats. By relying on XML Schema Definition, the design of NeXML facilitates the development and deployment of software for processing, transforming, and querying documents. The adoption of NeXML for practical use is facilitated by the availability of (1) an online manual with code samples and a reference to all defined elements and attributes, (2) programming toolkits in most of the languages used commonly in evolutionary informatics, and (3) input-output support in several widely used software applications. An active, open, community-based development process enables future revision and expansion of NeXML.

dc.identifier

sys025

dc.identifier.issn

1063-5157

dc.identifier.issn

1076-836X

dc.identifier.uri

https://hdl.handle.net/10161/26583

dc.language

eng

dc.publisher

Oxford University Press (OUP)

dc.relation.ispartof

Systematic biology

dc.relation.isversionof

10.1093/sysbio/sys025

dc.subject

Computational Biology

dc.subject

Biodiversity

dc.subject

Phylogeny

dc.subject

Models, Biological

dc.subject

Classification

dc.subject

Software

dc.subject

Programming Languages

dc.subject

Informatics

dc.subject

Biological Evolution

dc.title

NeXML: rich, extensible, and verifiable representation of comparative data and metadata.

dc.type

Journal article

duke.contributor.orcid

Lapp, Hilmar|0000-0001-9107-0714

pubs.begin-page

675

pubs.end-page

689

pubs.issue

4

pubs.organisational-group

Duke

pubs.organisational-group

Staff

pubs.publication-status

Published

pubs.volume

61

Files

Original bundle

Name:
NeXML rich, extensible, and verifiable representation of comparative data and metadata.pdf
Size:
1.2 MB
Format:
Adobe Portable Document Format
Description:
Published version