Comparative genomic data of the Avian Phylogenomics Project.
Abstract
BACKGROUND: The evolutionary relationships of modern birds are among the most challenging
to understand in systematic biology and have been debated for centuries. To address
this challenge, we assembled or collected the genomes of 48 avian species spanning
most orders of birds, including all Neognathae and two of the five Palaeognathae orders,
and used the genomes to construct a genome-scale avian phylogenetic tree and perform
comparative genomics analyses (Jarvis et al. in press; Zhang et al. in press). Here
we release assemblies and datasets associated with the comparative genome analyses,
which include 38 newly sequenced avian genomes plus previously released or simultaneously
released genomes of Chicken, Zebra finch, Turkey, Pigeon, Peregrine falcon, Duck,
Budgerigar, Adelie penguin, Emperor penguin and the Medium Ground Finch. We hope that
this resource will serve future efforts in phylogenomics and comparative genomics.
FINDINGS: The 38 bird genomes were sequenced using the Illumina HiSeq 2000 platform
and assembled using a whole genome shotgun strategy. The 48 genomes were categorized
into two groups according to the N50 scaffold size of the assemblies: a high depth
group comprising 23 species sequenced at high coverage (>50X) with multiple insert
size libraries resulting in N50 scaffold sizes greater than 1 Mb (except the White-throated
Tinamou and Bald Eagle); and a low depth group comprising 25 species sequenced at
low coverage (~30X) with two insert size libraries resulting in an average N50 scaffold
size of about 50 kb. Repetitive elements comprised 4%-22% of the bird genomes. The
assembled scaffolds allowed the homology-based annotation of 13,000-17,000 protein
coding genes in each avian genome relative to chicken, zebra finch and human, as well
as comparative and sequence conservation analyses.
CONCLUSIONS: Here we release full
genome assemblies of 38 newly sequenced avian species, link genome assembly downloads
for 7 of the remaining 10 species, and provide a guideline of genomic data that
has been generated and used in our Avian Phylogenomics Project. To the best of our
knowledge, the Avian Phylogenomics Project is the biggest vertebrate comparative genomics
project to date. The genomic data presented here is expected to accelerate further
analyses in many fields, including phylogenetics, comparative genomics, evolution,
neurobiology, developmental biology, and other related areas.
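
The FINDINGS group the assemblies by N50 scaffold size. As a minimal sketch of the standard definition of that statistic (not the project's own pipeline; the function name `n50` and the toy scaffold lengths are illustrative assumptions), N50 is the length of the scaffold at which the cumulative length of scaffolds, sorted from longest to shortest, first reaches half of the total assembly length:

```python
def n50(lengths):
    """Return the N50 of an iterable of scaffold lengths (in bp)."""
    # Sort scaffolds from longest to shortest.
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        # The first scaffold that pushes the cumulative length to at
        # least half of the assembly defines the N50.
        if running >= half_total:
            return length
    raise ValueError("no scaffold lengths given")


# Toy example (hypothetical lengths, not real assembly data).
toy_scaffolds = [80, 70, 50, 40, 30, 20]
print(n50(toy_scaffolds))
```

In the toy example the total length is 290, so the halfway point of 145 is reached by the second-longest scaffold. A higher N50 indicates that more of the assembly sits in long scaffolds.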
Type
Other article
Permalink
https://hdl.handle.net/10161/9322
Published Version (Please cite this version)
10.1186/2047-217X-3-26
Scholars@Duke
Erich David Jarvis
Adjunct Professor in the Department of Neurobiology
Dr. Jarvis' laboratory studies the neurobiology of vocal communication. Emphasis is
placed on the molecular pathways involved in the perception and production of learned
vocalizations. They use an integrative approach that combines behavioral, anatomical,
electrophysiological and molecular biological techniques. The main animal model used
is songbirds, one of the few vertebrate groups that evolved the ability to learn vocalizations.
The generality of the discoveries is tested in other vocal lear
