Hybrid error correction and de novo assembly of single-molecule sequencing reads.

Abstract

Single-molecule sequencing instruments can generate multikilobase sequences with the potential to greatly improve genome and transcriptome assembly. However, the error rates of single-molecule reads are high, which has limited their use thus far to resequencing bacteria. To address this limitation, we introduce a correction algorithm and assembly strategy that uses short, high-fidelity sequences to correct the error in single-molecule sequences. We demonstrate the utility of this approach on reads generated by a PacBio RS instrument from phage, prokaryotic and eukaryotic whole genomes, including the previously unsequenced genome of the parrot Melopsittacus undulatus, as well as for RNA-Seq reads of the corn (Zea mays) transcriptome. Our long-read correction achieves >99.9% base-call accuracy, leading to substantially better assemblies than current sequencing strategies: in the best example, the median contig size was quintupled relative to high-coverage, second-generation assemblies. Greater gains are predicted if read lengths continue to increase, including the prospect of single-contig bacterial chromosome assembly.
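The core idea in the abstract, using accurate short reads to fix errors in a long read, can be illustrated with a toy majority-vote sketch. This is not the paper's algorithm or implementation: the function name `correct_long_read`, the `(offset, short_read)` input format, and the strict-majority rule are all assumptions made for illustration. The sketch also assumes the short reads are already aligned to the long read without gaps, so it handles substitutions only and ignores the insertions and deletions a real pipeline would have to deal with.

```python
from collections import Counter


def correct_long_read(long_read, short_alignments):
    """Toy substitution-only correction by per-position majority vote.

    long_read: the noisy long read, as a string of bases.
    short_alignments: list of (offset, short_read) pairs; each short read
        is assumed (hypothetically) to be aligned gap-free at that offset.
    """
    # One vote counter per position of the long read.
    votes = [Counter() for _ in long_read]
    for offset, short in short_alignments:
        for i, base in enumerate(short):
            pos = offset + i
            if 0 <= pos < len(long_read):
                votes[pos][base] += 1

    corrected = []
    for original, counter in zip(long_read, votes):
        if counter:
            base, count = counter.most_common(1)[0]
            # Replace the base only if a strict majority of the covering
            # short reads agree; otherwise keep the original call.
            if count * 2 > sum(counter.values()):
                corrected.append(base)
                continue
        # Positions with no coverage or no majority are left unchanged.
        corrected.append(original)
    return "".join(corrected)


# Usage: the long read carries one substitution error (T instead of A at index 4).
noisy = "ACGTTCGTAC"
shorts = [(0, "ACGTAC"), (2, "GTACGT"), (4, "ACGTAC")]
fixed = correct_long_read(noisy, shorts)
```

Positions not covered by any short read are left as called, which is why the depth and evenness of short-read coverage matter for any scheme of this kind.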

Citation

Published Version (Please cite this version)

10.1038/nbt.2280

Publication Info

Koren, Sergey, Michael C Schatz, Brian P Walenz, Jeffrey Martin, Jason T Howard, Ganeshkumar Ganapathy, Zhong Wang, David A Rasko, et al. (2012). Hybrid error correction and de novo assembly of single-molecule sequencing reads. Nat Biotechnol, 30(7). pp. 693–700. 10.1038/nbt.2280 Retrieved from https://hdl.handle.net/10161/9301.

This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.

Scholars@Duke

Erich David Jarvis

Adjunct Professor in the Department of Neurobiology

Dr. Jarvis' laboratory studies the neurobiology of vocal communication. Emphasis is placed on the molecular pathways involved in the perception and production of learned vocalizations. The laboratory uses an integrative approach that combines behavioral, anatomical, electrophysiological and molecular biological techniques. The main animal model is songbirds, one of the few vertebrate groups that evolved the ability to learn vocalizations. The generality of the discoveries is tested in other vocal learning orders, such as parrots and hummingbirds, as well as in non-vocal learners, such as pigeons and non-human primates. Some of the questions require performing behavioral and molecular biology experiments in freely ranging animals, such as hummingbirds in the tropical forests of Brazil. Recent results show that in songbirds, parrots and hummingbirds, perception and production of song are accompanied by anatomically distinct patterns of gene expression. All three groups were found to exhibit vocally activated gene expression in exactly 7 forebrain nuclei that are very similar to each other. These structures for vocal learning and production are thought to have evolved independently within the past 70 million years, since they are absent from interrelated non-vocal learning orders. One structure, Area X of the basal ganglia's striatum in songbirds, shows large differential gene activation depending on the social context in which the bird sings. These differences may reflect a semantic content of song, perhaps similar to human language.

The overall goal of the research is to advance knowledge of the neural mechanisms for vocal learning and of basic mechanisms of brain function. These goals are further advanced through collaborations with the laboratories of Drs. Mooney and Nowicki at Duke University, who study, respectively, behavioral and electrophysiological aspects of songbird vocal communication.


Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.