Hybrid error correction and de novo assembly of single-molecule sequencing reads.

dc.contributor.author Ganapathy, G
dc.contributor.author Howard, JT
dc.contributor.author Jarvis, Erich David
dc.contributor.author Koren, Sergey
dc.contributor.author Martin, J
dc.contributor.author McCombie, WR
dc.contributor.author Phillippy, Adam M
dc.contributor.author Rasko, DA
dc.contributor.author Schatz, MC
dc.contributor.author Walenz, BP
dc.contributor.author Wang, Z
dc.coverage.spatial United States
dc.date.accessioned 2014-12-15T16:41:34Z
dc.date.issued 2012-07-01
dc.identifier http://www.ncbi.nlm.nih.gov/pubmed/22750884
dc.identifier nbt.2280
dc.identifier.uri https://hdl.handle.net/10161/9301
dc.description.abstract Single-molecule sequencing instruments can generate multikilobase sequences with the potential to greatly improve genome and transcriptome assembly. However, the error rates of single-molecule reads are high, which has limited their use thus far to resequencing bacteria. To address this limitation, we introduce a correction algorithm and assembly strategy that uses short, high-fidelity sequences to correct the error in single-molecule sequences. We demonstrate the utility of this approach on reads generated by a PacBio RS instrument from phage, prokaryotic and eukaryotic whole genomes, including the previously unsequenced genome of the parrot Melopsittacus undulatus, as well as for RNA-Seq reads of the corn (Zea mays) transcriptome. Our long-read correction achieves >99.9% base-call accuracy, leading to substantially better assemblies than current sequencing strategies: in the best example, the median contig size was quintupled relative to high-coverage, second-generation assemblies. Greater gains are predicted if read lengths continue to increase, including the prospect of single-contig bacterial chromosome assembly.
dc.language eng
dc.relation.ispartof Nat Biotechnol
dc.relation.isversionof 10.1038/nbt.2280
dc.subject Algorithms
dc.subject Bacteria
dc.subject Bacteriophages
dc.subject Computational Biology
dc.subject RNA
dc.subject Sequence Analysis, RNA
dc.subject Transcriptome
dc.subject Zea mays
dc.title Hybrid error correction and de novo assembly of single-molecule sequencing reads.
dc.type Journal article
pubs.author-url http://www.ncbi.nlm.nih.gov/pubmed/22750884
pubs.begin-page 693
pubs.end-page 700
pubs.issue 7
pubs.organisational-group Basic Science Departments
pubs.organisational-group Duke
pubs.organisational-group Duke Institute for Brain Sciences
pubs.organisational-group Institutes and Provost's Academic Units
pubs.organisational-group Neurobiology
pubs.organisational-group School of Medicine
pubs.organisational-group University Institutes and Centers
pubs.publication-status Published online
pubs.volume 30
dc.identifier.eissn 1546-1696