Hybrid error correction and de novo assembly of single-molecule sequencing reads.

dc.contributor.author

Koren, Sergey

dc.contributor.author

Schatz, Michael C

dc.contributor.author

Walenz, Brian P

dc.contributor.author

Martin, Jeffrey

dc.contributor.author

Howard, Jason T

dc.contributor.author

Ganapathy, Ganeshkumar

dc.contributor.author

Wang, Zhong

dc.contributor.author

Rasko, David A

dc.contributor.author

McCombie, W Richard

dc.contributor.author

Jarvis, Erich D

dc.contributor.author

Phillippy, Adam M

dc.coverage.spatial

United States

dc.date.accessioned

2014-12-15T16:41:34Z

dc.date.issued

2012-07-01

dc.description.abstract

Single-molecule sequencing instruments can generate multikilobase sequences with the potential to greatly improve genome and transcriptome assembly. However, the error rates of single-molecule reads are high, which has limited their use thus far to resequencing bacteria. To address this limitation, we introduce a correction algorithm and assembly strategy that uses short, high-fidelity sequences to correct the error in single-molecule sequences. We demonstrate the utility of this approach on reads generated by a PacBio RS instrument from phage, prokaryotic and eukaryotic whole genomes, including the previously unsequenced genome of the parrot Melopsittacus undulatus, as well as for RNA-Seq reads of the corn (Zea mays) transcriptome. Our long-read correction achieves >99.9% base-call accuracy, leading to substantially better assemblies than current sequencing strategies: in the best example, the median contig size was quintupled relative to high-coverage, second-generation assemblies. Greater gains are predicted if read lengths continue to increase, including the prospect of single-contig bacterial chromosome assembly.

dc.identifier

http://www.ncbi.nlm.nih.gov/pubmed/22750884

dc.identifier

nbt.2280

dc.identifier.eissn

1546-1696

dc.identifier.uri

https://hdl.handle.net/10161/9301

dc.language

eng

dc.publisher

Springer Science and Business Media LLC

dc.relation.ispartof

Nat Biotechnol

dc.relation.isversionof

10.1038/nbt.2280

dc.subject

Algorithms

dc.subject

Bacteria

dc.subject

Bacteriophages

dc.subject

Computational Biology

dc.subject

RNA

dc.subject

Sequence Analysis, RNA

dc.subject

Transcriptome

dc.subject

Zea mays

dc.title

Hybrid error correction and de novo assembly of single-molecule sequencing reads.

dc.type

Journal article

pubs.author-url

http://www.ncbi.nlm.nih.gov/pubmed/22750884

pubs.begin-page

693

pubs.end-page

700

pubs.issue

7

pubs.organisational-group

Basic Science Departments

pubs.organisational-group

Duke

pubs.organisational-group

Duke Institute for Brain Sciences

pubs.organisational-group

Institutes and Provost's Academic Units

pubs.organisational-group

Neurobiology

pubs.organisational-group

School of Medicine

pubs.organisational-group

University Institutes and Centers

pubs.publication-status

Published online

pubs.volume

30

Files

Original bundle

Name:
Hybrid error correction and de novo assembly of single-molecule sequencing reads..pdf
Size:
909.39 KB
Format:
Adobe Portable Document Format
Description:
Accepted version