A comprehensive infrastructure for big data in cancer research: Accelerating cancer research and precision medicine

dc.contributor.author

Hinkson, Izumi V

dc.contributor.author

Davidsen, Tanja M

dc.contributor.author

Klemm, Juli D

dc.contributor.author

Kerlavage, Anthony R

dc.contributor.author

Kibbe, Warren A

dc.contributor.author

Chandramouliswaran, Ishwar

dc.date.accessioned

2017-10-30T16:36:02Z

dc.date.available

2017-10-30T16:36:02Z

dc.date.issued

2017-09-21

dc.description.abstract

© 2017 Hinkson, Davidsen, Klemm, Kerlavage and Kibbe. Advancements in next-generation sequencing and other -omics technologies are accelerating the detailed molecular characterization of individual patient tumors, and driving the evolution of precision medicine. Cancer is no longer considered a single disease, but rather, a diverse array of diseases wherein each patient has a unique collection of germline variants and somatic mutations. Molecular profiling of patient-derived samples has led to a data explosion that could help us understand the contributions of environment and germline to risk, therapeutic response, and outcome. To maximize the value of these data, an interdisciplinary approach is paramount. The National Cancer Institute (NCI) has initiated multiple projects to characterize tumor samples using multi-omic approaches. These projects harness the expertise of clinicians, biologists, computer scientists, and software engineers to investigate cancer biology and therapeutic response in multidisciplinary teams. Petabytes of cancer genomic, transcriptomic, epigenomic, proteomic, and imaging data have been generated by these projects. To address the data analysis challenges associated with these large datasets, the NCI has sponsored the development of the Genomic Data Commons (GDC) and three Cloud Resources. The GDC ensures data and metadata quality, ingests and harmonizes genomic data, and securely redistributes the data. During their pilot phase, the Cloud Resources tested multiple cloud-based approaches for enhancing data access, collaboration, computational scalability, resource democratization, and reproducibility. These NCI-led efforts are continuously being refined to better support open data practices and precision oncology, and to serve as building blocks of the NCI Cancer Research Data Commons.

dc.identifier.eissn

2296-634X

dc.identifier.uri

https://hdl.handle.net/10161/15685

dc.publisher

Frontiers Media SA

dc.relation.ispartof

Frontiers in Cell and Developmental Biology

dc.relation.isversionof

10.3389/fcell.2017.00083

dc.title

A comprehensive infrastructure for big data in cancer research: Accelerating cancer research and precision medicine

dc.type

Journal article

duke.contributor.orcid

Kibbe, Warren A|0000-0001-5622-7659

pubs.issue

SEP

pubs.organisational-group

Basic Science Departments

pubs.organisational-group

Biostatistics & Bioinformatics

pubs.organisational-group

Duke

pubs.organisational-group

School of Medicine

pubs.publication-status

Published

pubs.volume

5

Files

Original bundle

Name:
A_Comprehensive_Infrastruct....pdf
Size:
5.97 KB
Format:
Adobe Portable Document Format
Description:
Published version