A comprehensive infrastructure for big data in cancer research: Accelerating cancer research and precision medicine
Abstract
© 2017 Hinkson, Davidsen, Klemm, Kerlavage and Kibbe.
Advancements in next-generation
sequencing and other -omics technologies are accelerating the detailed molecular characterization
of individual patient tumors, and driving the evolution of precision medicine. Cancer
is no longer considered a single disease, but rather, a diverse array of diseases
wherein each patient has a unique collection of germline variants and somatic mutations.
Molecular profiling of patient-derived samples has led to a data explosion that could
help us understand the contributions of environment and germline to risk, therapeutic
response, and outcome. To maximize the value of these data, an interdisciplinary approach
is paramount. The National Cancer Institute (NCI) has initiated multiple projects
to characterize tumor samples using multi-omic approaches. These projects harness
the expertise of clinicians, biologists, computer scientists, and software engineers
to investigate cancer biology and therapeutic response in multidisciplinary teams.
Petabytes of cancer genomic, transcriptomic, epigenomic, proteomic, and imaging data
have been generated by these projects. To address the data analysis challenges associated
with these large datasets, the NCI has sponsored the development of the Genomic Data
Commons (GDC) and three Cloud Resources. The GDC ensures data and metadata quality,
ingests and harmonizes genomic data, and securely redistributes the data. During their
pilot phase, the Cloud Resources tested multiple cloud-based approaches for enhancing
data access, collaboration, computational scalability, resource democratization, and
reproducibility. These NCI-led efforts are continuously being refined to better support
open data practices and precision oncology, and to serve as building blocks of the
NCI Cancer Research Data Commons.
Type
Journal article
Permalink
https://hdl.handle.net/10161/15685
Published Version (Please cite this version)
10.3389/fcell.2017.00083
Publication Info
Hinkson, IV; Davidsen, TM; Klemm, JD; Kerlavage, AR; & Kibbe, Warren Alden (2017). A comprehensive infrastructure for big data in cancer research: Accelerating cancer research and precision medicine. Frontiers in Cell and Developmental Biology, 5(SEP). 10.3389/fcell.2017.00083. Retrieved from https://hdl.handle.net/10161/15685.
This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.
Scholars@Duke
Warren Alden Kibbe
Professor in Biostatistics and Bioinformatics
Warren A. Kibbe, PhD, is chief for Translational Biomedical Informatics in the Department
of Biostatistics and Bioinformatics and Chief Data Officer for the Duke Cancer Institute.
He joined the Duke University School of Medicine in August after serving as the acting
deputy director of the National Cancer Institute (NCI) and director of the NCI’s Center
for Biomedical Informatics and Information Technology, where he oversaw 60 federal
employees and more than 600 contractors, and serv
