A comprehensive infrastructure for big data in cancer research: Accelerating cancer research and precision medicine

Abstract

© 2017 Hinkson, Davidsen, Klemm, Kerlavage and Kibbe.

Advancements in next-generation sequencing and other -omics technologies are accelerating the detailed molecular characterization of individual patient tumors and driving the evolution of precision medicine. Cancer is no longer considered a single disease but rather a diverse array of diseases in which each patient has a unique collection of germline variants and somatic mutations. Molecular profiling of patient-derived samples has led to a data explosion that could help us understand the contributions of environment and germline to risk, therapeutic response, and outcome. To maximize the value of these data, an interdisciplinary approach is paramount. The National Cancer Institute (NCI) has initiated multiple projects to characterize tumor samples using multi-omic approaches. These projects bring together clinicians, biologists, computer scientists, and software engineers in multidisciplinary teams to investigate cancer biology and therapeutic response. Together, the projects have generated petabytes of cancer genomic, transcriptomic, epigenomic, proteomic, and imaging data. To address the data analysis challenges associated with these large datasets, the NCI has sponsored the development of the Genomic Data Commons (GDC) and three Cloud Resources. The GDC ensures data and metadata quality, ingests and harmonizes genomic data, and securely redistributes the data. During their pilot phase, the Cloud Resources tested multiple cloud-based approaches for enhancing data access, collaboration, computational scalability, resource democratization, and reproducibility. These NCI-led efforts are continuously being refined to better support open data practices and precision oncology, and to serve as building blocks of the NCI Cancer Research Data Commons.

Citation

Published Version (Please cite this version)

10.3389/fcell.2017.00083

Publication Info

Hinkson, Izumi V, Tanja M Davidsen, Juli D Klemm, Anthony R Kerlavage, Warren A Kibbe and Ishwar Chandramouliswaran (2017). A comprehensive infrastructure for big data in cancer research: Accelerating cancer research and precision medicine. Frontiers in Cell and Developmental Biology, 5(SEP). 10.3389/fcell.2017.00083 Retrieved from https://hdl.handle.net/10161/15685.

This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.

Scholars@Duke

Warren Alden Kibbe

Professor in Biostatistics & Bioinformatics

Warren A. Kibbe, PhD, is chief for Translational Biomedical Informatics in the Department of Biostatistics and Bioinformatics and Chief Data Officer for the Duke Cancer Institute. He joined the Duke University School of Medicine in August after serving as acting deputy director of the National Cancer Institute (NCI) and as director of the NCI’s Center for Biomedical Informatics and Information Technology, where he oversaw 60 federal employees and more than 600 contractors. As acting deputy director, Dr. Kibbe was involved in the myriad activities that NCI oversees as a research organization, as a convening body for cancer research, and as a major funder of cancer research, funding nearly $4 billion (US) annually in cancer research throughout the United States.

Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.