Screening the human exome: a comparison of whole genome and whole transcriptome sequencing.

Abstract

BACKGROUND: There is considerable interest in the development of methods to efficiently identify all coding variants present in large sample sets of humans. Three approaches are possible: whole-genome sequencing, whole-exome sequencing using exon capture methods, and RNA-Seq. While whole-genome sequencing is the most complete, it remains sufficiently expensive that cost-effective alternatives are important. RESULTS: Here we provide a systematic exploration of how well RNA-Seq can identify human coding variants by comparing variants identified through high-coverage whole-genome sequencing to those identified by high-coverage RNA-Seq in the same individual. This comparison allowed us to directly evaluate the sensitivity and specificity of RNA-Seq in identifying coding variants, and to evaluate how key parameters such as the degree of coverage and the expression levels of genes interact to influence performance. We find that although only 40% of exonic variants identified by whole-genome sequencing were captured using RNA-Seq, this number rose to 81% when concentrating on genes known to be well-expressed in the source tissue. We also find that a high false positive rate can be problematic when working with RNA-Seq data, especially at higher levels of coverage. CONCLUSIONS: We conclude that as long as a tissue relevant to the trait under study is available and suitable quality control screens are implemented, RNA-Seq is a fast and inexpensive alternative approach for finding coding variants in genes with sufficiently high expression levels.
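The kind of comparison the abstract describes can be illustrated with a minimal sketch. It assumes the whole-genome calls are treated as the reference set and that a variant is identified by chromosome, position, reference allele, and alternate allele. The `Variant` type, the `compare_calls` function, and the toy call sets are hypothetical and are not taken from the paper; the numbers they produce are unrelated to the reported 40% and 81%.

```python
from typing import NamedTuple, Set, Tuple


class Variant(NamedTuple):
    """Hypothetical variant key: chromosome, position, ref and alt alleles."""
    chrom: str
    pos: int
    ref: str
    alt: str


def compare_calls(wgs: Set[Variant], rnaseq: Set[Variant]) -> Tuple[float, float]:
    """Return (sensitivity, unconfirmed_fraction) for RNA-Seq calls.

    sensitivity: fraction of WGS variants also called by RNA-Seq.
    unconfirmed_fraction: fraction of RNA-Seq calls absent from WGS,
    a rough stand-in for a false positive rate under the assumption
    that the WGS calls are the reference.
    """
    shared = wgs & rnaseq
    rna_only = rnaseq - wgs
    sensitivity = len(shared) / len(wgs) if wgs else 0.0
    unconfirmed_fraction = len(rna_only) / len(rnaseq) if rnaseq else 0.0
    return sensitivity, unconfirmed_fraction


# Toy, made-up call sets for illustration only.
wgs_calls = {
    Variant("chr1", 100, "A", "G"),
    Variant("chr1", 250, "C", "T"),
    Variant("chr2", 75, "G", "A"),
    Variant("chr3", 900, "T", "C"),
}
rnaseq_calls = {
    Variant("chr1", 100, "A", "G"),
    Variant("chr2", 75, "G", "A"),
    Variant("chr5", 42, "C", "G"),
}

sensitivity, unconfirmed = compare_calls(wgs_calls, rnaseq_calls)
print(f"sensitivity={sensitivity:.2f}, unconfirmed={unconfirmed:.2f}")
```

Restricting both sets to variants in well-expressed genes before calling `compare_calls` would give the expression-stratified version of the same calculation.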

Published Version (Please cite this version)

10.1186/gb-2010-11-5-r57

Publication Info

Cirulli, Elizabeth T, Abanish Singh, Kevin V Shianna, Dongliang Ge, Jason P Smith, Jessica M Maia, Erin L Heinzen, James J Goedert, et al. (2010). Screening the human exome: a comparison of whole genome and whole transcriptome sequencing. Genome Biol, 11(5). p. R57. 10.1186/gb-2010-11-5-r57 Retrieved from https://hdl.handle.net/10161/4395.

This is constructed from limited available data and may be imprecise. To cite this article, please review & use the official citation provided by the journal.

Scholars@Duke

Abanish Singh

Assistant Professor in Psychiatry and Behavioral Sciences

With a unique skill set resulting from outstanding training, I have made it my sole aim to help improve human health through cutting-edge translational research. Specifically, I have been interested in illuminating the mechanisms underlying the causes and progression of the leading public health conditions, which may help with the development and enhancement of precision medicine. As part of this endeavor, I also became interested in studying the measurement of biobehavioral risk factors and environmental stressors, and their interactions with genes that may influence cardiovascular disease (CVD) risk factors and endophenotypes, adversely affecting the CVD pathways.

I came to medical research with early training in computational biology, high-throughput genomics, next-gen DNA sequencing, genome-wide studies, and big data analytics, which resulted in some prominent findings on the human genome (PMID: 18048317, PMID: 20223737, PMID: 20598109, PMID: 21703177). These findings included a significant contribution, made during my postdoctoral fellowship with Dr. David Goldstein at the Duke Center for Human Genome Variation, to the scientific community’s understanding of how well RNA-Seq can identify human coding variants using only a small fraction of the genome (the transcriptome) as compared to the whole genome (PMID: 20598109). This work was important not only scientifically, but also in pragmatic terms, given the high cost of sequencing.

In relatively recent work, I discovered a novel CVD risk gene, EBF1, in which a common genetic variant contributed to inter-individual differences in human central obesity, fasting blood glucose, diabetes, and CVD risk factors in the presence of chronic psychosocial stress (PMID: 25271088). This work demonstrated a significant, variant-specific path from chronic psychosocial stress to common carotid intimal–media thickness (CCIMT), a surrogate marker for atherosclerosis, via central obesity and fasting glucose. I also developed an algorithm to create a synthetic measure of stress using the proxy indicators of its components (PMID: 26202568). Other more recent work has elucidated race-, sex-, and age-related differences in the EBF1 gene-by-stress interaction (PMID: 33077726), which suggests the need for careful evaluation of environmental measures in different ethnicities in cross-ethnic gene-by-stress interaction studies.

More recently, I have expanded my research interests to include the genetic architecture of Alzheimer’s disease (AD) and the role of psychosocial stress in modifying the effect of genetic variants on disease risk.


Unless otherwise indicated, scholarly articles published by Duke faculty members are made available here with a CC-BY-NC (Creative Commons Attribution Non-Commercial) license, as enabled by the Duke Open Access Policy. If you wish to use the materials in ways not already permitted under CC-BY-NC, please consult the copyright owner. Other materials are made available here through the author’s grant of a non-exclusive license to make their work openly accessible.