Browsing by Author "Dave, Sandeep S"
Item Open Access A Cloud-Based Infrastructure for Cancer Genomics (2020) Panea, Razvan Ioan
The advent of new genomic approaches, particularly next generation sequencing (NGS), has resulted in explosive growth of biological data. As the size of biological data keeps growing at exponential rates, new methods for data management and data processing are becoming essential in bioinformatics and computational biology. Indeed, data analysis has now become the central challenge in genomics.
NGS has provided rich tools for defining genomic alterations that cause cancer. The processing time and computing requirements have now become a serious bottleneck to the characterization and analysis of these genomic alterations. Moreover, as the adoption of NGS continues to increase, the computing power required often exceeds what any single institution can provide, leading to major restraints in the type and number of analyses that can be performed.
Cloud computing represents a potential solution to this problem. On a cloud platform, computing resources can be available on-demand, thus allowing users to implement scalable and highly parallel methods. However, few centralized frameworks exist to allow the average researcher the ability to apply bioinformatics workflows using cloud resources. Moreover, bioinformatics approaches are associated with multiple processing challenges, such as the variability in the methods or data used and the reproducibility requirements of the research analysis.
Here, we present CloudConductor, a software system that is specifically designed to harness the power of cloud computing to perform complex analysis pipelines on large biological datasets. CloudConductor was designed with five central features in mind: scalability, modularity, parallelism, reproducibility and platform agnosticism.
We demonstrate the processing power afforded by CloudConductor on a real-world genomics problem. Using CloudConductor, we processed and analyzed 101 whole genome tumor-normal paired samples from Burkitt lymphoma subtypes to identify novel genomic alterations. We identified a total of 72 driver genes associated with the disease. Somatic events were identified in both coding and non-coding regions of nearly all driver genes, notably in genes IGLL5, BACH2, SIN3A, and DNMT1. We have developed the analysis framework by implementing a graphical user interface, a back-end database system, a data loader and a workflow management system.
In this thesis, we develop the concepts and describe an implementation of automated cloud-based infrastructure to analyze genomics data, creating a fast and efficient analysis resource for genomics researchers.
Item Open Access Analysis of Epstein-Barr virus-regulated host gene expression changes through primary B-cell outgrowth reveals delayed kinetics of latent membrane protein 1-mediated NF-κB activation. (Journal of virology, 2012-10) Price, Alexander M; Tourigny, Jason P; Forte, Eleonora; Salinas, Raul E; Dave, Sandeep S; Luftig, Micah A
Epstein-Barr virus (EBV) is an oncogenic human herpesvirus that dramatically reorganizes host gene expression to immortalize primary B cells. In this study, we analyzed EBV-regulated host gene expression changes following primary B-cell infection, both during initial proliferation and through transformation into lymphoblastoid cell lines (LCLs). While most EBV-regulated mRNAs were changed during the transition from resting, uninfected B cells through initial B-cell proliferation, a substantial number of mRNAs changed uniquely from early proliferation through LCL outgrowth. We identified constitutively and dynamically EBV-regulated biological processes, protein classes, and targets of specific transcription factors. Early after infection, genes associated with proliferation, stress responses, and the p53 pathway were highly enriched. However, the transition from early to long-term outgrowth was characterized by genes involved in the inhibition of apoptosis, the actin cytoskeleton, and NF-κB activity. It was previously thought that the major viral protein responsible for NF-κB activation, latent membrane protein 1 (LMP1), is expressed within 2 days after infection. Our data indicate that while this is true, LCL-level LMP1 expression and NF-κB activity are not evident until 3 weeks after primary B-cell infection. Furthermore, heterologous NF-κB activation during the first week after infection increased the transformation efficiency, while early NF-κB inhibition had no effect on transformation. Rather, inhibition of NF-κB was not toxic to EBV-infected cells until LMP1 levels and NF-κB activity were high.
These data collectively highlight the dynamic nature of EBV-regulated host gene expression and support the notion that early EBV-infected proliferating B cells have a fundamentally distinct growth and survival phenotype from that of LCLs.
Item Open Access Characterizing Genetic Drivers of Lymphoma through High-Throughput Sequencing (2016) Zhang, Jenny
The advent of next-generation sequencing, now nearing a decade in age, has enabled, among other capabilities, measurement of genome-wide sequence features at unprecedented scale and resolution.
In this dissertation, I describe work to understand the genetic underpinnings of non-Hodgkin’s lymphoma through exploration of the epigenetics of its cell of origin, initial characterization and interpretation of driver mutations, and finally, a larger-scale, population-level study that incorporates mutation interpretation with clinical outcome.
In the first research chapter, I describe genomic characteristics of lymphomas through the lens of their cells of origin. Just as many other cancers, such as breast cancer or lung cancer, are categorized based on their cell of origin, lymphoma subtypes can be examined through the context of their normal B Cells of origin, Naïve, Germinal Center, and post-Germinal Center. By applying integrative analysis of the epigenetics of normal B Cells of origin through chromatin-immunoprecipitation sequencing, we find that differences in normal B Cell subtypes are reflected in the mutational landscapes of the cancers that arise from them, namely Mantle Cell, Burkitt, and Diffuse Large B-Cell Lymphoma.
In the next research chapter, I describe our first endeavor into understanding the genetic heterogeneity of Diffuse Large B Cell Lymphoma, the most common form of non-Hodgkin’s lymphoma, which affects 100,000 patients in the world. Through whole-genome sequencing of 1 case as well as whole-exome sequencing of 94 cases, we characterize the most recurrent genetic features of DLBCL and lay the groundwork for a larger study.
In the last research chapter, I describe work to characterize and interpret the whole exomes of 1001 cases of DLBCL in the largest single-cancer study to date. This highly-powered study enabled sub-gene, gene-level, and gene-network level understanding of driver mutations within DLBCL. Moreover, matched genomic and clinical data enabled the connection of these driver mutations to clinical features such as treatment response or overall survival. As sequencing costs continue to drop, whole-exome sequencing will become a routine clinical assay, and another diagnostic dimension in addition to existing methods such as histology. However, to unlock the full utility of sequencing data, we must be able to interpret it. This study undertakes a first step in developing the understanding necessary to uncover the genomic signals of DLBCL hidden within its exomes. However, beyond the scope of this one disease, the experimental and analytical methods can be readily applied to other cancer sequencing studies.
Thus, this dissertation leverages next-generation sequencing analysis to understand the genetic underpinnings of lymphoma, both by examining its normal cells of origin as well as through a large-scale study to sensitively identify recurrently mutated genes and their relationship to clinical outcome.
Item Open Access Determining the Role of DDX3X in Normal and Malignant Germinal Center B Cells (2019) Palus, Brooke
Burkitt lymphoma (BL) is an aggressive germinal center (GC) B cell-derived lymphoma. BL accounts for 40% of pediatric lymphoma cases in the United States and over half of all pediatric malignancies in sub-Saharan Africa. BL is characterized by the t(8;14) chromosomal translocation that results in MYC overexpression. Translocation of MYC alone is insufficient to induce lymphomagenesis; additional genetic mutations are required. A better functional characterization of the genetic drivers of BL will lend insight into pathways that drive Burkitt lymphomagenesis, providing an opportunity to identify novel drug targets and develop improved therapies.
In order to identify novel genetic drivers of BL our lab previously sequenced 101 BL tumors with paired normal samples. We found that DDX3X is recurrently mutated in 46% of BL tumors, making it the third most commonly mutated gene in BL. We observed that DDX3X mutations are either truncating mutations (22%) or missense mutations (88%) that cluster around the two highly conserved functional domains of the protein. Based on the non-focal distribution of missense mutations throughout the DDX3X coding sequence and the high presence of truncation mutations in BL tumors, we hypothesize that in GC B cells DDX3X acts as a tumor suppressor gene whose normal function is destroyed by BL associated mutations, facilitating lymphomagenesis.
While DDX3X is frequently mutated in many types of cancer, its role in malignancy is poorly understood. In this study we focused on elucidating the role of DDX3X in the specific context of GC B cells, from which BLs arise. To test the hypothesis that DDX3X normally functions as a tumor suppressor, we modeled DDX3X deficiency in normal and malignant GC B cells using parallel in vitro and in vivo approaches. First, we created transgenic Ddx3x deficiency mouse models with and without MYC overexpression. This approach allowed us to study DDX3X in a system that models the complexities of the immune system on a genetically defined background. Second, we used genome editing to precisely delete DDX3X expression in BL cell lines. BL is a genetically complex disease, and the use of BL cell lines allowed us to study the role of DDX3X on a genetic background typical of BL tumors. We then characterized both DDX3X loss of function models with respect to cellular processes related to tumor development. Third, we performed cross-linking immunoprecipitation with sequencing (CLIP-Seq) to identify the RNA targets of DDX3X in the germinal center.
Our study led to unexpected results regarding the contribution of DDX3X to Burkitt lymphomagenesis, and highlighted an important role for DDX3X in B cell development. We found that Ddx3x deficiency in a mouse model of BL increased the time to tumor development by reducing the global B cell population available for malignant transformation. In tandem we found that Ddx3x deficiency at the GC B cell stage significantly reduced the GC B cell population in multiple lymphoid tissues, regardless of MYC status. Interestingly, we found that Ddx3x deficiency in pre-B cells expanded the pre-B cell population but decreased the population of later B cell stages. We then confirmed that the reduction of GC B cells in response to Ddx3x deficiency was not due to defects in GC B cell migration, germinal center architecture, cell cycle progression, or apoptosis. These combined data suggest an essential function for DDX3X in B cell development, particularly at the pre-B cell and GC B cell stages.
As in the mice, in cell culture we also found that DDX3X loss did not significantly alter apoptosis or cell cycle progression. We found some evidence that DDX3X may play a role in DNA damage repair, but these results were not consistent across conditions. Lastly, we identified DDX3X targeted RNA binding partners using CLIP-Seq. Our data corroborate previously published CLIP-Seq experiments showing that DDX3X binds numerous RNAs involved in translation and RNA processing. Additionally, we identified for the first time that DDX3X binds RNAs falling in the BRCA1 and ATM gene sets. Further experimentation is needed to determine the role DDX3X plays in these pathways in relation to Burkitt lymphomagenesis.
Item Open Access Functional Drivers of Therapeutic Response in Diffuse Large B Cell Lymphoma (2018) Davis, Nicholas Samuel
Diffuse large B cell lymphoma (DLBCL) is the most common form of non-Hodgkin’s lymphoma and a leading cause of death among B cell lymphomas. The disease displays a remarkable amount of genetic and clinical heterogeneity, hampering efforts at designing effective therapeutic agents. The last major change in frontline therapy for DLBCL came in 1997 with the addition of rituximab into widespread clinical use. Much effort has been put forth to identify new therapeutic agents in DLBCL, but these have been relatively unsuccessful because the phenotypic heterogeneity renders them useful in only a small percentage of patients, and responding patients develop resistance to single agents. To find broadly effective, safe therapeutics, we must first perform deep genomic characterization of the disease to understand its true heterogeneity. We can then study those genes and pathways with genomic alteration and evaluate their effects on the disease. Using this information, we can develop relevant drug screening tools that leverage genomic technologies to better understand therapeutic interactions with the disease and use it to predict possible therapeutic synergy. In this dissertation, I utilize this three-pronged approach to identify novel functional variation in DLBCL and use it to predict and verify synergistic therapeutic combinations for therapy.
First, I focus on the genomic variation across DLBCL patients. Since DLBCL displays vast genomic heterogeneity, we assemble the largest sequencing study of the disease to date, consisting of 1,001 cases. Here, we applied exome and transcriptome sequencing to these cases with paired clinical data. We identified 150 putative driver genes that were recurrently mutated with a mean of 7.75 per sample. Using genomic alteration types, we then classified these genes as oncogenes or tumor suppressors. We then used gene expression data to classify these tumors based on the cell of origin into activated B cell like (ABC) and germinal center B cell like (GCB) subtypes, which aligned well with standardized clinical methods. We found differential mutations in 20 genes based on these subtypes. We also found significant overlaps of driver genes with both co-occurrence and mutual exclusion, suggesting subnetworks of mutation. To identify functional variation, we then use a CRISPR screen to identify putative oncogenes and tumor suppressors genome wide. Using this data in combination with matched clinical data, we then create the Genomic Risk Model for predicting single patient outcomes based on his or her genomic profile. This model outperforms current models and successfully predicts outcomes in most patients.
I then focus on placing two focal adhesion genes identified as recurrently altered in DLBCL into the functional context of the disease. The first gene is RHOA, a small GTPase that is found to be recurrently mutated in DLBCL in a hotspot specific manner. Overexpression of either wildtype RHOA or the enriched R5Q mutant form significantly increases proliferation in DLBCL cell lines. We also find that RHOA loss causes a loss of fitness in DLBCL lines, causing the cells to arrest in the G2/M phase of the cell cycle and altering cellular morphology. I then develop a mouse model that knocks out Rhoa in either the full B cell lineage or germinal center B (GCB) cells specifically. We observe a massive loss of B cells across the B lineage driven by Rhoa deletion. In the GCB restricted knockout, we find that GCB cell numbers are reduced, the dark zone to light zone regulation in the germinal center is altered, and actin is dysregulated in these cells. The next gene we modeled was focal adhesion kinase (FAK). Though FAK is not recurrently mutated in DLBCL, it is overexpressed in GCB cells and many cancers, and it is a master regulator of the focal adhesion pathway, which is overrepresented in mutation rates. Chemical inhibition and genetic knockdown of FAK in GCB cells causes cell death and a marked reduction in B cell receptor (BCR) signaling effectors. Further work in BCR signal transduction places FAK near the intracellular interface with the BCR, as the first effector molecules in the pathway have vastly reduced activity with FAK inhibition. Mouse models of Fak knockout also show a reduction in GCB cells, dysregulation of germinal centers in secondary lymph organs, and a reduction in serum levels of secreted immunoglobulins.
Lastly, we sought to understand the role of single agent therapeutics on gene expression to identify a method to use this data to inform combination therapy predictions. Using a panel of 152 FDA-approved drugs and 6 DLBCL cell lines, we screened all lines for drug efficacy. This revealed 3 classes of drugs: pan-effective, selective, and resistant. The selective drugs displayed pathway specific resistance as well as a subtype specific sensitivity for certain drugs. RNA sequencing to quantify gene expression was then performed for all drug-cell line pairs. Overall gene expression patterns show that drugs targeting similar primary targets or targets in the same pathway induced tightly correlated gene expression patterns. We also found that changes in expression of genes like MYC were correlated with sensitivity, giving possible proxies for sensitivity and mechanisms of resistance. We also found that baseline expression of the target gene was correlated with a higher dose requirement to achieve similar viability changes. Gene expression changes of the target gene were also found to be indicative of either sensitivity or resistance in some cases. We then developed a model for using gene expression to predict dual drug synergy that we are calling combination reversal of disease gene expression (cRDGE). This model accurately predicted both single agent effectiveness as well as combination synergy in previous datasets. We then validated this method across several drug combinations and found synergy between the tested combinations in vitro. We then tested one combination, panobinostat and ruxolitinib, and found it to be highly synergistic in both the cell lines tested within the study as well as a panel of cell lines representing a wide variety of B cell malignancies. We then show how ruxolitinib alone does not reduce STAT signaling in these cells at a lower dose, but sensitization of these cells with panobinostat greatly reduces STAT activity with the combination. 
Using xenograft models, we then tested this combination in vivo, finding synergy of the drug combination and low hematopoietic toxicity.
Broadly, this dissertation contributes novel findings to the fields of B cell biology, lymphoma genomics, and therapeutic screening. Cancer is an incredibly challenging entity that requires an approach that integrates interrogation of genomic alterations, exploration of functional alterations, and development of new tools to identify therapeutics. Leveraging these tools to understand the molecular basis of lymphoma, its functional variation, and therapeutic interactions, we can more accurately diagnose, give prognoses, and treat patients with the disease.
Item Open Access Integrating Genomic and Biological Understandings of Disease to Improve Patient Outcome in Mature B Cell Lymphoma (2021) Happ, Lanie E
All cancers begin as normal cells that have acquired genetic alterations allowing them to evade growth control mechanisms and proliferate uncontrollably. A better delineation of the genetic events that drive cancer and their biological consequences has the potential to enable the discovery of improved diagnostic and prognostic biomarkers and the identification of new therapeutic possibilities.
Lymphomas comprise nearly 50 distinct malignancies arising from immune cells. These cancers are recognized by the current standard classification system organized by the World Health Organization. Outcomes for these collectively common diseases have not improved in several decades. These lymphomas are classified primarily based on lineage. There are three fundamental problems that must be addressed in order to better connect genetic alterations to new treatments in any genetically heterogeneous disease such as mature B cell lymphoma. First, the most common drivers of disease must be determined through a systematic genetic analysis of patient tumors. Second, we need to better understand the interplay between the biological context of lineage and these genetic alterations. Finally, we need to identify the specific biological aspects of the genetic alterations that can be systematically targeted with drugs. In this dissertation, I present my work addressing these critical questions with the overall goal of converting our genomic and biological understanding of disease into actionable improvements in patient outcome. First, I focus on defining and characterizing the common genetic drivers of ocular adnexa marginal zone lymphoma (OAMZL), a rare but sometimes deadly lymphoma that has not previously been genomically characterized systematically.
In addition to finding alterations in genes previously implicated as drivers in other lymphomas, including TBL1XR1, CREBBP, and TNFAIP3, we identified CABIN1 as a novel tumor suppressor gene that is recurrently mutated and deleted. Experimental validation of CABIN1 as a driver in OAMZL indicates that its deletion can lead to uncontrolled cell-growth signaling in the B cell receptor and NFAT pathways that could be targeted with new drugs. This study thus provides an unbiased identification of genetically altered genes that may play a role in the development of OAMZL and serve as potential therapeutic targets in future drug development. Next, it is important to appreciate that genetic alterations driving cancer do not exert their effects in isolation. The lineage or normal cell of origin of the cancer provides critical context for understanding many aspects of cancer, including favored growth mechanisms and inherent gene activation patterns. Here, I describe how we utilized integrative genomic approaches to identify and characterize the normal cell of origin of diffuse large B cell lymphoma (DLBCL). We utilized a combination of single cell RNA sequencing to define all normal cell types that exist in lymphoid tissues and RNA sequencing on bulk lymphoma tumor samples to characterize the expression profiles of these tumors. We further developed a computational approach that allows us to identify the single cell population that most closely resembles RNA sequencing data from bulk tumors. Using this approach, we identified candidate normal immune cell populations that most closely resemble the bulk tumor sample for different DLBCL subtypes. This work provides important clues to the underlying biology of the specific normal cell types that shape the biological consequences of various genomic alterations in the corresponding cancers. 
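The abstract does not describe the matching method itself. As a rough illustration only, assuming each single cell population is summarized by a mean expression vector over a shared gene list and that similarity is measured by Pearson correlation (both assumptions, as are all names below), finding the population that most closely resembles a bulk tumor profile can be sketched as:

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Assumes neither sequence is constant (otherwise the denominator is zero).
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def closest_population(bulk_profile, centroids):
    """Return (best_name, scores) for a bulk expression profile.

    bulk_profile: expression values for a fixed, ordered gene list.
    centroids: hypothetical dict mapping a population name to its mean
               expression vector over the same ordered gene list.
    """
    scores = {name: pearson(bulk_profile, vec) for name, vec in centroids.items()}
    best_name = max(scores, key=scores.get)
    return best_name, scores


# Toy, made-up numbers purely to show the call shape (not real data).
toy_centroids = {
    "naive_B": [5.0, 1.0, 0.0, 2.0],
    "GC_B": [0.0, 4.0, 5.0, 1.0],
}
toy_bulk = [0.5, 3.5, 4.8, 1.2]
best, all_scores = closest_population(toy_bulk, toy_centroids)
```

A correlation against centroids is only one of many possible similarity measures; the dissertation may well use a different one.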
Finally, the eventual goal of all basic cancer research is to translate our understanding of the genomic and biological underpinnings of cancer into clinical advances that improve overall patient survival. In the final section of this dissertation, I develop a framework to integrate different functional genomic approaches to identify novel therapeutic opportunities for DLBCL patients with poor outcomes. First, I describe how we identified a group of DLBCL patients that do not respond to current standard therapy. We then examined the spectrum of efficacy for 152 FDA-approved cancer drugs in preclinical models of DLBCL. Taken together, these data allowed us to identify a new combination therapy targeting the genomic features that are associated with low rates of response to standard therapy. Additional in vitro and in vivo experiments validated the efficacy of this proposed novel combination therapy. Overall, this dissertation contributes to our understanding of lymphoma genetics and presents a scientific framework for translating this understanding into clinical applications for improving patient outcomes. The methods and approaches described in this dissertation are broadly applicable to other types of cancer and could be used to improve clinical outcomes for other cancer patients.
Item Open Access Integrative Computational Genomics Defines the Molecular Origins and Outcomes of Lymphoma (2016) Moffitt, Andrea Barrett
Lymphomas are a heterogeneous group of hematological malignancies composed of diseases with diverse molecular origins and clinical outcomes. Derived from immune cells of lymphoid origin, lymphoma can arise from lymphoid cells present anywhere in the body, from the spleen and lymph nodes to peripheral sites like the liver and intestines. Current strategies for lymphoma diagnosis involve primarily histopathological examinations of the tumor biopsy, including cytogenetics and immunophenotyping. As more data becomes available, diagnoses may increasingly depend on genomic features that define each disease. Classification of lymphoid neoplasms is generally based on the cell of origin, or the lineage of the normal cell that the cancer is thought to arise from. Lymphomas can be classified into dozens of distinct diagnostic entities, though any two patients with the same diagnosis may have very different outcomes and molecular underpinnings, so we need to understand both the commonalities of patients with the same disease and the unique features that may require personalized treatment strategies. Patient prognosis in lymphoma depends greatly on the type of lymphoma, ranging from nearly curable diseases with over 90% five-year survival rates, to most patients dying in the first year in the worst entities. Greater clarity is needed in the role of the underlying genomics that contribute to these variable treatment responses and clinical outcomes.
Next-generation sequencing approaches allow us to delve into the molecular underpinnings of lymphomas, in order to gain insight about the origin and evolution of these diseases. High-throughput sequencing protocols allow us to examine the whole genome, exome, epigenome, or transcriptome of cancer cells in tens to hundreds of patients for each disease. As cost of sequencing is reduced, and the ability to generate more data increases, we face increasing computational challenges to both process and interpret the wealth of data available in cancer genomics. Developing efficient and effective bioinformatics tools is necessary to transform billions of sequencing reads into actionable hypotheses on the role of certain genes or biological pathways in a specific cancer type or patient.
In this dissertation, I present several strategies and applications of integrative computational genomics in lymphoma, with contributions throughout the research process, from development of initial assays and quality control strategies for the sequencing data, to joint analysis of clinical and genomic data, and finally through follow-up experimental models for lymphoma.
First, I focus on two rare T cell lymphomas, hepatosplenic T cell lymphoma (HSTL) and enteropathy associated T cell lymphoma (EATL), which are both diseases with very poor clinical outcomes and a previous dearth of knowledge on the genetic basis of the diseases. We define the somatic mutation landscape of HSTL, through application of exome sequencing and find SETD2 to be the most highly mutated gene. We further utilize the exome sequencing data to investigate copy number alterations and show a significant survival difference between cases with and without certain arm-level copy number alterations. Knockdown of SETD2 in an HSTL cell line, followed by RNA sequencing, demonstrates the role of SETD2 loss in proliferation and cell cycle changes, linking the SETD2 mutations to a potential oncogenic mechanism. Furthermore, we investigate the potentially targetable mutations in the JAK-STAT pathway and demonstrate oncogenic downstream molecular phenotypes and potential druggability of these mutations. In the enteropathy associated T cell lymphoma study, we apply exome and RNA sequencing to a large EATL cohort. Our findings show a significant role for loss of function mutations in chromatin modifiers and JAK-STAT signaling genes. EATL can be separated into two subtypes, Type I and Type II, which we show to have convergent genomic features, in the face of divergent gene expression. RNA sequencing data defines a distinct separation between the two subtypes. Delving further into the role of SETD2 in these T cell lymphomas, we generate a mouse model with a conditional knockout of SETD2 in T cells and demonstrate a role for SETD2 in altering the lineage development of T cells.
To understand more about why certain genetic abnormalities are recurrent in some disease entities and not others, we turn to the cell of origin for clues. We pair two different lymphomas, Burkitt lymphoma and mantle cell lymphoma, with their associated cells of origin, germinal center B cells and naive B cells. These closely related cell types have much in common as B cells, but from studies of their transcriptomes, we know that there are many molecular differences that distinguish the two. In this work, after looking more closely at mantle cell lymphoma genomics, we look at the underlying chromatin markers that define the epigenomes of these B cells. We test the association between chromatin markers and mutation rates of genes between these two cell types and lymphomas, and find that genes with more open chromatin may have a higher mutation rate, when comparing closely related cells and lymphomas. Finally, I present my work on developing an RNA sequencing based strategy for defining the complete transcriptome of diffuse large B cell lymphoma (DLBCL). Gene expression profiling with microarray has shown the existence of two subtypes in DLBCL, activated B cell like (ABC) and germinal center B cell like (GCB). However, the role of non-coding RNAs, alternative splicing, and mutations in these two subtypes and the larger group was previously not well understood. We develop a strand-specific RNA sequencing strategy that will allow the investigation of the total RNA transcriptome in DLBCL, including microRNAs, lncRNAs, and other important non-coding RNAs. Furthermore, we show that RNA sequencing can be used to distinguish the two subtypes, including through RNA sequencing based mutation calls, as well as through differentially expressed lncRNAs that we define for the first time in DLBCL.
Broadly, this dissertation contributes novel findings in the field of lymphoma genomics, as well as presenting a framework for computational integrative genomics that can guide future studies. The heterogeneity of lymphoma across cases requires us to dive deep into individual diseases, even rare ones, as well as appreciate the similarities and differences across lymphomas. To improve diagnoses, prognoses, and treatment options, we need to understand the molecular origins of lymphoma. Using a range of molecular and computational approaches, we can move closer to true personalized medicine at the genomic level.
Item Open Access Integrative Genomics Reveals a Role for GNA13 in Lymphomagenesis (2014) Greenough, Adrienne
Lymphomas comprise a diverse group of malignancies derived from immune cells. High throughput sequencing has recently emerged as a powerful and versatile method for analysis of the cancer genome and transcriptome. As these data continue to emerge, the crucial work lies in sorting through the wealth of information to home in on the critical aspects that will give us a better understanding of biology and new insight into how to treat disease. Finding the important signals within these large data sets is one of the major challenges of next generation sequencing.
In this dissertation, I have developed several complementary strategies to describe the genetic underpinnings of lymphomas. I begin with developing a better method for RNA sequencing that enables strand-specific total RNA sequencing and alternative splicing profiling in the same analysis. I then combine this RNA sequencing technique with whole exome sequencing to better understand the global landscape of aberrations in these diseases. Finally, I use traditional cell and molecular biology techniques to define the consequences of major genetic alterations in lymphoma.
Through this analysis, I find recurrent silencing mutations in the G alpha binding protein GNA13 and associated focal adhesion proteins. I aim to describe how loss-of-function mutations in GNA13 can be oncogenic in the context of germinal center B cell biology. Using in vitro techniques including liquid chromatography-mass spectrometry and knockdown and overexpression of genes in B cell lymphoma cell lines, I determine protein binding partners and downstream effectors of GNA13. I also develop a transgenic mouse model to study the role of GNA13 in the germinal center in vivo to determine effects of GNA13 deletion on germinal center structure and cell migration.
Thus, I have developed complementary approaches that span the spectrum from discovery to context-dependent gene models that afford a better understanding of the biological function of aberrant events and ultimately result in a better understanding of disease.
Item Open Access Methods for Systematic Exploratory Analysis of Gene Expression Data with Applications to Cancer Genomics (2017) Wagner, Florian
Advances in technologies for gene expression profiling have resulted in an unprecedented abundance of gene expression data. However, computational methods available for the exploratory analysis of such data are limited in their ability to generate an interpretable overview of biologically relevant similarities and differences among samples. This work first introduces the XL-mHG test, a sensitive and specific hypothesis test for detecting gene set enrichment, and discusses its algorithmic and statistical properties. It further introduces GO-PCA, a method for exploratory analysis of gene expression data using prior knowledge. The XL-mHG test serves as a building block for GO-PCA. The output of GO-PCA consists of functional expression signatures, designed to provide an interpretable representation of biologically meaningful variation in the data. The power and versatility of the method are demonstrated on heterogeneous human and mouse expression data. Finally, applications of the proposed methods to carcinoma and lymphoma expression data aim to demonstrate their clinical relevance. The effective utilization of prior knowledge in the exploratory analysis of gene expression data through carefully designed computational methods is essential for successfully harnessing the power of current and future platforms for gene expression profiling, with the aim of generating clinically relevant insights into complex diseases such as cancer.
Item Open Access Profiling Blood Cancer Drivers through Large-Scale Genomics (2022) Kositsky, Rachel
There are 140,000 new cases of blood cancers each year in the US and even more worldwide. Understanding the molecular and genomic origins of blood cancers can refine diagnosis, predict survival, and identify appropriate treatment. Large-scale projects profiling cancer genomes prioritized common cancer types, while other blood cancer genomics studies have been completed in an ad hoc fashion. In this thesis, I describe advances made toward systematic large-scale molecular profiling of blood cancers.
To identify cancer drivers in both protein-coding and noncoding regions of the genome, I designed two novel capture panels. In my second chapter, I describe bioinformatics approaches I developed to identify accidental sample switches, refine alignment methods, and prioritize genes as potential cancer drivers.
Translocations are a major class of blood cancer drivers, which occur when two chromosomes break and repair incorrectly by fusing to each other. Previous studies using sequencing data to identify blood cancer-related translocations had only moderate sensitivity for several of the translocations compared to the clinical test. In my third chapter, I describe the development of a new translocation caller that is more sensitive to translocations in hypermutated regions that may have poor alignment to the reference genome, which is common in B-cell lymphomas.
Many patients with relapsed/refractory large B-cell lymphoma (R/R LBCL) have had success with chimeric antigen receptor T-cell (CAR-T) products approved by the FDA in 2017. However, a significant proportion of patients fail to respond to this highly expensive therapy and suffer from severe side effects while destined for poor survival. In my fourth chapter, I apply the genomic methods described earlier to identify predictors of resistance to CAR-T cell therapy in R/R LBCL. We found that complete response and survival were associated with clinical and molecular factors in the pre-treatment tumor.
Item Open Access The Epstein-Barr virus (EBV)-induced tumor suppressor microRNA MiR-34a is growth promoting in EBV-infected B cells. (Journal of virology, 2012-06) Forte, Eleonora; Salinas, Raul E; Chang, Christina; Zhou, Ting; Linnstaedt, Sarah D; Gottwein, Eva; Jacobs, Cassandra; Jima, Dereje; Li, Qi-Jing; Dave, Sandeep S; Luftig, Micah A
Epstein-Barr virus (EBV) infection of primary human B cells drives their indefinite proliferation into lymphoblastoid cell lines (LCLs). B cell immortalization depends on expression of viral latency genes, as well as the regulation of host genes. Given the important role of microRNAs (miRNAs) in regulating fundamental cellular processes, in this study, we assayed changes in host miRNA expression during primary B cell infection by EBV. We observed and validated dynamic changes in several miRNAs from early proliferation through immortalization; oncogenic miRNAs were induced, and tumor suppressor miRNAs were largely repressed. However, one miRNA described as a p53-targeted tumor suppressor, miR-34a, was strongly induced by EBV infection and expressed in many EBV- and Kaposi's sarcoma-associated herpesvirus (KSHV)-infected lymphoma cell lines. EBV latent membrane protein 1 (LMP1) was sufficient to induce miR-34a requiring downstream NF-κB activation but independent of functional p53. Furthermore, overexpression of miR-34a was not toxic in several B lymphoma cell lines, and inhibition of miR-34a impaired the growth of EBV-transformed cells. This study identifies a progrowth role for a tumor-suppressive miRNA in oncogenic-virus-mediated transformation, highlighting the importance of studying miRNA function in different cellular contexts.