Integrative Modeling of Genetic and Transcriptomic Data for the Identification of Allele-Specific Expression

dc.contributor.advisor

Allen, Andrew S

dc.contributor.advisor

Majoros, William H

dc.contributor.author

Zou, Xue

dc.date.accessioned

2024-06-06T13:44:57Z

dc.date.issued

2024

dc.department

Computational Biology and Bioinformatics

dc.description.abstract

The challenge of diagnosing rare genetic diseases persists despite advances in high-throughput sequencing. This limitation stems from an exome-centric diagnostic focus that often overlooks the influence of non-coding variants on gene expression. This research addresses the shortfall by leveraging allele-specific expression (ASE) analysis to detect cis-regulatory disruptions in gene expression, which could be pivotal for the diagnosis of non-exomic rare diseases. A novel computational framework, Bayesian Estimation of Allele Specific Transcript Integration across Exons (BEASTIE), was developed to refine ASE estimation. BEASTIE incorporates multiple heterozygous loci within a gene and rectifies phasing errors inherent in ASE detection. Comparative analyses reveal BEASTIE's enhanced accuracy over traditional methods, particularly in scenarios characterized by elevated heterozygosity and phasing errors. An advanced iteration, iBEASTIE, further incorporates error rates informed by genetic and genomic features, optimizing ASE estimations. In collaboration, quickBEAST (a C++ implementation of the BEASTIE model) was engineered, employing a subgrid algorithm to expedite the computation of ASE effect sizes. This tool proves essential for genome-wide analyses, as evidenced by its application to 1000 Genomes Project data, which aimed to map the ASE landscape and uncover novel imprinted genes. The practicality of these methods was tested in a case study of Glycogen Storage Disease (GSD) involving six probands. The integrated diagnostic pipeline, encompassing ASE, isoform, and differential expression analyses, identified a regulatory variant implicated in the disease phenotype. This finding was substantiated through CRISPR assays, verifying the computational predictions.

dc.identifier.uri

https://hdl.handle.net/10161/30883

dc.rights.uri

https://creativecommons.org/licenses/by-nc-nd/4.0/

dc.subject

Bioinformatics

dc.subject

Genetics

dc.subject

Biostatistics

dc.title

Integrative Modeling of Genetic and Transcriptomic Data for the Identification of Allele-Specific Expression

dc.type

Dissertation

duke.embargo.months

12

duke.embargo.release

2025-06-06T13:44:57Z

Files

Original bundle

Name:
Zou_duke_0066D_17854.pdf
Size:
6.11 MB
Format:
Adobe Portable Document Format
