Gordân, Raluca
Manandhar, Dinesh
2018-05-31
2019-04-26
2018
https://hdl.handle.net/10161/16843
<p>Cellular reprogramming processes remain poorly characterized at the level of genome-wide chromatin and gene expression changes. Specifically, the extent to which reprogrammed cells differ quantitatively from both the starting cells and the target cells is unknown for most reprogramming systems. In addition, direct comparisons between the genome-wide reprogramming efficiencies in systems driven by the overexpression of endogenous versus exogenous master regulator(s) are rarely performed. This thesis presents methods for comparative analyses of genome-wide gene expression and chromatin accessibility data, applied to myogenic reprogramming systems in order to assess reprogramming efficiency and generate testable hypotheses for improving the reprogramming process. First, gene expression and chromatin accessibility profiles of MyoD-induced transdifferentiated primary human skin fibroblasts are compared to those of fibroblasts and myoblasts. Second, similar genome-wide changes are assessed for myogenic conversion of iPS cells driven by overexpression of endogenous MyoD versus exogenous MyoD. Both studies show that (i) while many muscle marker genes are reprogrammed after MyoD overexpression, the genome-wide accessibility and gene expression profiles still differ from those of primary myoblast or myotube cells; (ii) MyoD induces a continuum of changes in chromatin accessibility, with only a fraction of myogenic chromatin sites gaining a completely reprogrammed accessibility status; and (iii) chromatin-remodeling deficiencies are strongly correlated with incomplete gene expression reprogramming. Classification analyses comparing reprogrammed and non-reprogrammed genes or chromatin sites revealed discriminatory genetic and epigenetic features, suggesting ways to potentially improve the reprogramming efficiency.
Genomic analysis of transgene MyoD overexpression in iPS cells, compared to endogenous MyoD activation, also showed that transgene MyoD is more “aggressive” in its chromatin opening behavior, producing a large number of off-target chromatin opening events. To further investigate the effects of chromatin remodeling events on gene expression in reprogramming studies, a novel cross-cell-type gene expression prediction framework (CPGex) is also developed. By integrating and modeling the non-linear combinatorial effects of chromatin accessibility as well as the expression levels of regulatory TFs, CPGex is able to weigh the importance of regulatory sites or factors for downstream targeted reprogramming of specific gene(s). The methods described in this thesis can be applied to any cellular reprogramming system in order to quantitatively assess the efficiency of reprogramming at the chromatin accessibility and gene expression levels, as well as to generate testable hypotheses for improved genome-wide reprogramming.</p>
Bioinformatics
Biostatistics
Artificial intelligence
comparative chromatin and gene expression analyses
CPGex
cross cell type gene expression prediction
endogenous MyoD
MyoD
myogenic reprogramming
Methods for Comparative Analysis of Chromatin Accessibility and Gene Expression, With Applications to Cellular Reprogramming
Dissertation