Using Dynamic Ensembles to Quantitatively Model Sequence-Specific RNA Cellular Activity

Limited Access
This item is unavailable until:
2026-05-19

Date

2025

Abstract

RNA molecules do not fold into a single static structure but rather sample a dynamic ensemble of various interconverting conformations. These conformations comprise the dominant ground state as well as potentially many low-populated, short-lived excited conformational states. These alternative conformations can be stabilized through various cellular inputs, such as binding of proteins, metal ions, metabolites, and post-transcriptional modifications, thereby enabling transitions between the different conformational states required for multi-step biochemical reactions. Thus, RNA excited states can modulate the binding affinities, specificities, and kinetics of several biochemical reactions, and hence determine RNA cellular function in several essential biological processes such as the sequential assembly of ribosomal complexes, switching of gene expression on or off, alternative splicing, and regulation of microRNA maturation. Despite the critical importance of these fleeting states in RNA biology, characterizing their 3D structures and biochemical properties has proven challenging, primarily due to their low abundance and short lifetimes. Furthermore, the impact of RNA sequence variations on the energetics of these rare conformational states also remains largely unexplored. Ultimately, these excited states could also play an essential role in responding and adapting to the evolutionary pressures experienced by several biological systems. Thus, we lack a comprehensive, quantitative, and predictive understanding of RNA sequences, the dynamic ensembles they fold into, and their diverse biological functions. This dissertation aims to quantitatively examine the relationships between RNA sequences and the conformational landscapes they encode, and their impact on cellular activity, using HIV-1 transactivation response element (TAR) RNA as a model system.
Determining the three-dimensional (3D) dynamic ensemble of an RNA is challenging, owing to the large number of ensemble-averaged experimental measurements required. Even more challenging is to study how the ensemble varies with sequence. To address the first challenge, we developed a general approach to determine the 3D structure of a low-populated (0.4%) and exceptionally short-lived (~2.1 ms) RNA excited state, by combining NMR residual dipolar couplings (RDCs), mutagenesis, and computational modeling using structure prediction and molecular dynamics simulation. The excited state completely remodeled the 3D structure of the stem loop in TAR (RMSD = 7.2 ± 0.9 Å), forming a surprisingly more structured and ordered conformational ensemble enriched in non-canonical mismatches. The TAR bulge is zipped and the upper helix is completely remodeled in the excited state, which impedes the formation of the motifs recognized by the corresponding protein binding partners: Tat and the super elongation complex. This provides a structural glimpse into how the alternative TAR conformation inhibits cellular transactivation during HIV-1 replication. This finding suggests that a universe of highly structured RNA excited states remains to be uncovered, populating higher energy levels of the RNA conformational landscape with unique biological and therapeutic properties. To further understand the effect of sequence on the TAR ensemble, we applied a novel NMR experiment, 1H CEST, to screen the secondary structure ensembles for a library of N = 13 TAR variants in high throughput, examining how these mutations stabilize the active or inactive conformational states in the ensemble to varying degrees. This approach allowed us to quantitatively measure the thermodynamic propensities of TAR to adopt the inactive excited state relative to the ground state as a function of sequence.
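As a rough illustration of how a conformational population maps onto a relative free energy, the sketch below converts an excited-state population into the free energy difference between excited and ground states. It assumes a simple two-state model (ground state plus one excited state) and a temperature of 298.15 K. Neither the temperature nor the function name `excited_state_free_energy` comes from the dissertation, and this is not presented as the analysis used there.

```python
import math

# Molar gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def excited_state_free_energy(p_es, temperature_k=298.15):
    """Free energy of the excited state relative to the ground state.

    Assumes a two-state equilibrium, so p_gs = 1 - p_es, and uses
    dG = -R * T * ln(p_es / p_gs). The default temperature is an
    assumption for illustration, not a value taken from the abstract.
    """
    if not 0.0 < p_es < 1.0:
        raise ValueError("p_es must be strictly between 0 and 1")
    p_gs = 1.0 - p_es
    return -R_KCAL * temperature_k * math.log(p_es / p_gs)


# The 0.4% excited-state population quoted in the abstract.
dg = excited_state_free_energy(0.004)
print(f"Relative free energy: {dg:.2f} kcal/mol")
```

Under these assumptions, a population of 0.4% corresponds to an excited state lying roughly 3 kcal/mol above the ground state. A mutation that raises the excited-state population lowers this value, which is the sense in which sequence changes shift the thermodynamic propensity.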
Our measurements revealed substantial variations in these energetics, spanning a range of ~8 kcal/mol across the variants in our library. This dataset, combined with measurements of cellular transactivation for the library of variants, was used to evaluate the performance of various commonly used RNA secondary structure prediction tools. We found that programs based on statistical learning, such as CONTRAfold, outperformed conventional physics-based approaches built on nearest-neighbor parameters in predicting both the energetics measured using NMR and the cellular transactivation. In a blind test of secondary structure prediction, these tools were also able to identify two novel naturally occurring variants of TAR with a significantly higher propensity to adopt the inactive conformational state compared to the wildtype sequence, as confirmed by corresponding NMR and cellular transactivation measurements. These results demonstrate that sequence-dependent perturbations in the TAR dynamic ensemble can be used to quantitatively predict RNA cellular activity. A large number of characterized RNA excited states are enriched in non-canonical mismatches, whose base-pair dynamics are poorly understood yet essential for building accurate ensemble models at atomic resolution. In particular, the TAR excited state is enriched in non-canonical U-U, C-C and tandem G-A/A-G mismatches. In this regard, we also examined the local base-pairing dynamics of U-U and T-T mismatches in RNA and DNA duplex contexts, respectively. Using a survey of all deposited structures in the Protein Data Bank (PDB), we identified two alternative wobble conformations as well as two intermediate conformations previously observed for the U-U ensemble.
Using quantitative analysis of nuclear Overhauser effects (NOEs), NMR spectra of the carbonyl chemical shifts, NMR relaxation dispersion (RD) measurements, and NMR RDCs, we demonstrated the presence of wobble dynamics more broadly in U-U and T-T mismatches embedded in various RNA and DNA duplex environments. Through extensive molecular dynamics simulations, we observed that interconversion between the alternative wobble conformations occurs on the fast nanosecond (ns) timescale, and we identified the potential role of the intermediate conformations in enabling transitions between the wobble geometries. Our results suggest that wobble motions are widespread in single U-U and T-T mismatches embedded in a Watson-Crick duplex, further contributing to the dynamic plasticity of mismatches and diversifying the conformational landscapes of nucleic acids.

Subjects

Biochemistry, Conformational propensities, Ensembles, NMR, RNA excited state

Citation

Geng, Ainan (2025). Using Dynamic Ensembles to Quantitatively Model Sequence-Specific RNA Cellular Activity. Dissertation, Duke University. Retrieved from https://hdl.handle.net/10161/32797.

Except where otherwise noted, student scholarship that was shared on DukeSpace after 2009 is made available to the public under a Creative Commons Attribution / Non-commercial / No derivatives (CC-BY-NC-ND) license. All rights in student work shared on DukeSpace before 2009 remain with the author and/or their designee, whose permission may be required for reuse.