Browsing by Subject "Protein design"
Item Open Access Combined Computational, Experimental, and Assay-Development Studies of Protein:Protein and Protein:Small Molecule Complexes, with Applications to the Inhibition of Enzymes and Protein:Protein Interactions (2019) Frenkel, Marcel
Despite the best efforts of both academia and the pharma industry, most non-resectable cancers remain incurable and lethal. The World Health Organization (WHO) ranks cancer as the second leading cause of death worldwide, with roughly 9.6 million deaths in 2018. Meanwhile, the emergence of antimicrobial resistance (AMR), or superbugs, is an increasingly large medical crisis, with estimates as high as 700,000 deaths for 2018 worldwide. This number is increasing rapidly. These unmet medical needs, although distinct, are intimately related by the need for better chemistry and intelligent drug design.
Both AMR and cancer could benefit from the expansion of the druggable proteome through the inhibition of protein-protein interactions (PPIs). PPIs drive both intra- and inter-cellular communication, and therefore their inhibition is vital for disease modulation. Moreover, both AMR and cancer therapeutics suffer from the rapid emergence of drug resistance: even great drugs that function perfectly at first frequently lose effectiveness a few months later.
Here, I discuss my contributions towards developing a PPI inhibitor to KRas, the most commonly activated oncogene in cancer. Through the use of OSPREY, a state-of-the-art computational protein and drug design (CPDD) software, and using KRas’ native ligand Raf-1 RBD as a starting point, we developed a super-binder with single-digit nanomolar affinity for KRas. The development and validation of this biologic inhibitor required the development of four novel biochemical assays to study binding to KRas and the inhibition of the KRas:Raf interaction.
I also discuss my contributions towards enhancing our ability to predict resistance mutations through the use of OSPREY. This work focused on novel mechanisms of resistance in the dihydrofolate reductase of Staphylococcus aureus (SaDHFR). Specifically, we investigated the role of plasmid-borne resistance genes in Staph, as well as the mechanism of resistance due to the emergence of the F98Y and V31L resistance mutations. We discovered a potential new mechanism of resistance based on the formation of a tricyclic NADPH configuration, which we have named chiral evasion.
Finally, I discuss lessons learned from benchmarking OSPREY and share observations that can be used by drug designers using CPDD tools to enhance the accuracy and predictive potential of their results.
In conclusion, a combination of OSPREY and biochemical assays was used towards overcoming two of the largest limitations in drug development that directly affect global human health: the development of PPI inhibitors and overcoming drug resistance. We identified a novel hot-spot in the KRas:Raf interface that can successfully be used to optimize the PPI and develop a biologic inhibitor to KRas. We generated models that explain the mechanism of inhibition of both V31L and F98Y in the context of chiral evasion through a tricyclic NADPH configuration, and we benchmarked OSPREY and observed features that can contribute towards the predictive accuracy of CPDD tools.
Item Open Access Computational Molecular Engineering Nucleic Acid Binding Proteins and Enzymes (2010) Reza, Faisal
Interactions between nucleic acid substrates and the proteins and enzymes that bind and catalyze them are ubiquitous and essential for reading, writing, replicating, repairing, and regulating the genomic code by the proteomic machinery. In this dissertation, computational molecular engineering furthered the elucidation of spatial-temporal interactions of natural nucleic acid binding proteins and enzymes and the creation of synthetic counterparts with structure-function interactions at predictive proficiency. We examined spatial-temporal interactions to study how natural proteins can process signals and substrates. The signals, propagated by spatial interactions between genes and proteins, can encode and decode information in the temporal domain. Natural proteins evolved through facilitating signaling, limiting crosstalk, and overcoming noise locally and globally. Findings indicate that fidelity and speed of frequency signal transmission in cellular noise was coordinated by a critical frequency, beyond which interactions may degrade or fail. The substrates, bound to their corresponding proteins, present structural information that is precisely recognized and acted upon in the spatial domain. Natural proteins evolved by coordinating substrate features with their own. Findings highlight the importance of accurate structural modeling. We explored structure-function interactions to study how synthetic proteins can complex with substrates. These complexes, composed of nucleic acid containing substrates and amino acid containing enzymes, can recognize and catalyze information in the spatial and temporal domains. Natural proteins evolved by balancing stability, solubility, substrate affinity, specificity, and catalytic activity.
Accurate computational modeling of mutants with desirable properties for nucleic acids while maintaining such balances extended molecular redesign approaches. Findings demonstrate that binding and catalyzing proteins redesigned by single-conformation and multiple-conformation approaches maintained this balance to function, often as well as or better than those found in nature. We enabled access to computational molecular engineering of these interactions through open-source practices. We examined the applications and issues of engineering nucleic acid binding proteins and enzymes for nanotechnology, therapeutics, and in the ethical, legal, and social dimensions. Findings suggest that such access and applications can make engineering biology more widely adopted, easier, more effective, and safer.
Item Open Access Computational Protein Design with Ensembles, Flexibility and Mathematical Guarantees, and its Application to Drug Resistance Prediction, and Antibody Design (2015-01-01) Gainza Cirauqui, Pablo
Proteins are involved in all of life's processes and are also responsible for many diseases. Thus, engineering proteins to perform new tasks could revolutionize many areas of biomedical research. One promising technique for protein engineering is computational structure-based protein design (CSPD). CSPD algorithms search large protein conformational spaces to approximate biophysical quantities. In this dissertation we present new algorithms to realistically and accurately model how amino acid mutations change protein structure. These algorithms model continuous flexibility, protein ensembles and positive/negative design, while providing guarantees on the output. Using these algorithms and the OSPREY protein design program we design and apply protocols for three biomedically-relevant problems: (i) prediction of new drug resistance mutations in bacteria to a new preclinical antibiotic, (ii) the redesign of llama antibodies to potentially reduce their immunogenicity for use in preclinical monkey studies, and (iii) scaffold-based anti-HIV antibody design. Experimental validation performed by our collaborators confirmed the importance of the algorithms and protocols.
Item Open Access Efficient New Computational Protein Design Algorithms, with Applications to Drug Resistance Prediction and HIV Antibody Design (2018) Ojewole, Adegoke
Proteins are essential for myriad biological functions, including DNA replication, molecular transport, catalysis, and antigen recognition. Protein function is determined by three-dimensional structure, which is largely determined by amino acid composition. The functional diversity of known proteins suggests that nature can support a much larger set of proteins than is currently available. Protein design aims to explore the space of possible proteins in order to create new proteins with novel or improved biological functions. Two key challenges in protein design, however, are the astronomically large number of possible protein sequences, along with the vast conformation space spanned by each protein. Computational structure-based protein design (CPD) enables the prediction of proteins with desired biochemical properties. A practical CPD method must not only efficiently tackle large sequence and conformation spaces but also use a computationally tractable yet biophysically realistic model of protein plasticity. To this end, I have developed algorithms that accurately and more efficiently search large sequence and conformational spaces to compute proteins that satisfy binding affinity, specificity, and stability requirements. Crucially, my algorithms maintain the state-of-the-art in protein design, namely: provable guarantees, continuous flexibility, and ensemble-based scoring. I applied my algorithms to two biomedically relevant problems: (i) prediction of drug resistance mutations that arise in response to four pre-clinical antibiotics, and (ii) the re-design of a monoclonal HIV antibody for improved potency and breadth of neutralization.
Item Open Access Efficient Partition Function Estimation in Computational Protein Design: Probabalistic Guarantees and Characterization of a Novel Algorithm (2015-05-07) Nisonoff, Hunter
By computational protein design we mean the use of computer algorithms to design new proteins or redesign existing ones. A significant challenge in this field involves computing the partition function of the ensemble of conformations that a protein can adopt. Due to the exponentially large number of possible states, there are too many conformations to explicitly count. One solution is to employ a probabilistic algorithm to estimate the number of conformations instead. In this work we implemented such an algorithm, studied its mathematical guarantees and analyzed its properties. Additionally, we proposed different approaches to improve the convergence of the algorithm.
Item Open Access Novel Computational Protein Design Algorithms with Applications to Cystic Fibrosis and HIV (2014) Roberts, Kyle Eugene
Proteins are essential components of cells and are crucial for catalyzing reactions, signaling, recognition, motility, recycling, and structural stability. This diversity of function suggests that nature is only scratching the surface of protein functional space. Protein function is determined by structure, which in turn is determined predominantly by amino acid sequence. Protein design aims to explore protein sequence and conformational space to design novel proteins with new or improved function. The vast number of possible protein sequences makes exploring the space a challenging problem.
Computational structure-based protein design (CSPD) allows for the rational design of proteins. Because of the large search space, CSPD methods must balance search accuracy and modeling simplifications. We have developed algorithms that allow for the accurate and efficient search of protein conformational space. Specifically, we focus on algorithms that maintain provability, account for protein flexibility, and use ensemble-based rankings. We present several novel algorithms for incorporating improved flexibility into CSPD with continuous rotamers. We applied these algorithms to two biomedically important design problems. We designed peptide inhibitors of the cystic fibrosis agonist CAL that were able to restore function of the vital cystic fibrosis protein CFTR. We also designed improved HIV antibodies and nanobodies to combat HIV infections.
Item Open Access Partition function estimation in computational protein design with continuous-label Markov random fields (2017-05-04) Mukund, Aditya
Proteins perform a variety of biological tasks, and drive many of the dynamic processes that make life possible. Computational structure-based protein design (CSPD) involves computing optimal sequences of amino acids with respect to particular backbones, or folds, in order to produce proteins with novel functions. In particular, it is crucial to be able to accurately model protein-protein interfaces (PPIs) in order to realize desired functionalities. Accurate modeling of PPIs raises two significant considerations. First, incorporating continuous side-chain flexibility in the design process has been shown to significantly improve the quality of designs. Second, because proteins exist as ensembles of structures, many of the properties we wish to design, including binding affinity, require the computation of ensemble properties as opposed to features of particular conformations. The bottleneck in many design algorithms that attempt to handle the ensemble nature of protein structure, including the Donald Lab’s K* algorithm, is the computation of the partition function, which is the sum of the Boltzmann-weighted energies of all the conformational states of a protein or protein-ligand complex. Protein design can be formulated as an inference problem on Markov random fields (MRFs), where each residue to be designed is represented by a node in the MRF and an edge is placed between nodes corresponding to interacting residues. Label sets on each vertex correspond to allowed flexibility in the underlying design problem.
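As a rough illustration of the partition function described above, the following minimal sketch sums Boltzmann factors over a small, explicitly enumerated list of conformation energies. It is not taken from the thesis or from OSPREY: the function names, the choice of kcal/mol units, and the default temperature are assumptions made for this example, and the ratio of bound to unbound partition functions is included only as the commonly stated form of a K*-style binding score.

```python
import math

R_KCAL = 0.0019872  # gas constant in kcal/(mol*K); units are an assumption of this sketch


def partition_function(energies, temperature=298.15):
    """Sum the Boltzmann factors exp(-E / RT) over a list of conformation energies."""
    rt = R_KCAL * temperature
    return sum(math.exp(-energy / rt) for energy in energies)


def kstar_score(complex_energies, protein_energies, ligand_energies, temperature=298.15):
    """Ratio of the bound-state partition function to the product of the unbound ones."""
    q_complex = partition_function(complex_energies, temperature)
    q_protein = partition_function(protein_energies, temperature)
    q_ligand = partition_function(ligand_energies, temperature)
    return q_complex / (q_protein * q_ligand)
```

Explicit enumeration like this is only feasible for toy inputs. The number of conformations grows exponentially with the number of flexible residues, which is why the partition function is described as the bottleneck.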
The aim of this work is to extend message-passing algorithms that estimate the partition function for Markov random fields with discrete label sets to MRFs with continuous label sets in order to compute the partition function for PPIs with continuous flexibility and continuous entropy.
Item Open Access Protein and Drug Design Algorithms Using Improved Biophysical Modeling (2016) Hallen, Mark Andrew
This thesis focuses on the development of algorithms that will allow protein design calculations to incorporate more realistic modeling assumptions. Protein design algorithms search large sequence spaces for protein sequences that are biologically and medically useful. Better modeling could improve the chance of success in designs and expand the range of problems to which these algorithms are applied. I have developed algorithms to improve modeling of backbone flexibility (DEEPer) and of more extensive continuous flexibility in general (EPIC and LUTE). I’ve also developed algorithms to perform multistate designs, which account for effects like specificity, with provable guarantees of accuracy (COMETS), and to accommodate a wider range of energy functions in design (EPIC and LUTE).
Item Open Access RNA 3D Structure Analysis and Validation, and Design Algorithms for Proteins and RNA (2015) Jain, Swati
RNA, or ribonucleic acid, is one of the three biological macromolecule types essential for all known life forms, and is a critical part of a variety of cellular processes. The well-known functions of RNA molecules include acting as carriers of genetic information in the form of mRNAs, and then assisting in translation of that information to protein molecules as tRNAs and rRNAs. In recent years, many other kinds of non-coding RNAs have been found, like miRNAs and siRNAs, that are important for gene regulation. Some RNA molecules, called ribozymes, are also known to catalyze biochemical reactions. Functions carried out by these recently discovered RNAs, coupled with the traditionally known functions of tRNAs, mRNAs, and rRNAs make RNA molecules even more crucial and essential components in biology.
Most of the functions mentioned above are carried out by RNA molecules associating themselves with proteins to form ribonucleoprotein (RNP) complexes, e.g. the ribosome or the spliceosome. RNA molecules also bind a variety of small molecules, such as metabolites, and their binding can turn on or off gene expression. These RNP complexes and small molecule binding RNAs are increasingly being recognized as potential therapeutic targets for drug design. The technique of computational structure-based rational design has been successfully used for designing drugs and inhibitors for protein function, but its potential has not been tapped for design of RNA or RNP complexes. For the success of computational structure-based design, it is important to both understand the features of RNA three-dimensional structure and develop new and improved algorithms for protein and RNA design.
This document details my thesis work that covers both the above mentioned areas. The first part of my thesis work characterizes and analyzes RNA three-dimensional structure, in order to develop new methods for RNA validation and refinement, and new tools for correction of modeling errors in already solved RNA structures. I collaborated to assemble non-redundant and quality-conscious datasets of RNA crystal structures (RNA09 and RNA11), and I analyzed the range of values occupied by the RNA backbone and base dihedral angles to improve methods for RNA structure correction, validation, and refinement in MolProbity and PHENIX. I rebuilt and corrected the pre-cleaved structure of the HDV ribozyme and parts of the 50S ribosomal subunit to demonstrate the potential of new tools and techniques to improve RNA structures and help crystallographers to make correct biological interpretations. I also extended the previous work of characterizing RNA backbone conformers by the RNA Ontology Consortium (ROC) to define new conformers using the data from the larger RNA11 dataset, supplemented by ERRASER runs that optimize data points to add new conformers or improve cluster separation.
The second part of my thesis work develops novel algorithms for structure-based protein redesign when interactions between distant residue pairs are neglected and the design problem is represented by a sparse residue interaction graph. I analyzed the sequence and energy differences caused by using sparse residue interaction graphs (using the protein redesign package OSPREY), and proposed a novel use of ensemble-based provable design algorithms to mitigate the effects caused by sparse residue interaction graphs. I collaborated to develop a novel branch-decomposition based dynamic programming algorithm, called BWM*, that returns the Global Minimum Energy Conformation (GMEC) for sparse residue interaction graphs much faster than the traditional A* search algorithm. As the final step, I used the results of my analysis of the RNA base dihedral angle and implemented the capability of RNA design and RNA structural flexibility in OSPREY. My work enables OSPREY to design not only RNA, but also simultaneously design both the RNA and the protein chains in an RNA-protein interface.
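To make the effect of a sparse residue interaction graph concrete, here is a toy sketch under stated assumptions. It is neither BWM* nor A*, and it does not use OSPREY: it simply enumerates every rotamer assignment by brute force. The energy tables, edge lists, and function names are hypothetical and chosen only so that dropping one pairwise edge changes the minimum-energy conformation.

```python
from itertools import product


def total_energy(conf, single, pair, edges):
    """Energy of one rotamer assignment: one-body terms plus pairwise terms on the given edges."""
    energy = sum(single[i][r] for i, r in enumerate(conf))
    energy += sum(pair[(i, j)][conf[i]][conf[j]] for (i, j) in edges)
    return energy


def brute_force_gmec(single, pair, edges):
    """Enumerate all rotamer assignments and return (energy, conformation) of the minimum."""
    rotamer_ranges = [range(len(energies)) for energies in single]
    return min(
        (total_energy(conf, single, pair, edges), conf)
        for conf in product(*rotamer_ranges)
    )


# Hypothetical toy problem: 3 residues with 2 rotamers each.
single = [[0.0, 0.1], [0.0, 0.5], [0.0, 0.2]]
zero = [[0.0, 0.0], [0.0, 0.0]]
pair = {(0, 1): zero, (1, 2): zero, (0, 2): [[1.0, 0.0], [0.0, 0.0]]}

full_edges = [(0, 1), (1, 2), (0, 2)]
sparse_edges = [(0, 1), (1, 2)]  # the (0, 2) interaction is neglected

full_gmec = brute_force_gmec(single, pair, full_edges)
sparse_gmec = brute_force_gmec(single, pair, sparse_edges)
# Re-score the sparse-graph answer with the full energy function to expose the gap.
sparse_rescored = total_energy(sparse_gmec[1], single, pair, full_edges)
```

In this toy case the sparse graph selects a conformation that the full energy function penalizes through the neglected (0, 2) term, which is the kind of sequence and energy difference the abstract refers to.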
Item Open Access Using Protein-Likeness to Validate Conformational Alternatives (2012) Keedy, Daniel Austin
Proteins are among the most complex entities known to science. Composed of just 20 fundamental building blocks arranged in simple linear strings, they nonetheless fold into a dizzying array of architectures that carry out the machinations of life at the molecular level.
Despite this central role in biology, we cannot reliably predict the structure of a protein from its sequence, and therefore rely on time-consuming and expensive experimental techniques to determine their structures. Although these methods can reveal equilibrium structures with great accuracy, they unfortunately mask much of the inherent molecular flexibility that enables proteins to dynamically perform biochemical tasks. As a result, much of the field of structural biology is mired in a static perspective; indeed, most attempts to naively model increased structural flexibility still end in failure.
This document details my work to validate alternative protein conformations beyond the primary or equilibrium conformation. The underlying hypothesis is that more realistic modeling of flexibility will enhance our understanding of how natural proteins function, and thereby improve our ability to design new proteins that perform desired novel functions.
During the course of my work, I used structure validation techniques to validate conformational alternatives in a variety of settings. First, I extended previous work introducing the backrub, a local, sidechain-coupled backbone motion, by demonstrating that backrubs also accompany sequence changes and therefore are useful for modeling conformational changes associated with mutations in protein design. Second, I extensively studied a new local backbone motion, helix shear, by documenting its occurrence in both crystal and NMR structures and showing its suitability for expanding conformational search space in protein design. Third, I integrated many types of local alternate conformations in an ultra-high-resolution crystal structure and discovered the combinatorial complexity that arises when adjacent flexible segments combine into networks. Fourth, I used structural bioinformatics techniques to construct smoothed, multi-dimensional torsional distributions that can be used to validate trial conformations or to propose new ones. Fifth, I participated in judging a structure prediction competition by using validation of geometrical and all-atom contact criteria to help define correctness across thousands of submitted conformations. Sixth, using similar tools plus collation of multiple comparable structures from the public database, I determined that low-energy states identified by the popular structure modeling suite Rosetta sometimes are valid conformations likely to be populated in the cell, but more often are invalid conformations attributable to artifacts in the physical/statistical hybrid energy function.
Unified by the theme of validating conformational alternatives by reference to high-quality experimental structures, my cumulative work advances our fundamental understanding of protein structural variability, and will benefit future endeavors to design useful proteins for biomedicine or industrial chemistry.