Using C-Alpha Geometry to Describe Protein Secondary Structure and Motifs
Three-dimensional atomic models from X-ray crystallography are used in a variety of research areas to understand and manipulate protein structure. Both research and application depend on the quality of these models. Low-resolution experimental data is a common problem in crystallography: it makes structures difficult to solve and makes it hard to produce the reliable models that many scientists depend on.
In this work, I develop new, automated tools for validation and correction of low-resolution structures. These tools are gathered under the name CaBLAM, for C-alpha Based Low-resolution Annotation Method. CaBLAM uses a unique, C-alpha-geometry-based parameter space to identify outliers in protein backbone geometry, and to identify secondary structure that may be masked by modeling errors.
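As a rough illustration of what a C-alpha-based geometric parameter can look like, the sketch below computes pseudo-dihedral angles from four consecutive C-alpha positions along a trace. This abstract does not state which parameters CaBLAM uses, so the choice of a pseudo-dihedral, the function names, and the coordinate format are assumptions made for illustration only. It is not the CCTBX or Phenix implementation.

```python
import math

# Hypothetical helpers for illustration; not part of CCTBX or Phenix.

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dihedral(p0, p1, p2, p3):
    """Signed dihedral angle in degrees defined by four 3D points."""
    b0 = _sub(p0, p1)
    b1 = _sub(p2, p1)
    b2 = _sub(p3, p2)
    norm = math.sqrt(_dot(b1, b1))
    if norm == 0.0:
        raise ValueError("central points coincide; dihedral is undefined")
    b1 = (b1[0] / norm, b1[1] / norm, b1[2] / norm)
    # Project b0 and b2 onto the plane perpendicular to the central axis b1.
    d0 = _dot(b0, b1)
    d2 = _dot(b2, b1)
    v = (b0[0] - d0 * b1[0], b0[1] - d0 * b1[1], b0[2] - d0 * b1[2])
    w = (b2[0] - d2 * b1[0], b2[1] - d2 * b1[1], b2[2] - d2 * b1[2])
    x = _dot(v, w)
    y = _dot(_cross(b1, v), w)
    return math.degrees(math.atan2(y, x))

def ca_pseudo_dihedrals(ca_coords):
    """Pseudo-dihedral for every window of four consecutive C-alpha atoms.

    ca_coords is an ordered list of (x, y, z) tuples, one per residue,
    assumed to come from a single chain with no gaps.
    """
    return [dihedral(*ca_coords[i:i + 4]) for i in range(len(ca_coords) - 3)]
```

A trace of N residues yields N - 3 angles. In a validation setting, values like these would be compared against distributions derived from high-quality reference structures; that comparison is not shown here.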
CaBLAM was developed in the Python programming language as part of the Phenix crystallography suite and the open CCTBX Project. It makes use of architecture and methods available in the CCTBX toolbox. Quality-filtered databases of high-resolution protein structures, especially the Top8000, were used to construct contours of expected protein behavior for CaBLAM. CaBLAM has also been integrated into the codebase for the Richardson Lab's online MolProbity validation service.
CaBLAM succeeds in providing useful validation feedback for protein structures in the 2.5-4.0 Å resolution range. This success demonstrates the relative reliability of the C-alpha trace of a protein in this resolution range. Full mainchain information can be extrapolated from the C-alpha trace, especially for regular secondary structure elements.
CaBLAM has also informed our approach to validation for low-resolution structures. One of our goals is to moderate feedback, both to reduce validation overload and to focus user attention on modeling errors that are significant and correctable. CaBLAM and the related methods that have grown around it demonstrate progress toward this goal.
This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 United States License.
Rights for Collection: Duke Dissertations