The structures of evolutionarily related proteins are remarkably alike, even when the observed sequence similarities are statistically marginal or seemingly non-existent. Similar substructures are also found in proteins for which there is no evidence of common ancestry and no similarity in global topology. Recent advances in the comparison of whole proteins, together with the comparison and analysis of their parts, have paved the way for the use of structural information in prediction and modeling, protein engineering, structure and sequence alignment, and investigations of protein evolution, among a host of other applications.
Two protein structures can be compared to show their similarities and differences. The second of the two proteins is rotated and translated so as to minimize the root-mean-square (RMS) difference between its geometry and that of the first. If swapping pairs of atoms would reduce the RMS error, this is done. Differences are shown in three ways. The simplest is a list of the atoms with the largest differences in position; this is of limited use because some parts of a protein are very flexible, that is, large geometric changes might be accompanied by only very small changes in energy. A more informative type of difference involves changes in bond lengths. Because covalent bonds have high force constants, any significant change in bond length indicates a significant change in the local energy environment. The third measure of difference is the change in hydrogen-bond energies. An individual protein often contains hundreds of hydrogen bonds, and the formation or loss of even a single one can change the heat of formation by several kcal/mol, so information about the creation or loss of a hydrogen bond can focus attention on possible problems in a structure.
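The superposition and the first two difference measures described above can be sketched in a few lines of Python. The text does not say which fitting method is used; this sketch assumes the SVD-based Kabsch algorithm, one common way to find the rotation and translation that minimize the RMS difference. The function names (kabsch_superpose, largest_deviations, bond_length_changes) and the bonds argument are hypothetical, both structures are assumed to list the same atoms in the same order as N x 3 NumPy arrays, and the atom-swapping step and hydrogen-bond energies are not covered.

```python
import numpy as np

def kabsch_superpose(reference, mobile):
    """Rotate and translate `mobile` onto `reference` (both N x 3 arrays,
    same atoms in the same order). Returns (fitted_coords, rms_difference)."""
    ref_centroid = reference.mean(axis=0)
    mob_centroid = mobile.mean(axis=0)
    P = mobile - mob_centroid        # centered mobile coordinates
    Q = reference - ref_centroid     # centered reference coordinates
    H = P.T @ Q                      # 3 x 3 covariance matrix
    U, _, Vt = np.linalg.svd(H)
    # Guard against an improper rotation (a reflection).
    d = -1.0 if np.linalg.det(Vt.T @ U.T) < 0 else 1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    fitted = P @ R.T + ref_centroid
    rms = float(np.sqrt(((fitted - reference) ** 2).sum() / len(reference)))
    return fitted, rms

def largest_deviations(reference, fitted, top=3):
    """Return (atom_index, distance) for the `top` most displaced atoms."""
    dist = np.linalg.norm(fitted - reference, axis=1)
    order = np.argsort(dist)[::-1][:top]
    return [(int(i), float(dist[i])) for i in order]

def bond_length_changes(coords_a, coords_b, bonds):
    """Change in length (b minus a) for each bonded pair of atom indices.
    `bonds` is an assumed input: a list of (i, j) index pairs."""
    bonds = np.asarray(bonds)
    i, j = bonds[:, 0], bonds[:, 1]
    len_a = np.linalg.norm(coords_a[i] - coords_a[j], axis=1)
    len_b = np.linalg.norm(coords_b[i] - coords_b[j], axis=1)
    return len_b - len_a
```

Bond lengths are internal coordinates, so bond_length_changes does not need the structures to be superposed first, whereas the per-atom list from largest_deviations is only meaningful after the fit.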
With the visualization and computer graphics tools now available, it is easy to observe and compare protein structures. To compare protein structures is to analyze two or more of them for similarity. Comparative analysis often, but not always, involves the direct alignment and superimposition of structures in three-dimensional space to reveal which parts of the structures are conserved and which parts differ.
Structure comparison is one of the fundamental techniques in protein structure analysis. The comparative approach is important in finding remote protein homologs. Because protein structures are much more highly conserved than sequences, proteins can share common structures even without detectable sequence similarity. Structure comparison can therefore often reveal distant evolutionary relationships between proteins that cannot be detected by sequence-based alignment alone. In addition, protein structure comparison is a prerequisite for classifying protein structures into different fold classes. It is also useful in evaluating protein structure prediction methods, by comparing theoretically predicted structures with experimentally determined ones. Structures can be compared manually or by eye, and this is often done in practice. However, computer algorithms automate the task and generally give more accurate and reproducible results. All structure comparison algorithms employ a scoring scheme to measure structural similarity and seek to maximize that score under various criteria. The algorithmic approaches to comparing protein geometric properties can be divided into three categories: the first superposes protein structures by minimizing intermolecular distances; the second relies on measuring intramolecular distances within each structure; and the third includes algorithms that combine the intermolecular and intramolecular approaches.
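The second category, comparison by intramolecular distances, can be illustrated with a minimal sketch that compares the internal distance matrices of two structures. It is an illustration under stated assumptions, not a specific published algorithm: the function names (distance_matrix, distance_matrix_rms) are hypothetical, and the two structures are assumed to contain the same number of already matched atoms (for example one atom per residue), given as N x 3 NumPy arrays.

```python
import numpy as np

def distance_matrix(coords):
    """All pairwise distances within one structure (N x N array)."""
    diff = coords[:, None, :] - coords[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))

def distance_matrix_rms(coords_a, coords_b):
    """RMS difference between the intramolecular distance matrices of two
    structures with the same number of pre-matched atoms (an assumption)."""
    da = distance_matrix(coords_a)
    db = distance_matrix(coords_b)
    iu = np.triu_indices(len(coords_a), k=1)   # count each atom pair once
    return float(np.sqrt(np.mean((da[iu] - db[iu]) ** 2)))
```

Because internal distances do not change under rotation or translation, this score needs no prior superposition. It is also unchanged by reflection, so a structure and its mirror image score as identical, which is one reason such measures may be combined with superposition-based ones.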