Protein structure comparison is also a basic component of methods for structure prediction and for predicting function from structure. Structure comparison methods typically have three main components:
- the score function
- the alignment method
- the background distribution used for assessing significance of comparisons
Main types of comparison measures
Sequence-dependent methods of protein structure comparison assume a strict one-to-one correspondence between target and model residues. In sequence-independent methods, structural superimposition is performed first, without reference to the sequence, and the residue correspondence is then derived from that superimposition. The sequence-independent approach is useful mainly when a model approximately captures the correct target fold but the amino-acid sequence is threaded incorrectly within that fold, for example when an alpha-helix is shifted by one turn.
Any method that relies on distance measurements between reference points in the model and their counterparts in the reference template requires prior superimposition of the model onto the template, and the results of the comparison depend on that superimposition. Finding an optimal superimposition is an ambiguous task: different solutions optimize different parameters, so all superimposition-dependent methods suffer from this ambiguity.
A superimposition that minimizes the global Root Mean Square Deviation (RMSD) of the model to the template is not necessarily the best solution, because it is often compromised by a small number of significantly deviating fragments. Superimposing only a specific subset of residues may not resolve this issue, because the choice of the subset is subjective and ambiguous.
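The sensitivity of RMSD to a few deviating residues can be illustrated with a small numerical sketch. The per-residue deviations below are invented for illustration (nine well-fitted residues and one badly placed loop residue), and the function name `rmsd_from_deviations` is an assumption, not part of any published tool.

```python
import math

def rmsd_from_deviations(deviations):
    """RMSD from per-residue distances (in Å) measured after superimposition."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))

# Hypothetical model: nine residues deviate by 0.5 Å, one loop residue by 10 Å.
deviations = [0.5] * 9 + [10.0]

overall = rmsd_from_deviations(deviations)        # all ten residues
core_only = rmsd_from_deviations(deviations[:9])  # outlier excluded

print(f"all residues: {overall:.2f} Å, without outlier: {core_only:.2f} Å")
```

Because the deviations are squared before averaging, the single outlier dominates the overall value even though 90% of the residues fit closely.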
Distance-based measures of protein structure similarity
Root Mean Square Deviation (RMSD) is the most commonly used quantitative measure of the similarity between two sets of superimposed atomic coordinates. RMSD values are reported in ångströms (Å) and are calculated as the square root of the mean squared distance between equivalent atoms.
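As a minimal sketch of this calculation, the function below computes RMSD = sqrt((1/N) Σ d_i²), where d_i is the distance between the i-th pair of equivalent atoms and N is the number of pairs. It assumes the two coordinate lists are already superimposed and listed in matching order; the coordinates are made up for illustration.

```python
import math

def rmsd(coords_a, coords_b):
    """RMSD (in the input units, typically Å) between two equally long
    lists of (x, y, z) tuples whose i-th entries are equivalent atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    total = 0.0
    for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b):
        total += (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
    return math.sqrt(total / len(coords_a))

# Illustrative coordinates: every atom of the model lies 1 Å above the template.
model = [(0.0, 0.0, 1.0), (1.5, 0.0, 1.0), (3.0, 0.0, 1.0)]
template = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (3.0, 0.0, 0.0)]

value = rmsd(model, template)
print(f"RMSD = {value:.2f} Å")
```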
The intermolecular approach is normally applied to relatively similar structures. To compare and superpose two protein structures, one of the structures has to be moved with respect to the other in such a way that the two structures have maximum overlap in three-dimensional space. This procedure starts with identifying equivalent residues or atoms. After residue–residue correspondence is established, one of the structures is moved laterally and vertically toward the other structure, a process known as translation, so that the two structures occupy the same location (the same coordinate frame). The structures are then rotated relative to each other around the three-dimensional axes, during which the distances between equivalent positions are constantly measured.
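The translation step described above can be sketched by moving both structures so that their centroids coincide at the origin. This sketch covers translation only; the rotation step is omitted here and in practice is commonly solved with a least-squares method such as the Kabsch algorithm. The helper names and coordinates are assumptions made for illustration.

```python
import math

def centroid(coords):
    """Geometric centre of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(p[i] for p in coords) / n for i in range(3))

def center(coords):
    """Translate coordinates so that their centroid sits at the origin."""
    c = centroid(coords)
    return [tuple(p[i] - c[i] for i in range(3)) for p in coords]

def rmsd(coords_a, coords_b):
    """RMSD between equivalent atoms listed in the same order."""
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

# Illustrative data: the model is the template shifted by (5, -3, 2) Å.
template = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (2.0, 1.0, 1.0)]
model = [(x + 5.0, y - 3.0, z + 2.0) for x, y, z in template]

before = rmsd(model, template)                  # dominated by the shift
after = rmsd(center(model), center(template))   # shift removed

print(f"before translation: {before:.3f} Å, after: {after:.3f} Å")
```

In this example the two structures differ only by a shift, so centring alone brings the RMSD to essentially zero; for real structures a rotation search would follow.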
The intramolecular approach relies on internal structural statistics and therefore does not depend on sequence similarity between the proteins being compared. In addition, this method does not generate a physical superposition of the structures, but instead provides a quantitative evaluation of the structural similarity between corresponding residue pairs.
The method works by creating a distance matrix between the residues of each protein. To compare two protein structures, their distance matrices are moved relative to each other to achieve maximum overlap. By overlaying the two distance matrices, similar intramolecular distance patterns, which represent regions of similar fold, can be identified.
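A simplified sketch of the distance-matrix idea follows. It assumes one representative point per residue (for example the C-alpha atom) and that the two structures have the same number of residues in matching order, so the matrices are compared entry by entry rather than slid against each other. The function names, coordinates, and the mean-absolute-difference score are illustrative assumptions, not the scoring scheme of any particular program.

```python
import math

def distance_matrix(coords):
    """All pairwise distances between residues of one structure."""
    return [[math.dist(p, q) for q in coords] for p in coords]

def matrix_difference(dm_a, dm_b):
    """Mean absolute difference over the upper triangles of two
    equally sized distance matrices (0 means identical internal geometry)."""
    n = len(dm_a)
    if n < 2 or n != len(dm_b):
        raise ValueError("matrices must have the same size (at least 2 x 2)")
    total = sum(abs(dm_a[i][j] - dm_b[i][j])
                for i in range(n) for j in range(i + 1, n))
    return total / (n * (n - 1) / 2)

# Illustrative coordinates (Å), one point per residue.
a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (3.8, 3.8, 0.0), (0.0, 3.8, 3.8)]
# b is a rotated by 90 degrees about z and then translated: same shape.
b = [(-y + 10.0, x - 4.0, z + 2.0) for x, y, z in a]
# c has its last residue displaced: different shape.
c = a[:3] + [(0.0, 3.8, 8.8)]

same_shape = matrix_difference(distance_matrix(a), distance_matrix(b))
diff_shape = matrix_difference(distance_matrix(a), distance_matrix(c))
print(f"a vs b: {same_shape:.3f} Å, a vs c: {diff_shape:.3f} Å")
```

Because only intramolecular distances are used, the rotated and translated copy scores as identical without any superposition, which is the property the method exploits.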
A more recent development in structure comparison combines the inter- and intramolecular approaches. In this hybrid approach, corresponding residues are first identified using the intramolecular method, and the structures are then superposed on the basis of these residue equivalences.