Threading, or structural fold recognition, predicts the structural fold of an unknown protein sequence by fitting the sequence into a structural database and selecting the best-fitting fold. The comparison emphasizes matching of secondary structures, which are the most evolutionarily conserved features. Therefore, this approach can identify structurally similar proteins even without detectable sequence similarity. The algorithms can be classified into two categories: pairwise energy based and profile based.
Pairwise Energy Method
In this method, a protein sequence is searched against a structural fold database to find the best-matching structural fold using energy-based criteria. The first step is to align the query sequence with each structural fold in a fold library. The alignment is performed essentially at the sequence profile level using dynamic programming or heuristic approaches. Local alignment is often adjusted to get lower energy and thus a better fit. The next step is to build a crude model for the target sequence by replacing aligned residues in the template structure with the corresponding residues in the query. The third step is to calculate the energy terms of the raw model, which include pairwise residue interaction energy, solvation energy, and hydrophobic energy. Finally, the models are ranked based on the energy terms to find the lowest-energy fold, which corresponds to the structurally most compatible fold.
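The build-score-rank steps above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the fold library, the `buried` and `contacts` fields, and the energy values are hypothetical toy numbers, not a real knowledge-based potential. The alignment step is skipped by assuming a gapless one-to-one mapping of query residues onto fold positions, and the hydrophobic term is folded into the solvation term for brevity.

```python
# Toy residue classes; a real potential would be derived from known structures.
HYDROPHOBIC = set("AVLIMFWC")


def pair_energy(a, b):
    """Hypothetical pairwise interaction energy (negative = favorable)."""
    if a in HYDROPHOBIC and b in HYDROPHOBIC:
        return -1.0
    if a in HYDROPHOBIC or b in HYDROPHOBIC:
        return 0.5
    return 0.0


def solvation_energy(residue, buried):
    """Hypothetical solvation term: reward buried hydrophobic and exposed
    polar residues, penalize the opposite placements."""
    hydrophobic = residue in HYDROPHOBIC
    return -0.5 if hydrophobic == buried else 1.0


def thread_energy(query, fold):
    """Energy of the crude model obtained by placing query[i] on fold position i.

    Assumes a gapless alignment; real threading optimizes the alignment first.
    """
    n = min(len(query), len(fold["buried"]))
    energy = sum(solvation_energy(query[i], fold["buried"][i]) for i in range(n))
    for i, j in fold["contacts"]:
        if i < n and j < n:
            energy += pair_energy(query[i], query[j])
    return energy


def rank_folds(query, library):
    """Return (energy, fold name) pairs sorted from lowest to highest energy."""
    return sorted((thread_energy(query, fold), name) for name, fold in library.items())


# Hypothetical two-fold library: burial state per position plus contacting pairs.
library = {
    "fold_A": {"buried": [True, False, True, False, True, False],
               "contacts": [(0, 2), (2, 4), (0, 4)]},
    "fold_B": {"buried": [False, True, False, True, False, True],
               "contacts": [(1, 3), (3, 5), (1, 5)]},
}
query = "LSVKIE"
ranking = rank_folds(query, library)
print(ranking)
```

With this toy input, `fold_A` ranks first because it buries the hydrophobic residues of the query and brings them into contact, whereas `fold_B` buries the polar ones.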
Profile Method
In this method, a profile is constructed for a group of related protein structures. The structural profile is generated by superimposition of the structures to expose corresponding residues. Statistical information from these aligned residues is then used to construct a profile. The profile scores contain information on secondary structural types, the degree of solvent exposure, and the polarity and hydrophobicity of the amino acids. To predict the structural fold of an unknown query sequence, the secondary structure, solvent accessibility, and polarity of the query are first predicted. The predicted information is then compared with the propensity profiles of known structural folds to find the fold that best represents the predicted profile.
Because threading and fold recognition detect structural homologs without relying completely on sequence similarity, they have been shown to be far more sensitive than PSI-BLAST in finding distant evolutionary relationships. In many cases, they can identify more than twice as many distant homologs as PSI-BLAST. However, this high sensitivity can also be their weakness, because high sensitivity is often associated with low specificity. The predictions resulting from threading and fold recognition often come with very high rates of false positives. Therefore, much caution is required in accepting the prediction results.
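The comparison of predicted query features with fold profiles, as described for the profile method, can be sketched as follows. This is a simplified illustration under stated assumptions: each hypothetical profile position stores only a preferred secondary-structure state (H, E, or C) and a burial preference, the match and mismatch weights are arbitrary, and positions are assumed to be pre-aligned without gaps. Real servers use statistical propensity scores and alignment algorithms instead.

```python
def profile_score(pred_ss, pred_buried, fold_profile):
    """Score predicted per-residue features against one fold profile.

    pred_ss: string of predicted states (H helix, E strand, C coil).
    pred_buried: list of booleans (True = predicted buried).
    fold_profile: list of (preferred_state, prefers_buried) per position.
    Weights are hypothetical; positions are assumed pre-aligned.
    """
    score = 0.0
    for ss, buried, (prof_ss, prof_buried) in zip(pred_ss, pred_buried, fold_profile):
        score += 1.0 if ss == prof_ss else -1.0
        score += 0.5 if buried == prof_buried else -0.5
    return score


def best_fold(pred_ss, pred_buried, profiles):
    """Return the best-scoring fold name together with all scores."""
    scores = {name: profile_score(pred_ss, pred_buried, profile)
              for name, profile in profiles.items()}
    return max(scores, key=scores.get), scores


# Hypothetical profiles for two folds, five positions each.
profiles = {
    "all_alpha": [("H", True), ("H", False), ("H", True), ("H", False), ("C", False)],
    "beta_hairpin": [("E", True), ("E", False), ("C", False), ("E", False), ("E", True)],
}

# Hypothetical predictions for a query (in practice produced by separate
# secondary-structure and solvent-accessibility predictors).
pred_ss = "HHHHC"
pred_buried = [True, False, True, True, False]

name, scores = best_fold(pred_ss, pred_buried, profiles)
print(name, scores)
```

Taking the single top score hides how close the runner-up is. Given the high false-positive rates noted above, inspecting the full `scores` dictionary rather than only the winner is the more cautious way to read such output.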
Fugue is a profile-based fold recognition server. It has precomputed structural profiles compiled from multiple alignments of homologous structures, which take into account the local structural environment, such as secondary structure, solvent accessibility, and hydrogen bonding status.
GenThreader is a web-based program that uses a hybrid of the profile and pairwise energy methods.
3D-PSSM is a web-based program that employs the structural profile method to identify protein folds. The profiles for each protein superfamily are constructed by combining multiple smaller profiles.