The Chou–Fasman method is an empirical technique for the prediction of secondary structures in proteins, originally developed in the 1970s by Peter Y. Chou and Gerald D. Fasman. The method is based on analyses of the relative frequencies of each amino acid in alpha helices, beta sheets, and turns in known protein structures solved with X-ray crystallography. From these frequencies a set of probability parameters was derived for the appearance of each amino acid in each secondary structure type, and these parameters are used to predict the probability that a given sequence of amino acids would form a helix, a beta strand, or a turn in a protein. The method is at most about 50–60% accurate in identifying correct secondary structures, which is significantly less accurate than modern machine learning–based techniques.
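The derivation of a parameter from relative frequencies can be illustrated with a minimal sketch. It assumes the parameter is expressed as a ratio: the fraction of a residue's occurrences found in a given structure, divided by the fraction of all residues found in that structure. The function name `propensities` and the one-letter structure labels (`H`, `C`) are hypothetical choices for this example, and the input is a toy sequence rather than real crystallographic data.

```python
from collections import Counter


def propensities(sequence, labels, structure):
    """Return a per-residue propensity for one structure type.

    sequence  -- one-letter amino acid codes, e.g. "AAGG"
    labels    -- structure label per residue, same length (hypothetical codes)
    structure -- the label whose propensity is wanted, e.g. "H"
    """
    if len(sequence) != len(labels):
        raise ValueError("sequence and labels must have the same length")

    total = Counter(sequence)
    in_structure = Counter(
        res for res, lab in zip(sequence, labels) if lab == structure
    )

    # Fraction of all residues that sit in this structure type.
    overall = sum(in_structure.values()) / len(sequence)
    if overall == 0:
        raise ValueError("no residues carry the requested structure label")

    # Values above 1 mean the residue is over-represented in the structure.
    return {res: (in_structure[res] / total[res]) / overall for res in total}


if __name__ == "__main__":
    print(propensities("AAGG", "HHCC", "H"))
```

In this toy input every alanine lies in a helix while only half of all residues do, so alanine scores above 1 and glycine scores 0.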
The original Chou–Fasman parameters found some strong tendencies among individual amino acids to prefer one type of secondary structure over others. Alanine, glutamate, leucine, and methionine were identified as helix formers, while proline and glycine, due to the unique conformational properties of their peptide bonds, commonly end a helix. The original parameters were derived from a very small and non-representative sample of protein structures, because few structures were known at the time of the original work. These parameters have since been shown to be unreliable and have been updated from a current dataset, along with modifications to the initial algorithm.
The Chou–Fasman method takes into account only the probability that each individual amino acid will appear in a helix, strand, or turn.
The Chou–Fasman method predicts helices and strands in a similar fashion, first searching linearly through the sequence for a “nucleation” region of high helix or strand probability and then extending the region until a subsequent four-residue window carries a probability of less than 1.
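The nucleate-then-extend search can be sketched as follows. This is a simplified illustration, not the published procedure: the six-residue nucleation window and the "at least four residues above 1" rule are assumptions of this sketch, the function name `find_regions` is hypothetical, and `TOY_HELIX` holds placeholder propensities rather than the published parameter table.

```python
from statistics import mean

# Placeholder propensities for illustration only, NOT the published values.
TOY_HELIX = {"A": 1.4, "E": 1.5, "L": 1.2, "M": 1.4, "G": 0.6, "P": 0.6}


def find_regions(seq, prop, nuc_window=6, nuc_min=4, ext_window=4, threshold=1.0):
    """Return half-open (start, end) intervals predicted for one structure type."""
    scores = [prop.get(res, 1.0) for res in seq]  # unknown residues are neutral
    n = len(scores)
    regions = []
    i = 0
    while i + nuc_window <= n:
        # Nucleation (assumed rule): enough residues in the window exceed 1.
        window = scores[i:i + nuc_window]
        if sum(s > threshold for s in window) < nuc_min:
            i += 1
            continue

        start, end = i, i + nuc_window
        # Extend right while the four-residue window ending at the
        # candidate residue still averages at least the threshold.
        while end < n and mean(scores[end + 1 - ext_window:end + 1]) >= threshold:
            end += 1
        # Extend left with the four-residue window starting at the candidate.
        while start > 0 and mean(scores[start - 1:start - 1 + ext_window]) >= threshold:
            start -= 1

        # Merge with the previous region if the left extension reached it.
        if regions and start <= regions[-1][1]:
            regions[-1] = (regions[-1][0], max(end, regions[-1][1]))
        else:
            regions.append((start, end))
        i = end
    return regions


if __name__ == "__main__":
    print(find_regions("GPGPAELMAELMGPGP", TOY_HELIX))
```

The same function can be reused for strands by passing a strand propensity table, which mirrors the point that helices and strands are predicted in a similar fashion.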
Turns are also evaluated in four-residue windows, but are calculated using a multi-step procedure because many turn regions contain amino acids that could also appear in helix or sheet regions. Four-residue turns also have their own characteristic amino acids; proline and glycine are both common in turns. A turn is predicted only if the turn probability is greater than the helix or sheet probabilities and a probability value based on the positions of particular amino acids in the turn exceeds a predetermined threshold.
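The two conditions for calling a turn can be sketched as a single test on a four-residue window. All tables below are placeholder values invented for illustration, the function name `is_turn` is hypothetical, and the cutoff is passed in as a parameter; the value 7.5e-5 used in the example is an assumption of this sketch rather than something stated in this article.

```python
from math import prod
from statistics import mean

# Placeholder tables for illustration only, NOT published parameters.
P_TURN = {"P": 1.5, "G": 1.6, "A": 0.7, "L": 0.6}
P_HELIX = {"P": 0.6, "G": 0.6, "A": 1.4, "L": 1.2}
P_SHEET = {"P": 0.6, "G": 0.8, "A": 0.8, "L": 1.3}
# Frequency of each residue at turn positions 1 to 4 (placeholders).
POS_FREQ = {
    "P": (0.10, 0.30, 0.03, 0.07),
    "G": (0.10, 0.09, 0.19, 0.15),
    "A": (0.06, 0.08, 0.04, 0.06),
    "L": (0.06, 0.03, 0.04, 0.05),
}


def is_turn(window, cutoff, p_turn=P_TURN, p_helix=P_HELIX,
            p_sheet=P_SHEET, pos_freq=POS_FREQ):
    """Return True if a four-residue window satisfies both turn conditions."""
    if len(window) != 4:
        raise ValueError("turns are evaluated in four-residue windows")

    # Condition 1: position-specific value must exceed the cutoff.
    positional = prod(pos_freq[res][k] for k, res in enumerate(window))

    # Condition 2: turn propensity must beat both helix and sheet.
    turn = mean(p_turn[res] for res in window)
    helix = mean(p_helix[res] for res in window)
    sheet = mean(p_sheet[res] for res in window)

    return positional > cutoff and turn > helix and turn > sheet


if __name__ == "__main__":
    print(is_turn("PGGP", cutoff=7.5e-5))
    print(is_turn("ALLA", cutoff=7.5e-5))
```

With these placeholder tables the proline- and glycine-rich window passes both conditions, while the alanine- and leucine-rich window fails.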
CFSSP is an online protein secondary structure prediction server. It predicts regions of secondary structure, such as alpha helices, beta sheets, and turns, from the amino acid sequence. The predicted secondary structure is also displayed in a linear sequential graphical view based on the probability of occurrence of alpha helix, beta sheet, and turns. The method implemented in CFSSP is the Chou–Fasman algorithm. CFSSP is freely accessible via the ExPASy server or directly from BioGem tools. The CFSSP server is written in Perl and runs through CGI.