Many proteins are integral membrane proteins. Most membrane proteins have hydrophobic regions that span the hydrophobic core of the membrane bilayer and hydrophilic regions that lie on the outside or the inside of the membrane. Many receptor proteins have several transmembrane helices spanning the cellular membrane.
TMHMM (Transmembrane Helices Hidden Markov Models) is a membrane protein topology prediction method based on a hidden Markov model. It predicts transmembrane helices and distinguishes between soluble and membrane proteins with a high degree of accuracy. Users can submit as many as 4000 protein sequences in FASTA format per submission. This high degree of accuracy allowed us to reliably predict integral membrane proteins in a large collection of genomes.
TMHMM is a tool for predicting transmembrane helices based on a hidden Markov model. It compares the amino acid sequence with a database of hidden Markov models built from known alpha-helical secondary structures found across cell membranes.
- protein structure analysis
- protein topology prediction
- protein secondary structure prediction
- protein transmembrane helices prediction
- membrane protein structures
- protein sorting signals
The program takes proteins in FASTA format. There are two output formats, long and short.
For the long format (default), the server gives some statistics and a list of the location of the predicted transmembrane helices and the predicted location of the intervening loop regions. If the whole sequence is labeled as inside or outside, the prediction is that it contains no membrane helices. The prediction gives the most probable location and orientation of transmembrane helices in the sequence. It is found by an algorithm called N-best (or 1-best in this case) that sums over all paths through the model with the same location and direction of the helices. In the short output format one line is produced for each protein with no graphics.
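To make the idea of a labeled sequence concrete, the following minimal sketch collapses a per-residue labeling into helix and loop segments. The label encoding (`i` for inside, `M` for a transmembrane helix, `o` for outside) and the function names are assumptions made for illustration; they are not taken from the TMHMM output itself.

```python
from itertools import groupby


def segments(labels):
    """Collapse a per-residue label string into (label, start, end) runs.

    Positions are 1-based and inclusive. The labels 'i', 'M' and 'o'
    are a hypothetical encoding for inside, TM helix and outside.
    """
    runs = []
    pos = 1
    for label, group in groupby(labels):
        length = sum(1 for _ in group)
        runs.append((label, pos, pos + length - 1))
        pos += length
    return runs


def has_tm_helix(labels):
    """A sequence labeled entirely inside or outside has no predicted helix."""
    return "M" in labels


if __name__ == "__main__":
    example = "i" * 5 + "M" * 20 + "o" * 10
    for label, start, end in segments(example):
        print(label, start, end)
    print("contains TM helix:", has_tm_helix(example))
```

A sequence whose labels are all `i` or all `o` yields a single run and no `M` segment, which corresponds to the "no membrane helices" case described above.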
The plot shows the posterior probabilities of inside/outside/TM helix. Here one can see possible weak TM helices that were not predicted, and one can get an idea of the certainty of each segment in the prediction.
The plot is obtained by calculating the total probability that a residue sits in a helix, inside, or outside, summed over all possible paths through the model. Sometimes it seems as if the plot and the prediction are contradictory, but that is because the plot shows probabilities for each residue, whereas the prediction is the overall most probable structure. Therefore, the plot should be seen as a complementary source of information.
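The difference between per-residue posterior probabilities and a single most probable labeling can be illustrated with a toy hidden Markov model. The sketch below is not the TMHMM model: the three states, the two-letter alphabet (`H` for hydrophobic, `P` for polar), and all probabilities are invented for illustration. It also uses plain Viterbi decoding as a simple stand-in for the N-best algorithm mentioned earlier, and the forward-backward algorithm for the posteriors.

```python
# Toy model: all numbers below are made-up assumptions, not TMHMM parameters.
STATES = ("i", "M", "o")
START = {"i": 0.5, "M": 0.0, "o": 0.5}
TRANS = {
    "i": {"i": 0.9, "M": 0.1, "o": 0.0},
    "M": {"i": 0.1, "M": 0.8, "o": 0.1},
    "o": {"i": 0.0, "M": 0.1, "o": 0.9},
}
EMIT = {
    "i": {"H": 0.3, "P": 0.7},
    "M": {"H": 0.8, "P": 0.2},
    "o": {"H": 0.3, "P": 0.7},
}


def forward(obs):
    """f[t][s]: probability of obs[0..t] with the path ending in state s."""
    f = [{s: START[s] * EMIT[s][obs[0]] for s in STATES}]
    for x in obs[1:]:
        prev = f[-1]
        f.append({s: EMIT[s][x] * sum(prev[r] * TRANS[r][s] for r in STATES)
                  for s in STATES})
    return f


def backward(obs):
    """b[t][s]: probability of obs[t+1..] given state s at position t."""
    b = [{s: 1.0 for s in STATES}]
    for x in reversed(obs[1:]):
        nxt = b[0]
        b.insert(0, {s: sum(TRANS[s][r] * EMIT[r][x] * nxt[r] for r in STATES)
                     for s in STATES})
    return b


def posteriors(obs):
    """Per-residue state probabilities, summed over all paths."""
    f, b = forward(obs), backward(obs)
    total = sum(f[-1].values())
    return [{s: f[t][s] * b[t][s] / total for s in STATES}
            for t in range(len(obs))]


def viterbi(obs):
    """Single most probable state path, returned as a label string."""
    v = [{s: START[s] * EMIT[s][obs[0]] for s in STATES}]
    back = []
    for x in obs[1:]:
        prev = v[-1]
        col, ptr = {}, {}
        for s in STATES:
            best = max(STATES, key=lambda r: prev[r] * TRANS[r][s])
            ptr[s] = best
            col[s] = prev[best] * TRANS[best][s] * EMIT[s][x]
        v.append(col)
        back.append(ptr)
    state = max(STATES, key=lambda s: v[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return "".join(reversed(path))


if __name__ == "__main__":
    obs = "PPPHHHHHPPP"
    path = viterbi(obs)
    for residue, label, post in zip(obs, path, posteriors(obs)):
        print(residue, label, f"P(M)={post['M']:.3f}")
```

Printing the posterior for `M` next to the decoded label shows how a residue can carry a noticeable helix probability even when the single best path does not place it in a helix, which is why the plot and the prediction can appear to disagree.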
TM segments predicted by TMHMM in the N-terminal region sometimes turn out to be signal peptides. The predictions obtained can either be shown as annotations on the sequence or be shown as the detailed text output from the TMHMM method.
The principle for the architecture of TMHMM is that there are separate compartments, sets of states and state transitions for modeling the TM regions, the loop regions on the cytoplasmic side (inside), and the loop regions on the periplasmic side (outside). The transitions within the TM compartments limit the lengths of these regions to somewhere around 15–35 residues. Inside and outside regions are built so that arbitrarily long loops are allowed but somewhat less likely than shorter ones. Intercompartmental transitions are restricted so that transitions directly between inside and outside regions are not allowed. An outside region followed by a TM region must be followed by an inside region and vice versa.
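The architectural constraints described above can be expressed as a small consistency check on a labeled sequence. This is a sketch under stated assumptions rather than part of TMHMM: the label encoding (`i`, `M`, `o`), the function name `check_topology`, and the hard length bounds of 15 and 35 are illustrative choices, since the model itself only limits helix lengths to "somewhere around" that range.

```python
from itertools import groupby


def check_topology(labels, min_tm=15, max_tm=35):
    """Return a list of rule violations; an empty list means consistent.

    labels is a string over the hypothetical alphabet 'i' (inside),
    'M' (TM helix) and 'o' (outside). min_tm and max_tm are assumed
    hard bounds standing in for the approximate 15 to 35 residue range.
    """
    runs = [(label, len(list(group))) for label, group in groupby(labels)]
    problems = []
    for idx, (label, length) in enumerate(runs):
        if label == "M":
            # Helix length is limited by the transitions within the TM compartment.
            if not min_tm <= length <= max_tm:
                problems.append(f"run {idx}: TM length {length} out of range")
            # A helix must connect opposite sides of the membrane.
            before = runs[idx - 1][0] if idx > 0 else None
            after = runs[idx + 1][0] if idx + 1 < len(runs) else None
            if before is not None and before == after:
                problems.append(f"run {idx}: TM helix does not change side")
        elif idx > 0 and runs[idx - 1][0] != "M":
            # Inside and outside loops may not touch without a helix between them.
            problems.append(f"run {idx}: direct inside/outside transition")
    return problems


if __name__ == "__main__":
    print(check_topology("i" * 5 + "M" * 20 + "o" * 5))   # consistent topology
    print(check_topology("i" * 5 + "o" * 5))              # loops touch directly
    print(check_topology("i" * 5 + "M" * 20 + "i" * 5))   # helix returns to same side
```

Because the real model encodes these rules in its state transitions, any path through it satisfies them by construction; a check like this is only useful for labelings that come from elsewhere.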