
Profile hidden Markov models (profile HMMs) are widely used for detecting sequence similarity. Their popularity stems from the fact that a few related, aligned sequences are enough to construct a profile HMM, which can then be searched against large sequence databases to find related sequences, even distantly related ones. The sensitivity of profile HMMs comes from position-specific probabilistic modelling of the alignment, which captures not only residue conservation but also rates of insertion and deletion. This sensitivity has a computational cost, which the accelerated profile HMM search algorithm in the latest generation of the HMMER suite has significantly reduced. As a result, a typical protein profile HMM can be searched against 100 million protein sequences in nearly 10 minutes on a single CPU. Scaling the search over multiple CPUs reduces this time to a matter of seconds. This scaling approach underlies the HMMER web server, first launched in 2011, which provides the ability to search a single sequence against profile HMM libraries or large sequence collections. Since then, the web service has grown notably in popularity.
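To make "position-specific modelling" concrete, here is a minimal sketch that turns a tiny alignment into per-column residue probabilities. It is only an illustration: the four-sequence alignment, the flat pseudocount, and the function names are assumptions made for this example, and a real profile HMM built by HMMER also estimates insertion and deletion transition probabilities and uses more sophisticated priors.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def column_probabilities(alignment, pseudocount=1.0):
    """Return one residue-probability dict per alignment column.

    Gap characters ('-') are skipped when counting residues, and a flat
    pseudocount keeps unseen residues from getting probability zero.
    """
    length = len(alignment[0])
    if any(len(row) != length for row in alignment):
        raise ValueError("all aligned sequences must have the same length")
    profile = []
    for col in range(length):
        counts = Counter(row[col] for row in alignment if row[col] != "-")
        total = sum(counts[aa] for aa in AMINO_ACIDS) + pseudocount * len(AMINO_ACIDS)
        profile.append({aa: (counts[aa] + pseudocount) / total for aa in AMINO_ACIDS})
    return profile


def gap_fraction(alignment):
    """Fraction of sequences with a gap in each column (a crude deletion signal)."""
    n = len(alignment)
    return [sum(row[col] == "-" for row in alignment) / n
            for col in range(len(alignment[0]))]


# Hypothetical toy alignment: four aligned sequences, four columns.
toy_alignment = ["ACDW", "ACEW", "A-DW", "GCDW"]
profile = column_probabilities(toy_alignment)
gaps = gap_fraction(toy_alignment)

for i, column in enumerate(profile):
    best = max(column, key=column.get)
    print(f"column {i}: most likely residue {best}, gap fraction {gaps[i]:.2f}")
```

Each column gets its own distribution, so a fully conserved column (the final W) scores very differently from a variable one, which is what a single query sequence cannot express.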

Profile HMMs are sensitive tools for remote protein homology detection, but their main scoring algorithms, Viterbi and Forward, require considerable time to search large sequence databases.

The goal of the HMMER project is to make advanced probabilistic methods for sequence homology detection available in widely useful tools. The HMMER software suite has been widely used, particularly by protein family databases such as Pfam and InterPro and their associated search tools. HMMER 3.0, released in early 2010, includes new technology that produces roughly 100-fold speed improvements relative to previous versions of HMMER, such that HMMER3 search times are competitive with BLASTP search times. This new technology combines striped vector-parallelized alignment algorithms, a new heuristic acceleration algorithm, and a ‘sparse rescaling’ method that enables the Forward and Backward profile hidden Markov model (profile HMM) algorithms to be applied using multiply/add instructions on scaled probabilities without numerical underflow.
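The underflow problem that rescaling addresses can be shown with a small sketch of the Forward algorithm. This is the textbook dense variant, which rescales at every position of a generic HMM; it is not HMMER's sparse rescaling or its vectorized implementation, and the two-state model and its probabilities are invented for the example.

```python
import math


def forward_log_likelihood(obs, start, trans, emit):
    """Log P(obs) under an HMM, using per-position rescaling.

    start[i]    : probability of starting in state i
    trans[i][j] : probability of moving from state i to state j
    emit[i][x]  : probability that state i emits symbol x
    """
    if not obs:
        return 0.0
    n = len(start)
    alpha = [start[i] * emit[i][obs[0]] for i in range(n)]
    log_like = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Plain multiply/add recursion on the rescaled values.
            alpha = [
                sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        # Renormalize so the values never shrink toward zero, and keep
        # the removed factor in log space.
        alpha = [a / scale for a in alpha]
        log_like += math.log(scale)
    return log_like


# Hypothetical two-state model over the symbols "A" and "B".
start = [0.5, 0.5]
trans = [[0.9, 0.1], [0.2, 0.8]]
emit = [{"A": 0.8, "B": 0.2}, {"A": 0.3, "B": 0.7}]

print(forward_log_likelihood("AB", start, trans, emit))
print(forward_log_likelihood("AB" * 5000, start, trans, emit))
```

Without the renormalization step, the probabilities for the 10,000-symbol sequence would fall below the smallest representable double long before the end. With it, the result stays finite while the inner loop still uses only multiplications and additions.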

HMMER can also work with single query sequences, not just profiles, much as BLAST does. For example, a protein query sequence can be searched against a database with phmmer, or used to start an iterative search with jackhmmer.
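As a rough sketch of what these two searches look like in practice, the helper functions below assemble phmmer and jackhmmer command lines and can optionally run them. The file names (query.fasta, uniprot_sprot.fasta, hits.tbl) and the helper names are placeholders chosen for this example. The flags used (--tblout, -E, --cpu, and -N for the number of jackhmmer iterations) are standard HMMER3 options, but check them against the user guide for your installed version.

```python
import shlex
import subprocess


def phmmer_command(query_fasta, target_db, tblout=None, evalue=None, cpu=None):
    """Build a phmmer command: one protein query against a sequence database."""
    cmd = ["phmmer"]
    if tblout is not None:
        cmd += ["--tblout", str(tblout)]   # tabular per-target hit summary
    if evalue is not None:
        cmd += ["-E", str(evalue)]         # report hits at or below this E-value
    if cpu is not None:
        cmd += ["--cpu", str(cpu)]         # number of worker threads
    return cmd + [str(query_fasta), str(target_db)]


def jackhmmer_command(query_fasta, target_db, iterations=3, tblout=None):
    """Build a jackhmmer command: iterative search, similar in spirit to PSI-BLAST."""
    cmd = ["jackhmmer", "-N", str(iterations)]  # maximum number of iterations
    if tblout is not None:
        cmd += ["--tblout", str(tblout)]
    return cmd + [str(query_fasta), str(target_db)]


def run(cmd):
    """Run a command built above; requires HMMER to be installed and on PATH."""
    return subprocess.run(cmd, check=True, capture_output=True, text=True)


# Placeholder file names; nothing is executed here, the commands are only printed.
print(shlex.join(phmmer_command("query.fasta", "uniprot_sprot.fasta",
                                tblout="hits.tbl", evalue=0.001, cpu=4)))
print(shlex.join(jackhmmer_command("query.fasta", "uniprot_sprot.fasta",
                                   iterations=3)))
```

Building the argument list separately from running it makes it easy to inspect or log the exact command before starting a long database search.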

HMMER is a free and commonly used software package for sequence analysis written by Sean Eddy’s team; its web server is now hosted by the European Bioinformatics Institute (EBI) in the United Kingdom. Sequences that score significantly better against the profile HMM than against a null model are considered homologous to the sequences that were used to construct the profile HMM. The profile HMM implementation used in the HMMER software is based on the work of Krogh and colleagues. HMMER is a console utility that has been ported to every major operating system, including different versions of Linux, Windows, and Mac OS.
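The idea of scoring against a null model can be illustrated with a toy log-odds calculation. This sketch assumes an ungapped, three-column profile over a made-up four-letter alphabet with a uniform background; all numbers are invented for the example. HMMER's real scores additionally account for insertions and deletions and are converted into E-values to judge significance.

```python
import math


def log_odds_score(sequence, profile, background):
    """Sum over positions of log2(P(residue | profile) / P(residue | null)), in bits."""
    if len(sequence) != len(profile):
        raise ValueError("this toy model is ungapped: lengths must match")
    return sum(
        math.log2(column[residue] / background[residue])
        for residue, column in zip(sequence, profile)
    )


# Hypothetical null model: every residue equally likely.
background = {"A": 0.25, "C": 0.25, "D": 0.25, "W": 0.25}

# Hypothetical profile preferring A, then C, then W.
toy_profile = [
    {"A": 0.7, "C": 0.1, "D": 0.1, "W": 0.1},
    {"A": 0.1, "C": 0.7, "D": 0.1, "W": 0.1},
    {"A": 0.1, "C": 0.1, "D": 0.1, "W": 0.7},
]

score_match = log_odds_score("ACW", toy_profile, background)
score_mismatch = log_odds_score("DDD", toy_profile, background)
print(f"ACW: {score_match:.2f} bits, DDD: {score_mismatch:.2f} bits")
```

A positive score means the sequence is better explained by the profile than by the background; a negative score means the null model fits better.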

Some bioinformatics tools such as UGENE also use HMMER. HMMER can be used to replace BLASTP and PSI-BLAST for searching protein databases with single query sequences.
