T-Coffee (Tree-based Consistency Objective Function for Alignment Evaluation) is a multiple sequence alignment program that uses a progressive approach. It builds a library of pairwise alignments to guide the multiple sequence alignment. It can also combine previously computed multiple sequence alignments, and recent versions can use structural information from PDB files (3D-Coffee). It has advanced features for evaluating alignment quality and some capacity for identifying occurrences of motifs (Mocca). By default it produces alignments in the aln (Clustal) format, but it can also produce PIR, MSF, and FASTA output. The most common input formats, FASTA and PIR, are supported.
Its main features include:
- First, it builds the multiple alignment from a specific data source: a library of pairwise alignments (global and local).
- Second, it uses an optimization method that finds the multiple alignment that best fits the input library.
Generate Primary library of alignments:
The primary library consists of a set of pairwise alignments between all of the sequences to be aligned. It may also include two or more different alignments of the same pair of sequences. The global pairwise alignments are generated with ClustalW, and the local ones with Lalign.
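As a rough illustration of what such a library holds, the sketch below turns gapped pairwise alignments into lists of matched residue positions and groups them by sequence pair. The function names (`aligned_pairs`, `build_primary_library`) and the toy sequences are hypothetical, and the alignments are supplied by hand rather than computed by ClustalW; this is not T-Coffee's internal data structure.

```python
from collections import defaultdict


def aligned_pairs(row_a, row_b, gap="-"):
    """Return (i, j) residue index pairs matched in a gapped pairwise alignment."""
    if len(row_a) != len(row_b):
        raise ValueError("aligned rows must have equal length")
    pairs = []
    i = j = 0  # positions in the ungapped sequences
    for a, b in zip(row_a, row_b):
        if a != gap and b != gap:
            pairs.append((i, j))
        if a != gap:
            i += 1
        if b != gap:
            j += 1
    return pairs


def build_primary_library(alignments):
    """Group residue pairs by sequence pair; one pair may have several alignments."""
    library = defaultdict(list)
    for name_a, row_a, name_b, row_b in alignments:
        library[(name_a, name_b)].append(aligned_pairs(row_a, row_b))
    return dict(library)


# Two different (hand-written) alignments of the same pair of sequences.
alignments = [
    ("seqA", "GARFIELD", "seqB", "GARF-ELD"),
    ("seqA", "GARFIELD", "seqB", "GAR-FELD"),
]
library = build_primary_library(alignments)
```

Here `library[("seqA", "seqB")]` holds two lists of residue pairs, one per alignment, which reflects the point that the same pair of sequences may contribute more than one alignment.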
Derive primary library weights:
In this step, a weight is assigned to each pair of aligned residues in the library, so that the most reliable residue pairs can be identified.
When the libraries are combined, a residue pair that occurs more than once is merged into a single entry whose weight is the sum of the individual weights; otherwise, a new entry is created for the pair being considered.
A triplet approach based on the intermediate-sequence method is then used: the weight of each residue pair is re-evaluated by also considering how the two residues align through a third sequence.
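The weighting, merging, and triplet steps can be sketched as follows. This is a minimal sketch under stated assumptions, not T-Coffee's implementation: the weight of an alignment is passed in by the caller (the published method uses the sequence identity of the source alignment), and the extension adds, for each intermediate residue, the smaller of the two weights linking it to the pair. The names `add_alignment` and `extend_library` and the example weights are hypothetical.

```python
from collections import defaultdict


def add_alignment(library, name_a, name_b, pairs, weight):
    """Add residue pairs to the library, summing weights of duplicate pairs."""
    for i, j in pairs:
        # Canonical key so that (A, B) and (B, A) entries are merged.
        key = tuple(sorted([(name_a, i), (name_b, j)]))
        library[key] += weight


def extend_library(library):
    """Triplet extension: for residues a and b both linked to a residue c of a
    third sequence, add min(W(a, c), W(c, b)) to the weight of (a, b)."""
    neighbours = defaultdict(dict)
    for (r1, r2), w in library.items():
        neighbours[r1][r2] = w
        neighbours[r2][r1] = w
    extended = defaultdict(float, library)  # start from the primary weights
    for c, linked in neighbours.items():
        residues = sorted(linked)
        for idx, a in enumerate(residues):
            for b in residues[idx + 1:]:
                if a[0] == b[0]:
                    continue  # two residues of the same sequence are never paired
                extended[(a, b)] += min(linked[a], linked[b])
    return dict(extended)


# Toy example: residue 0 of sequences A, B and C, with assumed weights.
library = defaultdict(float)
add_alignment(library, "A", "B", [(0, 0)], 88.0)
add_alignment(library, "A", "C", [(0, 0)], 77.0)
add_alignment(library, "B", "C", [(0, 0)], 100.0)
extended = extend_library(library)
```

In this toy case the pair formed by residue 0 of A and residue 0 of B starts with weight 88 and gains min(77, 100) = 77 through sequence C, so agreement with a third sequence raises its weight.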
Progressive alignment strategy:
In this alignment strategy, a distance matrix is constructed from pairwise alignments between all the sequences, and a guide tree is built from it using the Neighbor Joining (NJ) method. The two closest sequences in the tree are aligned first, and the gaps introduced in this pair are then fixed. Next, the next closest two sequences are aligned, or a sequence is added to the existing alignment, as the guide tree dictates. This continues until all the sequences have been aligned.
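To make the ordering concrete, the sketch below computes a simple identity-based distance and derives the order in which sequences or groups would be joined. Note the substitution: it uses plain average-linkage clustering in place of Neighbor Joining to keep the example short, so the resulting order can differ from a real NJ guide tree. The distance definition, the function names, and the example distances are all assumptions made for illustration.

```python
from itertools import combinations


def identity_distance(row_a, row_b, gap="-"):
    """Distance = 1 - fraction of identical residues over gap-free columns."""
    cols = [(a, b) for a, b in zip(row_a, row_b) if a != gap and b != gap]
    if not cols:
        return 1.0
    matches = sum(a == b for a, b in cols)
    return 1.0 - matches / len(cols)


def join_order(names, dist):
    """Return the merge order, closest pair first (average linkage, not NJ).

    dist maps frozenset({name_a, name_b}) to a distance.
    """
    def cluster_dist(c1, c2):
        total = sum(dist[frozenset((a, b))] for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    clusters = [(n,) for n in names]
    order = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2), key=lambda p: cluster_dist(*p))
        order.append((c1, c2))
        # Replace the two joined clusters by their union.
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return order


# Hypothetical distances between four sequences.
dist = {
    frozenset(("A", "B")): 0.1, frozenset(("A", "C")): 0.5,
    frozenset(("A", "D")): 0.7, frozenset(("B", "C")): 0.6,
    frozenset(("B", "D")): 0.8, frozenset(("C", "D")): 0.3,
}
order = join_order(["A", "B", "C", "D"], dist)
```

With these distances, A and B are joined first, then C and D, and finally the two groups are aligned to each other, which mirrors the "closest first" order described above.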
T-Coffee belongs to the class of aligners known as consistency-based, which may be described as slow but accurate. All aligners of this class trade speed for increased accuracy.
The most recent improvement in T-Coffee has been the development of the concept of template-based multiple sequence alignment. When run as a template-based aligner, T-Coffee uses a different procedure to generate the primary library: rather than directly aligning the sequences, it associates each input sequence with a template, then aligns every pair of templates with an appropriate aligner and projects the resulting alignments onto the original sequences.
When computation is finished, the server displays a summary page that includes the computed MSA. The box ‘Result files’ contains all the files produced by T-Coffee during the alignment process, as well as the input sequence file. The link ‘Download them all’ allows users to download all the output files in a single zip archive.
The colored graphical output shows the level of consistency between the final alignment and the library used by T-Coffee. The main score is the total consistency value.