
Matcher


Bioinformatics tools are complex, critical pieces of software in the life sciences. They range from simple utilities to large, sophisticated software packages, and despite this growing complexity they are increasingly used across modern biology.

EMBOSS Matcher identifies local similarities in two input sequences using a rigorous algorithm based on Bill Pearson’s lalign application.

Running the tool from the web form is a simple multi-step process.

Step 1 – Input Sequences

First Input Sequence

A free text (raw) list of sequences is simply a block of characters representing several DNA/RNA or protein sequences. A sequence can be in GCG, FASTA, EMBL (nucleotide only), GenBank, PIR, NBRF, PHYLIP or UniProtKB/Swiss-Prot (protein only) format. Partially formatted sequences are not accepted. Adding a return to the end of the sequence may help certain applications understand the input. Pasting data directly from a word processor may yield unpredictable results, as hidden/control characters may be present.

First Sequence File Upload

A file containing valid sequences in any supported format (GCG, FASTA, EMBL (nucleotide only), GenBank, PIR, NBRF, PHYLIP or UniProtKB/Swiss-Prot (protein only)) can be used as input for the alignment. Word processor files may yield unpredictable results, as hidden/control characters may be present in them. It is best to save files with the Unix format option to avoid hidden Windows characters.
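The cleanup described above (Unix line endings, no hidden control characters, a final return) can be sketched in a few lines of Python. This is only an illustration, not part of Matcher itself: the function name clean_sequence_text is hypothetical, and the choice of which characters count as "hidden" is an assumption.

```python
import re

# Assumption: "hidden/control characters" means ASCII control characters
# other than tab and newline. Carriage returns are handled separately.
_CONTROL_CHARS = re.compile(r"[\x00-\x08\x0b-\x1f\x7f]")


def clean_sequence_text(text: str) -> str:
    """Return text with Unix line endings, no control characters, and a final newline."""
    # Drop a leading byte-order mark, which some Windows editors add.
    text = text.lstrip("\ufeff")
    # Convert Windows (CRLF) and old Mac (CR) line endings to Unix (LF).
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    # Remove any remaining control characters.
    text = _CONTROL_CHARS.sub("", text)
    # End with a return, which may help some applications parse the input.
    if not text.endswith("\n"):
        text += "\n"
    return text
```

The cleaned text can then be pasted into the form or written to a file for upload. The function does not check that the content is in a valid sequence format.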

Second Input Sequence

A free text (raw) list of sequences is simply a block of characters representing several DNA/RNA or protein sequences. A sequence can be in GCG, FASTA, EMBL (nucleotide only), GenBank, PIR, NBRF, PHYLIP or UniProtKB/Swiss-Prot (protein only) format. Partially formatted sequences are not accepted. Adding a return to the end of the sequence may help certain applications understand the input. Pasting data directly from a word processor may yield unpredictable results, as hidden/control characters may be present.

Second Sequence File Upload

A file containing valid sequences in any supported format (GCG, FASTA, EMBL (nucleotide only), GenBank, PIR, NBRF, PHYLIP or UniProtKB/Swiss-Prot (protein only)) can be used as input for the alignment. Word processor files may yield unpredictable results, as hidden/control characters may be present in them. It is best to save files with the Unix format option to avoid hidden Windows characters.

Step 2 – Set alignment options

Matrix

The substitution scoring matrix used to score aligned residues.


Default value (Protein) is: BLOSUM62 [EBLOSUM62]

Default value (Nucleotide) is: DNAfull [EDNAFULL]

Gap Open Penalty

Penalty applied to the pairwise alignment score for the first residue in a gap. Allowed values: 1 to 25.

Default value (Protein) is: 14

Default value (Nucleotide) is: 16


Gap Extend Penalty

Penalty applied to the pairwise alignment score for each additional residue in a gap. Allowed values: 1 to 8.

Default value is: 4 
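To see how the two penalties combine, the sketch below takes the descriptions above at face value: the first residue in a gap costs the gap open penalty, and every further residue costs the gap extend penalty. The function name gap_penalty is hypothetical, and the exact formula used inside Matcher may differ from this reading.

```python
def gap_penalty(length: int, gap_open: int, gap_extend: int = 4) -> int:
    """Total penalty for one gap of `length` residues.

    Assumed reading of the form's descriptions: gap_open is charged for the
    first residue, gap_extend for each additional residue.
    """
    if length < 1:
        raise ValueError("a gap must contain at least one residue")
    return gap_open + (length - 1) * gap_extend


# Three-residue gap with the protein defaults (open 14, extend 4).
protein_cost = gap_penalty(3, gap_open=14)
# Three-residue gap with the nucleotide defaults (open 16, extend 4).
nucleotide_cost = gap_penalty(3, gap_open=16)
```

Under this reading, a three-residue gap costs 14 + 2 × 4 = 22 with the protein defaults and 16 + 2 × 4 = 24 with the nucleotide defaults, so one long gap is cheaper than several short ones of the same total length.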

Alternative matches

Number of alternative alignments to show. Allowed values: 1, 2, 3, 4, 5, 10, 20.

Default value is: 1
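For readers who prefer a local EMBOSS installation to the web form, the options above map onto qualifiers of the matcher command-line program. The sketch below only checks the values against the ranges listed on this page and builds the argument list; it does not run anything. The function name build_matcher_command and the file names are hypothetical, and the ranges are those of the web form, which the command-line tool may not enforce in the same way.

```python
# Values offered by the web form for "Alternative matches".
ALTERNATIVE_CHOICES = (1, 2, 3, 4, 5, 10, 20)


def build_matcher_command(a_file, b_file, matrix="EBLOSUM62", gap_open=14,
                          gap_extend=4, alternatives=1,
                          outfile="result.matcher"):
    """Validate options against the web form's ranges and return an argv list."""
    if not 1 <= gap_open <= 25:
        raise ValueError("gap open penalty must be between 1 and 25")
    if not 1 <= gap_extend <= 8:
        raise ValueError("gap extend penalty must be between 1 and 8")
    if alternatives not in ALTERNATIVE_CHOICES:
        raise ValueError("alternatives must be one of %r" % (ALTERNATIVE_CHOICES,))
    return [
        "matcher",
        "-asequence", a_file,      # first input sequence
        "-bsequence", b_file,      # second input sequence
        "-datafile", matrix,       # substitution scoring matrix
        "-gapopen", str(gap_open),
        "-gapextend", str(gap_extend),
        "-alternatives", str(alternatives),
        "-outfile", outfile,
    ]


# Protein defaults from this page; first.fasta and second.fasta are placeholders.
command = build_matcher_command("first.fasta", "second.fasta")
```

The returned list can be passed to subprocess.run on a machine where EMBOSS is installed. For nucleotide input, the defaults listed above would be matrix="EDNAFULL" and gap_open=16.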
