Bioinformatics Sequence Analysis

In bioinformatics, sequence analysis is the process of subjecting a DNA, RNA, or peptide sequence to any of a wide range of analytical methods in order to understand its features, function, structure, or evolution. The methodologies used include sequence alignment, searches against biological databases, and others. Since the development of high-throughput methods for producing gene and protein sequences, the rate at which new sequences are added to the databases has increased rapidly. Such a collection of sequences does not, by itself, increase the scientist’s understanding of the biology of organisms. However, comparing these new sequences to those with known functions is a key way of understanding the biology of the organism from which a new sequence comes. Thus, sequence analysis can be used to assign function to genes and proteins by studying the similarities between the compared sequences. Today, many tools and techniques perform sequence comparisons (sequence alignment) and analyze the resulting alignments to understand the underlying biology.

Sequence analysis in molecular biology includes a very wide range of relevant topics:

  1. The comparison of sequences to find similarity, often to infer whether they are related (homologous)
  2. Identification of intrinsic features of the sequence, such as active sites, post-translational modification sites, gene structures, reading frames, distributions of introns and exons, and regulatory elements
  3. Identification of sequence differences and variations, such as point mutations and single nucleotide polymorphisms (SNPs), in order to obtain genetic markers
  4. Describing the evolution and genetic diversity of sequences and organisms
  5. Identification of molecular structure from sequence alone
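Topics 1 and 3 in the list above can be illustrated with a very small sketch. It assumes the two sequences have already been aligned and have equal length, which real data rarely guarantees; the function names and the example sequences are hypothetical and chosen only for illustration.

```python
def point_differences(ref, query):
    """Return (position, ref_base, query_base) for every mismatch.

    Assumes both sequences are already aligned and of equal length.
    Positions are 1-based, as is common in variant reporting.
    """
    if len(ref) != len(query):
        raise ValueError("sequences must have the same length")
    return [
        (pos, r, q)
        for pos, (r, q) in enumerate(zip(ref, query), start=1)
        if r != q
    ]


def percent_identity(ref, query):
    """Percentage of aligned positions that carry the same residue."""
    if not ref:
        raise ValueError("sequences must not be empty")
    mismatches = len(point_differences(ref, query))
    return 100.0 * (len(ref) - mismatches) / len(ref)


# Made-up example sequences differing by one substitution at position 6.
ref = "ATGGCCATTG"
query = "ATGGCTATTG"
print(point_differences(ref, query))
print(percent_identity(ref, query))
```

A high percent identity is only a hint of homology; deciding whether two sequences are truly related usually needs a proper alignment and a statistical measure of significance.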

Methods used in sequence analysis include:

  • DNA patterns
  • Dynamic programming
  • Artificial Neural Network
  • Hidden Markov Model
  • Support Vector Machine
  • Clustering
  • Bayesian Network
  • Regression Analysis
  • Sequence mining
  • Alignment-free sequence analysis
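To make one of the methods listed above concrete, here is a minimal sketch of dynamic programming applied to global pairwise alignment (the Needleman-Wunsch recurrence). It computes only the optimal score, not the alignment itself, and the scoring values (match +1, mismatch -1, linear gap penalty -2) are assumptions chosen for illustration rather than recommended parameters.

```python
def global_alignment_score(a, b, match=1, mismatch=-1, gap=-2):
    """Optimal global alignment score of a and b (Needleman-Wunsch).

    Only two rows of the dynamic programming table are kept in memory,
    because each cell depends solely on the current and previous row.
    """
    # Row 0: aligning the empty prefix of `a` against b[:j] costs j gaps.
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        # Column 0: aligning a[:i] against the empty prefix costs i gaps.
        curr = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            up = prev[j] + gap        # gap in b
            left = curr[j - 1] + gap  # gap in a
            curr.append(max(diag, up, left))
        prev = curr
    return prev[-1]


print(global_alignment_score("GATTACA", "GATTACA"))
print(global_alignment_score("ACGT", "ACT"))
```

Production tools use substitution matrices and affine gap penalties instead of these flat scores, and heuristic methods such as BLAST avoid filling the full table when searching large databases.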


Sequence analysis tools available from NCBI include:

  • BLAST Microbial Genomes
  • BLAST RefSeqGene
  • Basic Local Alignment Search Tool (BLAST)
  • Conserved Domain Search Service (CD Search)
  • Gene Expression Omnibus (GEO) BLAST
  • Genome BLAST
  • Genome Data Viewer (GDV)
  • Genome Remapping Service
  • Genome Workbench
  • Multiple Sequence Alignment Viewer
  • Open Reading Frame Finder (ORF Finder)
  • Primer-BLAST
  • ProSplign
  • Sequence Viewer
  • Splign
  • Tree Viewer
  • VecScreen
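As a rough illustration of the kind of task an open reading frame search performs, the sketch below scans the three forward reading frames of a DNA string for stretches that begin with ATG and end at a stop codon. It is not the algorithm used by any tool in the list above: the function name, the `min_codons` parameter, and the example sequence are hypothetical, and the reverse strand and alternative genetic codes are ignored.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(dna, min_codons=2):
    """Return (start, end, frame) for ORFs on the forward strand.

    `start` is 0-based, `end` is exclusive and includes the stop codon.
    `min_codons` counts the codons before the stop, including ATG.
    """
    dna = dna.upper()
    orfs = []
    for frame in range(3):
        start = None
        # Step through complete codons in this reading frame.
        for i in range(frame, len(dna) - 2, 3):
            codon = dna[i:i + 3]
            if start is None and codon == "ATG":
                start = i
            elif start is not None and codon in STOP_CODONS:
                if (i - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None
    return orfs


# Made-up example: one short ORF in the third forward frame.
dna = "CCATGAAATGATAGGG"
for start, end, frame in find_orfs(dna):
    print(frame, start, end, dna[start:end])
```

Real ORF finders also scan the reverse complement, support different genetic codes and start codons, and apply much larger minimum lengths to reduce spurious hits.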

To perform sequence analysis proficiently, it is important to first understand the source of the data, i.e., the different experimental methods used for determining the biological sequence. The analytical strategy then depends on whether the sequence is genomic, transcriptomic, or proteomic. The databases that currently warehouse the enormous volume of data on these biomolecules should first be checked for similar sequences, which might direct experimental assays for functional investigations. Software tools and web services are often used to carry out the bioinformatics analysis. After analyzing DNA, RNA, and protein sequences, it is important to understand how they are connected through protein-to-genome mapping. The small organic molecules, or metabolites, that organisms need to live and grow also need to be studied in terms of their interactions with genes and proteins through metabolic pathways.
