
Kimura Model


Nucleotide bases fall into two categories depending on the ring structure of the base.

  • Purines: A or G (two-ring bases)
  • Pyrimidines: C or T (single-ring bases)

Mutations in DNA are changes in which one base is replaced by another.

A mutation that conserves the ring number is called a transition (e.g., A -> G, G -> A, T -> C, or C -> T).

A mutation that changes the ring number is called a transversion (e.g., A -> C, A -> T, C -> G, etc.).
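The two definitions above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_substitution` and the returned labels are hypothetical, and the sketch assumes unambiguous DNA bases (A, C, G, T).

```python
PURINES = frozenset("AG")      # two-ring bases
PYRIMIDINES = frozenset("CT")  # single-ring bases


def classify_substitution(x: str, y: str) -> str:
    """Label a pair of bases as 'identical', 'transition', or 'transversion'."""
    x, y = x.upper(), y.upper()
    valid = PURINES | PYRIMIDINES
    if x not in valid or y not in valid:
        raise ValueError(f"expected A, C, G or T, got {x!r} and {y!r}")
    if x == y:
        return "identical"
    # A transition keeps the ring number: purine <-> purine or
    # pyrimidine <-> pyrimidine. Anything else changes it.
    same_ring_class = (x in PURINES) == (y in PURINES)
    return "transition" if same_ring_class else "transversion"
```

For example, `classify_substitution("A", "G")` gives `"transition"`, while `classify_substitution("A", "C")` gives `"transversion"`.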

When related DNA sequences are compared, transitions are observed at least 3 times as often as transversions.

Kimura’s Two-Parameter model incorporates these different rates for transitions and transversions.

The Kimura two-parameter model infers evolutionary distance by treating transitions and transversions separately. It uses two observed quantities: P, the fraction of sequence positions that differ by a transition, and Q, the fraction of sequence positions that differ by a transversion. Because the model allows transitions and transversions to occur at different rates, with transitions the more frequent, it gives a more realistic estimate of evolutionary distance than a model with a single substitution rate. The Kimura model uses the following formula:

$$d_{AB} = -\frac{1}{2}\ln\left(1 - 2p_{ti} - p_{tv}\right) - \frac{1}{4}\ln\left(1 - 2p_{tv}\right)$$

where d_AB is the evolutionary distance between two sequences A and B, p_ti is the observed frequency of transitions (P above), and p_tv is the observed frequency of transversions (Q above). As an example, suppose sequences A and B differ at 30% of their positions. If 20% of positions differ by a transition (p_ti = 0.2) and 10% differ by a transversion (p_tv = 0.1), the evolutionary distance is:

$$d_{AB} = -\frac{1}{2}\ln(1 - 2 \times 0.2 - 0.1) - \frac{1}{4}\ln(1 - 2 \times 0.1) \approx 0.40$$
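The worked example can be checked with a short Python sketch of the formula. The function name `k2p_distance` and its argument names are illustrative choices, not part of any particular library.

```python
import math


def k2p_distance(p_ti: float, p_tv: float) -> float:
    """Kimura two-parameter distance from the observed transition and
    transversion fractions."""
    term_ti = 1.0 - 2.0 * p_ti - p_tv
    term_tv = 1.0 - 2.0 * p_tv
    # The logarithms are undefined once the sequences are too divergent.
    if term_ti <= 0.0 or term_tv <= 0.0:
        raise ValueError("sequences too divergent for the Kimura correction")
    return -0.5 * math.log(term_ti) - 0.25 * math.log(term_tv)


# The example from the text: 20% transitions, 10% transversions.
print(f"{k2p_distance(0.20, 0.10):.2f}")  # prints 0.40
```

The corrected distance (about 0.40) is larger than the observed 30% difference because the correction accounts for sites that have changed more than once.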

Tajima & Nei:

A more general correction was given by Nei in 1991. Its equation holds for a model of nucleotide substitution with equal substitution rates between different nucleotides; it does not take into account unequal rates of substitution among different nucleotide pairs.

If relatively few substitutions have occurred, a simple count of the substitutions is usually sufficient, because it is then unlikely that the same site has changed more than once.
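A simple count of this kind might look like the following sketch. It assumes two already-aligned sequences of equal length and skips any position holding a gap or an ambiguous base. The function name `count_fractions` is hypothetical.

```python
PURINES = frozenset("AG")
BASES = frozenset("ACGT")


def count_fractions(seq_a: str, seq_b: str) -> tuple[float, float]:
    """Return (P, Q): the fractions of compared sites that differ by a
    transition and by a transversion, respectively."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    transitions = transversions = compared = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x not in BASES or y not in BASES:
            continue  # assumption: ignore gaps and ambiguity codes
        compared += 1
        if x == y:
            continue
        if (x in PURINES) == (y in PURINES):
            transitions += 1    # ring number conserved
        else:
            transversions += 1  # ring number changed
    if compared == 0:
        raise ValueError("no comparable sites")
    return transitions / compared, transversions / compared


P, Q = count_fractions("AGCTAGCTAG", "GGCTACCTAT")
print(P, Q, P + Q)  # P + Q is the uncorrected fraction of differing sites
```

In this toy alignment one site differs by a transition and two by transversions, so P = 0.1 and Q = 0.2.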

Transitions are assumed to occur at a uniform rate a and transversions at a different uniform rate b.

For a site that contains a C at time 0, the probability that it still contains a C at any later time t is given by

$$P_{CC}(t) = \frac{1}{4} + \frac{1}{4}e^{-4bt} + \frac{1}{2}e^{-2(a+b)t}$$
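A quick numerical sketch helps show how this probability behaves. The function name `prob_same_base` and the rate values used below are arbitrary illustrations, not estimates from real data.

```python
import math


def prob_same_base(a: float, b: float, t: float) -> float:
    """Probability that a site still holds its original base after time t,
    given transition rate a and transversion rate b."""
    return 0.25 + 0.25 * math.exp(-4.0 * b * t) + 0.5 * math.exp(-2.0 * (a + b) * t)


# Illustrative rates only: transitions three times as fast as transversions.
for t in (0.0, 1.0, 10.0, 1000.0):
    print(t, prob_same_base(0.3, 0.1, t))
```

At t = 0 the probability is exactly 1, since the site has had no time to change. As t grows it decays toward 1/4, the point at which the original base carries no more information than any of the four bases chosen at random.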

Manipulating this equation gives the following estimate of K, the evolutionary distance between the two sequences:

$$K = \frac{1}{2}\ln\left[\frac{1}{1 - 2P - Q}\right] + \frac{1}{4}\ln\left[\frac{1}{1 - 2Q}\right]$$

where P is the fraction of nucleotides that a simple count reveals to be transitions and Q is the fraction of nucleotides that a simple count shows to be transversions. This is the same formula as the one given for d_AB above, with P = p_ti and Q = p_tv. If no distinction between transitions and transversions is made, this equation reduces to the simple Jukes-Cantor equation.
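The reduction to Jukes-Cantor can be made explicit with a short derivation. The symbol p, the total fraction of differing sites, is introduced here for illustration. The sketch assumes every substitution is equally likely, so that of the three possible changes at a site one is a transition and two are transversions.

```latex
% Assumption: all substitutions equally likely, with p = P + Q
P = \frac{p}{3}, \qquad Q = \frac{2p}{3}
\quad\Longrightarrow\quad
1 - 2P - Q = 1 - \frac{4p}{3}, \qquad 1 - 2Q = 1 - \frac{4p}{3}

K = \left(\frac{1}{2} + \frac{1}{4}\right)\ln\left[\frac{1}{1 - \frac{4}{3}p}\right]
  = -\frac{3}{4}\ln\left(1 - \frac{4}{3}p\right)
```

The last expression is the Jukes-Cantor distance, which depends only on the total fraction of differing sites.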
