Maximum likelihood is the third method used to build trees. Likelihood provides probabilities of the sequences given a model of their evolution on a particular tree. The more probable the sequences given to the tree, the more the tree is preferred.

In phylogenetics there are many parameters, including rates, differential transformation costs, and, most important, the tree itself.

The maximum likelihood method uses standard statistical techniques for inferring probability distributions to assign probabilities to particular possible phylogenetic trees. The method requires a substitution model to assess the probability of particular mutations; roughly, a tree that requires more mutations at interior nodes to explain the observed phylogeny will be assessed as having a lower probability. This is broadly similar to the maximum-parsimony method, but maximum likelihood allows additional statistical flexibility by permitting varying rates of evolution across both lineages and sites. In fact, the method requires that evolution at different sites and along different lineages must be statistically independent. Maximum likelihood is thus well suited to the analysis of distantly related sequences, but it is believed to be computationally intractable to compute due to its NP-hardness.

The “pruning” algorithm, a variant of dynamic programming, is often used to reduce the search space by efficiently calculating the likelihood of subtrees. The method calculates the likelihood for each site in a “linear” manner, starting at a node whose only descendants are leaves (that is, the tips of the tree) and working backwards toward the “bottom” node in nested sets. However, the trees produced by the method are only rooted if the substitution model is irreversible, which is not generally true of biological systems.

Some tools that use maximum likelihood to infer phylogenetic trees from variant allele frequency data (VAFs) include AncesTree and CITUP.

ML is an exhaustive method that searches every possible tree topology and considers every position in an alignment, not just informative sites.

ML works by calculating the probability of a given evolutionary path for a particular extant sequence. The probability values are determined by a substitution model (either for nucleotides or amino acids).

P(t)=1/4+3/4e−αt

Where α is the nucleotide substitution rate in the Jukes–Cantor model, which is either empirically assigned or estimated from the raw datasets. For a nucleotide to change into a different residue after time t , the probability value is determined by following formula:

P(t)=1/4−1/4e−αt

For other substitution models, the formulas are much more complex and are not described here. For a particular site, the probability of a tree path is the product of the probability from the root to all the tips, including every intermediate branch in the tree topology. Because multiplication often results in very small values, it is computationally more easy to express all probability values as natural log likelihood (lnL) values, which also converts multiplication into summation. Because ancestral characters at internal nodes are normally unknown, all possible scenarios of ancestral states have to be computed. After logarithmic conversion, the likelihood score for the topology is the sum of log likelihood of every single branch of the tree. After computing for all possible tree paths with different combinations of ancestral sequences, the tree path having the highest likelihood score is the ﬁnal topology at the site. Because all characters are assumed to have evolved independently, the log likelihood scores are calculated for each site independently. The overall log likelihood score for a given tree path for the entire sequence is the sum of log likelihood of all individual sites. The same procedure has to be repeated for all other possible tree topologies.