Tree search with Profile parsimony

Martin R. Smith

2024-05-23

Profile Parsimony (Faith & Trueman, 2001) finds the tree that is most faithful to the information contained within a given dataset. It is the ‘exact solution’ that implied weights parsimony approximates. For more information on the philosophy and mathematics of profile parsimony, see the companion vignette.

Profile Parsimony is currently implemented in “TreeSearch” for characters with up to two parsimony-informative states. (Further states are treated as ambiguous, whilst retaining as much information as possible.)

Getting started

A companion vignette gives details on installing the package and getting up and running.

Once installed, load the TreeSearch package into R using

library("TreeSearch")

In order to reproduce the random elements of this document, set a random seed:

# Set a random seed so that random functions in this document are reproducible
suppressWarnings(RNGversion("3.5.0")) # Until we can require R3.6.0
set.seed(888)

View the results

In parsimony search, it is good practice to consider trees that are slightly suboptimal (Smith, 2019).

Here, we’ll take a consensus that includes all trees that are suboptimal by up to 3 bits (tolerance = 3). To sample this region of tree space well, the trick is to use many Ratchet iterations (ratchIter) and to keep the search within each one short (small values of tbrIter and maxHits), so that many runs don’t quite hit the optimal tree. In a serious study, you would want to sample far more thoroughly than the 2 Ratchet iterations (ratchIter) we’ll settle for here.

suboptimals <- MaximizeParsimony(myMatrix, betterTrees, tolerance = 3,
                                 ratchIter = 2, tbrIter = 3,
                                 maxHits = 25,
                                 concavity = "profile")
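To picture what a tolerance of 3 bits admits, the base R sketch below takes the nine profile scores that this search reports in the table further down and measures how far each lies above the best score. It is an illustration only: the scores are typed in by hand rather than recomputed, the 1.5 bit threshold is an arbitrary example value, and the vector names are not part of TreeSearch.

```r
# Profile scores copied from the table of results reported in this section
scores <- rep(c(512.118, 513.229, 513.897, 513.966, 514.739, 514.849),
              times = c(2, 1, 1, 3, 1, 1))

# Suboptimality of each tree, in bits, relative to the best tree found
excess <- scores - min(scores)

# Every tree returned by the search lies within the 3 bit tolerance
all(excess <= 3)

# A stricter, purely illustrative threshold of 1.5 bits keeps fewer trees
withinTol <- excess <= 1.5
sum(withinTol)
```

With the real search results, the same logical vector could be used to subset the list of trees, for example suboptimals[withinTol], assuming the scores are computed in the same order as the trees.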

The consensus of these slightly suboptimal trees provides a less resolved, but typically more reliable, summary of the signal within the phylogenetic dataset (Smith, 2019):

par(mar = rep(0.25, 4), cex = 0.75)
table(signif(TreeLength(suboptimals, myMatrix, "profile")))
## 
## 512.118 513.229 513.897 513.966 514.739 514.849 
##       2       1       1       3       1       1
plot(ape::consensus(suboptimals))
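Why a consensus is less resolved than its input trees can be seen in a toy case. The sketch below uses three made-up five-tip trees (they are not results from this analysis) and the p argument of ape::consensus(), where p = 1 gives the strict consensus used above and p = 0.5 gives a majority-rule consensus.

```r
library("ape")

# Three hypothetical trees; the third disagrees about relationships among a, b and c
trees <- read.tree(text = c(
  "(((a,b),c),(d,e));",
  "(((a,b),c),(d,e));",
  "(((a,c),b),(d,e));"
))

strict <- consensus(trees, p = 1)      # groupings present in every tree
majority <- consensus(trees, p = 0.5)  # groupings present in most trees

# The strict consensus collapses the contested grouping into a polytomy,
# so it has fewer internal nodes than the majority-rule consensus
c(strict = Nnode(strict), majority = Nnode(majority))
```

The strict consensus keeps only what all trees agree on, which is why it loses resolution as more suboptimal trees are admitted.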

Where next?

References

Faith, D. P., & Trueman, J. W. H. (2001). Towards an inclusive philosophy for phylogenetic inference. Systematic Biology, 50(3), 331–350. doi:10.1080/10635150118627
Nixon, K. C. (1999). The Parsimony Ratchet, a new method for rapid parsimony analysis. Cladistics, 15(4), 407–414. doi:10.1111/j.1096-0031.1999.tb00277.x
Smith, M. R. (2019). Bayesian and parsimony approaches reconstruct informative trees from simulated morphological datasets. Biology Letters, 15(2), 20180632. doi:10.1098/rsbl.2018.0632