@Alexis_RAxML recently gave a talk which ended with an interesting editorial about software quality. In it, he described the results of his investigation into the quality of existing phylogenetics software. I think he’s preparing that for publication, so I’m not going to describe his results, but IIRC he looked at things such as comment density and Valgrind results.
That is an excellent first step, but I’d like to think a little about function. There have been many implementations of the core phylogenetic likelihood computation, but to my knowledge none has a comprehensive suite of unit tests for the core calculations. Does anyone know of one? (There are tests built into some packages, such as RevBayes, but these are scripts testing the broad-scale functioning of the package, and so aren’t what I’m talking about here.)
I am very glad that we are seeing the development of libraries such as Bio++ and pll, signaling a possible shift from monolithic codebases to ones in which the core likelihood computation is isolated from the tree exploration part. I think/hope that this will lead to more creative developments on each side.
Having a phylogenetic likelihood computation library with 100% unit test coverage, and a collection of mini-examples with agreed-upon results, would seem very helpful for the field. Serious bugs continue to appear, even in major inference packages implementing standard models.
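To make "mini-example with an agreed-upon result" concrete, here is a minimal sketch of the kind of unit test I have in mind. It is not taken from any existing package: the function names are hypothetical, and I am assuming the Jukes-Cantor (JC69) model with branch lengths in expected substitutions per site. For a two-leaf tree the pruning sum at the root has a closed form, because JC69 is reversible, so the likelihood depends only on the total path length between the two leaves.

```python
import math

STATES = "ACGT"

def jc69_matrix(t):
    """JC69 transition probabilities for a branch of length t
    (expected substitutions per site). Hypothetical helper."""
    e = math.exp(-4.0 * t / 3.0)
    same = 0.25 + 0.75 * e   # probability of ending in the starting state
    diff = 0.25 - 0.25 * e   # probability of ending in one specific other state
    return [[same if i == j else diff for j in range(4)] for i in range(4)]

def two_taxon_site_likelihood(x, y, t1, t2):
    """Site likelihood by summing over the root state of a two-leaf tree
    (the pruning step), with uniform root frequencies of 1/4."""
    p1 = jc69_matrix(t1)
    p2 = jc69_matrix(t2)
    ix, iy = STATES.index(x), STATES.index(y)
    return sum(0.25 * p1[r][ix] * p2[r][iy] for r in range(4))

def closed_form(x, y, t):
    """Analytical result: pi_x * P(x -> y | t), where t is the total
    path length between the two leaves."""
    e = math.exp(-4.0 * t / 3.0)
    p = 0.25 + 0.75 * e if x == y else 0.25 - 0.25 * e
    return 0.25 * p

def test_two_taxon_matches_closed_form():
    for x, y in [("A", "A"), ("A", "C"), ("G", "T")]:
        got = two_taxon_site_likelihood(x, y, 0.1, 0.2)
        assert math.isclose(got, closed_form(x, y, 0.3), rel_tol=1e-12)

test_two_taxon_matches_closed_form()
```

The same test doubles as a check that the root can be moved along the path between the two leaves without changing the likelihood, since only `t1 + t2` enters the closed form. The `rel_tol` value is a placeholder I picked, not the product of any round-off analysis.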
Thoughts?
I note that this is related to, but different from, the topic of test data sets.
(cc @mlandis, @hoehna, @mtholder)
That’s essentially the next step after fixing all the software quality issues. The problem is how to get a handle on the numerics, since some explicit round-off analysis will be required to specify tolerances. In addition, for more complex models the parameter optimization can be really tricky; I believe most of the problems are caused by the optimizers for ML and not by the implementations themselves. With the IQ-Tree team in Vienna, we have been struggling hard to find an appropriate optimizer for the LG4X model, which requires nested parameter optimization. There is no single best optimizer available yet. In terms of correctness, we have an internal mailing list with the IQ-Tree team over which we share numerically challenging datasets we obtain from users, to make sure that our codes yield similar results for those tough cases.
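As a small illustration of why the tolerances need thought, here is a sketch in Python (hypothetical function names, not code from RAxML or IQ-Tree). It computes the same alignment log-likelihood two ways from per-site likelihoods: once by multiplying and taking the log at the end, once by summing logs. The two agree to within round-off on short inputs, but the product underflows to zero on longer ones. The tolerance values are placeholders; choosing them properly is exactly the round-off analysis mentioned above.

```python
import math

def loglik_naive_product(site_likelihoods):
    """Multiply per-site likelihoods, then take the log.
    Underflows to 0.0 (log -> -inf) once the product drops below
    the smallest representable double."""
    prod = 1.0
    for l in site_likelihoods:
        prod *= l
    return math.log(prod) if prod > 0.0 else float("-inf")

def loglik_sum_of_logs(site_likelihoods):
    """Sum per-site log-likelihoods with math.fsum, which tracks
    partial sums to reduce accumulated round-off."""
    return math.fsum(math.log(l) for l in site_likelihoods)

def agree(a, b, rel_tol=1e-9, abs_tol=1e-9):
    """Compare two log-likelihoods under placeholder tolerances."""
    return math.isclose(a, b, rel_tol=rel_tol, abs_tol=abs_tol)

short = [0.1, 0.2, 0.3]
long_alignment = [1e-3] * 200   # true product is 1e-600, below double range

print(agree(loglik_naive_product(short), loglik_sum_of_logs(short)))
print(loglik_naive_product(long_alignment))
print(loglik_sum_of_logs(long_alignment))
```

A cross-package comparison would presumably use something like `agree` on the final log-likelihoods, with tolerances justified per model rather than guessed.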