Alexey Kozlov, who is working on libpll, mentioned that there might be a problem determining what level of error in the log likelihood is acceptable to pass a unit test. Here's my response...
Interesting -- yeah, I think there are a lot of issues to address to make this work. Here's my attempt at working out a philosophical answer.
- Before we know what's acceptable, I think we want to know what's true. So, if the AVX & SSE kernels differ from each other, can we find out which one is closer to the truth?
(Alex's phyly code could, in theory, help us figure out the true value, though it doesn't define what's acceptable. I think we would first have to write some boring code to convert FASTA into a JSON representation of the CLVs.)
- We also want to know whether ratios of likelihoods (equivalently, differences in log-likelihoods) are right. I'd like to know Pr(data|parameters) itself, but even if we can only get C*Pr(data|parameters) for some unknown constant C that is the same for every parameter setting, we can still get Pr(data|parameters1)/Pr(data|parameters2), and that ratio is what is used in practice.
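To make the cancellation concrete, here is a minimal Python sketch of what a difference test might check. The per-site likelihoods, the value of `log_c`, and the helper `log_likelihood` are all made up for illustration; none of them come from libpll or any existing test harness.

```python
import math

def log_likelihood(site_likelihoods, log_scale=0.0):
    """Sum of per-site log-likelihoods, plus an optional constant log C."""
    return math.fsum(math.log(p) for p in site_likelihoods) + log_scale

# Hypothetical per-site likelihoods under two parameter settings.
sites_params1 = [0.012, 0.034, 0.0051, 0.027]
sites_params2 = [0.010, 0.036, 0.0049, 0.025]

# Stands in for the unknown constant log C (assumed equal for both settings).
log_c = 7.5

exact_diff = log_likelihood(sites_params1) - log_likelihood(sites_params2)
scaled_diff = (log_likelihood(sites_params1, log_c)
               - log_likelihood(sites_params2, log_c))

# The two differences agree up to floating-point rounding.
print(exact_diff, scaled_diff)
```

The constant only cancels up to rounding, so a test on differences would still need a tolerance of its own.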
In terms of what we should expect:
(a) I hope that for large data sets, we can still get accurate log-likelihood DIFFERENCES (the second point above).
(b) For small data sets, we should be able to get accurate log-likelihoods. I once found a bug that changed the log-likelihood in a dynamic programming problem from something like 12345.123456789 to 12345.123451234. So even a difference in the sixth decimal place can be significant.
(c) In practice, I wrote the test harness so that if you specify the likelihood as 12345.6 then it will expect the error to be < 10^-1, but if you specify the likelihood as 12345.678 then it will expect the error to be < 10^-3.
(d) I still don't know what kind of error is ultimately "acceptable". There will probably be a subjective component there. Different projects might consider different degrees of error to be a failure. But ultimately, it would be interesting just to know how accurate different methods are.
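The rule in (c) can be sketched in a few lines of Python. This is only an illustration of the idea, not the harness's actual code: the function names `tolerance_from_literal` and `check_log_likelihood` are hypothetical, and the sketch assumes the expected value is kept as the string the test author wrote, so that its number of decimal places is still visible.

```python
from decimal import Decimal

def tolerance_from_literal(expected: str) -> float:
    """Return 10^-d, where d is the number of decimal places written."""
    # Decimal keeps the digits exactly as typed: "12345.6" has exponent -1.
    exponent = Decimal(expected).as_tuple().exponent
    return float(Decimal(1).scaleb(exponent))

def check_log_likelihood(expected: str, computed: float) -> bool:
    """Pass if the computed value is within the implied tolerance."""
    return abs(computed - float(expected)) < tolerance_from_literal(expected)

# "12345.6" implies a tolerance of 10^-1; "12345.678" implies 10^-3.
print(check_log_likelihood("12345.6", 12345.65))
print(check_log_likelihood("12345.678", 12345.65))
```

With this rule the same computed value can pass a loosely specified test and fail a tightly specified one, so the precision written into the test file is itself a statement about how much error is acceptable.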
What do you think? I don't have any tests that compute log-likelihood differences yet...
Also, what do you think of using git submodules to include the test suite in the test infrastructure for specific projects? It would be nice to be able to run 'make check' and have the tests run even though they live in a separate repo.