The most recent phyloseminar is a very nice intro to phylogenetic invariants by Marta Casanellas-Rius, along with updates about her Erik+2 method.
It started a discussion over email between her and Joe Felsenstein, and I thought I'd copy the discussion here.
This is Marta's response; Joe's original questions appear as quoted text.
Here's the figure she describes as a PDF:
answer.pdf (390.9 KB)
> I think that Dr. Casanellas-Rius has made a good case for the
> usefulness of edge-invariants distance methods in cases where there
> are mixtures of sets of edge lengths or other models that are not easy
> to use in ML (or Bayesian) approaches.
> I just want to defensively quibble about one thing. I think that in
> the case where we have a single model (not a mixture of different edge
> lengths) we have statistical theorems, dating back to RA Fisher,
> guaranteeing the asymptotic efficiency of ML methods. So in theory ML
> should do best in asymptopia.
Yes, let's stick to unmixed models (in the attached file I also don't
consider mixtures). Although ML could do best in theory (and asymptotically
speaking), there is always the issue that in practice we only get local
maxima and not global maxima, and this is why ML is not a perfect method.
This problem of local maxima gets worse when there are more parameters to
estimate; for example, performing ML on a GMM tree can produce
incorrect results. So I admit that yes, ML should be the best, but in
practice it is not, especially for very general models such as GMM.
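The local-maxima problem Marta describes can be illustrated with a toy example (this is my own sketch, not anything from her talk or the attachment): a greedy hill-climber on a surface with two peaks ends up at different optima depending on where it starts, just as an ML search over tree parameters can.

```python
import math

def log_lik(x):
    # Toy bimodal "log-likelihood": global peak near x = 3, smaller
    # local peak near x = -2 (both made up for illustration).
    return math.exp(-(x - 3) ** 2) + 0.6 * math.exp(-(x + 2) ** 2)

def hill_climb(x, step=0.01, iters=10_000):
    # Greedy local search: move left or right while the likelihood improves.
    for _ in range(iters):
        best = max((x - step, x, x + step), key=log_lik)
        if best == x:
            break
        x = best
    return x

print(hill_climb(-4.0))  # climbs to the local peak near -2
print(hill_climb(1.5))   # climbs to the global peak near 3
```

A multi-start strategy mitigates but does not eliminate this, and the more parameters a model like GMM has, the more such spurious peaks the likelihood surface can have.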
> Use of the full set of invariants should be equivalent. Why did it do
> worse in Dr. Casanellas-Rius's simulations? I suggest that it is
> because the measure used to judge degree of fit was not equivalent to
> the one ML uses. That latter is something like a Kullback-Leibler
> distance on the pattern probabilities. If one would use that one,
> there would then actually be no difference between ML and use of the
> full set of invariants.
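The "Kullback-Leibler distance on the pattern probabilities" Joe mentions can be written down directly; here is a minimal sketch (the pattern frequencies are hypothetical, and this is only the generic KL formula, not Joe's exact criterion):

```python
import math

def kl_divergence(observed, expected):
    # KL(observed || expected) over site-pattern probabilities.
    # Assumes expected > 0 wherever observed > 0, and both sum to 1.
    return sum(p * math.log(p / q)
               for p, q in zip(observed, expected) if p > 0)

# Hypothetical frequencies of four site patterns in an alignment, and
# the pattern probabilities predicted by a candidate tree and model.
observed = [0.40, 0.30, 0.20, 0.10]
expected = [0.35, 0.35, 0.20, 0.10]
print(kl_divergence(observed, expected))
```

Minimizing this divergence over model parameters is equivalent to maximizing the multinomial likelihood of the observed patterns, which is why Joe ties it to ML.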
I agree that using a "full" set of invariants is statistically
consistent; that is, when the length of the alignment tends to infinity,
this set of invariants allows one to recover the true tree. Actually, instead
of using the "full" set of invariants I'd use a "local complete
intersection" (see the attached explanation).

So, yes, asymptotically, and if the ML implementation reached global maxima,
there should be no difference between ML and a "local complete
intersection".
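For readers unfamiliar with how invariants are used in practice, here is a toy sketch of the general idea (the polynomials below are illustrative stand-ins, not the actual GMM invariants or Marta's local complete intersection): each candidate tree comes with polynomials that vanish on the pattern probabilities arising from that tree, and the tree whose invariants evaluate closest to zero at the observed frequencies is preferred.

```python
def score(invariants, freqs):
    # Sum of squared invariant values at the observed pattern frequencies;
    # zero means the frequencies satisfy every invariant exactly.
    return sum(f(freqs) ** 2 for f in invariants)

# Toy 2x2 "pattern table" p[i][j]. A rank condition (determinant = 0)
# plays the role of tree A's invariant, in the spirit of flattening-rank
# edge invariants; tree B gets a different, made-up polynomial.
inv_tree_A = [lambda p: p[0][0] * p[1][1] - p[0][1] * p[1][0]]
inv_tree_B = [lambda p: p[0][0] * p[1][0] - p[0][1] * p[1][1]]

freqs = [[0.4, 0.2], [0.2, 0.1]]   # rank 1, so tree A's invariant vanishes
print(score(inv_tree_A, freqs))    # essentially 0: tree A fits
print(score(inv_tree_B, freqs))    # positive: tree B does not
```

The choice of which invariants to evaluate, and of the score used to measure "closeness to zero", is exactly what the full-set versus edge-invariants versus local-complete-intersection discussion is about.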
> However, I sheepishly acknowledge that this measure for
> full-invariants methods has not, as far as I know, been developed.
Yes, this was developed in Casanellas, Fernández-Sánchez, MBE 2007,
and you can have a look at the figure I attach in order to see the
difference in performance between the full set of invariants and edge
invariants.
> (And there would be no reason to spend a lot of time developing it, if
> one could achieve the same inference by just using ML).
As I said, in practice one cannot achieve the same inference as ML due
to the issue of local maxima and complex models (although I must also
say that invariants have their own drawbacks).