Some questions about trees


#1

Does anyone know references or previous work to the following three questions:

  1. Consider the graph G_NNI of unrooted binary phylogenetic trees on leaf set {1,…,n} where two trees form an edge if they are one NNI apart. Similarly G_SPR (where trees are joined if they are one SPR apart). Both these graphs are regular (each tree has the same number of neighbors) so if you start anywhere then after a while you will be at each tree with (asymptotically) uniform probability - the question is what is the mixing time (time to be near uniform as a function of n) for this random walk? Note I’m not talking about data (or Bayesian methods) here - just a pure mixing time question regarding this discrete graph. Does anyone know a paper that studies this (I think I saw one once, but all my papers/notes got lost in the 2010/2011 earthquakes here!).

  2. There are various notions of the centre of a tree - e.g. the “centroid” (explained very nicely in the recent Tanglegrams paper by Matsen/Billey/Kas/Konvalinka paper on ArXiv). However does anyone know any paper that discusses what we might call a ‘leaf-centroid’. Given a tree T call a vertex v of T a leaf-centroid if each of the components of T-v contains at most half of the leaves of T. If T has no vertices of degree 2 then (just like the centroid) a tree either has a unique leaf-centroid or two adjacent leaf centroids. However even for this class ( trees without vertices of degree 2) the leaf-centroid can be different from the centroid! Just wondering if anyone has seen this notion mentioned or studied before?

  3. Anyone know an early (pre-1960s) reference to the simple but fundamental result: A collection C of nonempty subsets of X (including X) forms a hierarchy (nested family) if and only if C is the set of clusters of a rooted X-tree? It may be implicit in Linneaus (or even Aristotle!) but some more explicitly mathematical statement would be good.

Any advice will help for a book (‘mathematical phylogeny’) that I’m now half-way through writing I might ask a few further questions over the coming month or two.


#2

The most recent paper I’ve seen concerning 1. is a paper by Spade et al. (http://www.sciencedirect.com/science/article/pii/S0167715213003155). They prove upper bounds on the relaxation time of NNI and SPR walks on rooted trees. The relaxation time for rooted SPR is O(n^{5/2}) but is conjectured to be O(n^{3/2}). The relaxation time of rooted NNI is O(n^4), and should be the same for unrooted trees. They have a good summary of the relevant literature. For unrooted trees, a chain that only moves leaves has a relaxation time of O(n^2) (http://onlinelibrary.wiley.com/doi/10.1002/rsa.1029/abstract), so I wouldn’t be surprised if the relaxation time for unrooted SPR is also O(n^2) but I don’t think it has been formally shown.


#3

I might also note another Kubatko paper that uses simulation to confirm the Aldous conjecture that the relaxation time is O(n^{3/2}):

Also note that the Spade et al paper uses a slightly unusual definition of SPR.

I don’t think that I can help you with 2 or 3!


#4

Brilliant - thanks Chris, that’s very helpful!


#5

thanks Erick - hadn’t seen that paper either - that’s also very helpful… best M