The interface between Mathematics/Statistics and Phylogenetics -- Topics for a talk


Dear phylobabblers,

I got invited to give a talk at the Brazilian Mathematics Colloquium (end of July) on the interface of Maths and Stats with phylogenetics/phylodynamics.

I’m now gathering ideas from fellow mathematicians and mathematical biologists on what exactly would catch the attention of an audience of mathematicians. I have spoken to Statisticians before, but never to hardcore pure mathematicians, so that’s why I’m asking.

My initial ideas involve talking about the Kingman coalescent, and/or some of Susan Holmes’s work on the geometry of tree space and/or the connections with macroscopic ODE-based models such as the SIR.

I’m looking especially at you, @cwhidden, @ematsen and @mathmomike.




Hi Luiz,

Yes, the Kingman coalescent followed by the multispecies coalescent would allow you to talk about the cool (and mathematical) work done by Degnan and Rosenberg on the gene tree/species tree problem (e.g. for n>4 taxa, any species tree can have branch lengths for which the most likely gene tree has a different topology), and the later (very elegant) follow-up work by others, including Elizabeth Allman/Rhodes/Ane/Kubatko/Mossel/Roch, on identifiability issues and on methods for consistently inferring species trees from (conflicting) gene trees.
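To make the gene tree/species tree mismatch concrete, here is a minimal sketch of the simplest case: three taxa with species tree ((A,B),C), where the internal branch has length t in coalescent units. It only illustrates how discordance arises. With three taxa the matching topology is always the most probable one, so this is not an example of the anomalous gene trees mentioned above. The function names are illustrative and not taken from any package.

```python
import math
import random

def concordance_probability(t):
    """Exact probability that the gene tree topology matches the
    species tree ((A,B):t, C), with t in coalescent units."""
    return 1.0 - (2.0 / 3.0) * math.exp(-t)

def simulate_gene_tree(t, rng):
    """Draw one gene tree topology under the three-taxon multispecies coalescent."""
    # Lineages A and B coalesce at rate 1 inside the internal branch.
    if rng.expovariate(1.0) < t:
        return "AB|C"
    # Otherwise all three lineages reach the root population, where
    # each pair is equally likely to coalesce first.
    return rng.choice(["AB|C", "AC|B", "BC|A"])

def estimate_concordance(t, n_reps=100_000, seed=1):
    """Monte Carlo estimate of the concordance probability."""
    rng = random.Random(seed)
    hits = sum(simulate_gene_tree(t, rng) == "AB|C" for _ in range(n_reps))
    return hits / n_reps

if __name__ == "__main__":
    for t in (0.1, 0.5, 2.0):
        print(t, concordance_probability(t), estimate_concordance(t))
```

As t shrinks to zero the concordance probability falls to 1/3, so short internal branches make the three topologies almost equally likely.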

Yes, the Vogtmann-Billera-Holmes tree space could be another option, especially the CAT(0) structure and recent work to compute distances, find geodesics, etc. Perhaps the different space of ‘phylogenetic oranges’ could also be discussed; this has an interesting topological story involving CW complexes, toric cubes, etc. The latter space is arguably more relevant in some ways, but less studied.

Other topics that could form the basis of a talk to mathematicians could be (i) algebraic properties of Markov substitution models (e.g. the work of the Tasmanian group in introducing Lie algebras into phylogenetics) and/or (ii) the extensive work of the Berkeley crowd and Allman/Rhodes on the algebraic geometry of models. A maths talk on phylogenetic networks could be a further option.

Last year I had a popular-style survey paper on the maths of phylogenetics in the American Mathematical Monthly and also gave 10 lectures in the US at a regional maths meeting on phylogenetics at Winthrop U, though this was more in the style of a tutorial for math grad students than a serious maths talk at a conference! Both the paper and the Winthrop lectures (as two .pdfs) are available on my webpage (one under publications, the other under talks - “winthrop”) if that’s useful. Hope these brief comments help.


By the way, here is a link to Mike’s paper.

The Hadamard transform always seems to be popular among mathematicians. Felsenstein calls it the most elegant application of math to phylogenetics. I have found @mathmomike’s book to be the clearest exposition, though it, heh, leaves something to the reader. Here is a scan of my notes on the relevant section.
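For anyone who wants to see the transform do something, here is a minimal sketch of Hadamard conjugation, s = H^{-1} exp(Hq). It assumes the two-state symmetric (Cavender-Farris-Neyman) model, with edge lengths given as expected numbers of changes per site. Splits are indexed by subsets of the taxa other than a reference taxon, encoded as bitmasks. The function names and the three-taxon star tree helper are illustrative choices of mine, not taken from the book.

```python
import math

def hadamard_transform(v):
    """Unnormalised fast Walsh-Hadamard transform (Sylvester ordering).
    Entry i of the result is sum_j (-1)^popcount(i & j) * v[j].
    len(v) must be a power of two."""
    v = list(v)
    n = len(v)
    h = 1
    while h < n:
        for start in range(0, n, 2 * h):
            for i in range(start, start + h):
                x, y = v[i], v[i + h]
                v[i], v[i + h] = x + y, x - y
        h *= 2
    return v

def pattern_probabilities(q):
    """Hadamard conjugation: edge-length spectrum q -> split pattern probabilities s."""
    n = len(q)
    r = [math.exp(x) for x in hadamard_transform(q)]
    # H is its own inverse up to the factor 1/n.
    return [x / n for x in hadamard_transform(r)]

def edge_spectrum_star3(a, b, c):
    """Spectrum for a 3-taxon star tree with pendant edges a, b, c.
    Bit 0 = taxon 1, bit 1 = taxon 2, taxon 3 is the reference, so the
    edge to taxon 3 induces the split {1,2}|{3} at index 3.
    Entry 0 is minus the total tree length."""
    return [-(a + b + c), a, b, c]

if __name__ == "__main__":
    s = pattern_probabilities(edge_spectrum_star3(0.1, 0.2, 0.3))
    print(s, sum(s))
```

In the two-taxon case the spectrum is [-q, q], and the second entry of the result is (1 - exp(-2q))/2, the usual probability that the two ends of an edge differ.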


Among the mathematical statisticians who followed up on Kingman and developed the coalescent as a practical method is my old statistics thesis advisor, Simon Tavaré. He is a terribly busy man, but his student Paul Joyce, at the University of Idaho in Moscow, is a wonderful mathematical statistician with good insights.

Many of the current coalescent folks mentioned here by others are students of Simon or of his close colleagues Robert Griffiths (Oxford Emeritus, Stats, from whom I learned the relevant pop gen stochastic models) and Peter Donnelly, who was, and still is, in the Stats Department at Oxford, where I studied under the late Ryk Ward at the Department of Biological Anthropology. These are the pioneers, and they are terribly interesting and very active mathematical statisticians whose work is entirely evolutionary and phylogenetic/pop genetics.

You might also contact the marvelous AWF Edwards (Likelihood) at Cambridge and, last but not least, the brilliant David Penny and, more recently, Ziheng Yang. I find all of these people to be both charming and helpful, and they really got the whole field off the ground, building on the wonderful foundation provided by RA Fisher.


The brilliant young phylogeneticists who populate this site inherited a wonderful foundation from those I mentioned in the last post. It continues to be a wonderful area of research for those who are adequately trained in mathematical statistics.


Dear all,

Thank you very much for your contributions! The talk was last Wednesday and the feedback was quite nice. Someone mentioned Hopf algebras and their usefulness, but I wasn’t able to take the discussion much further since I know nothing about them. Has anyone seen anything nice in this direction?

I used @mathmomike’s paper as a sort of introduction and also used @ematsen and @cwhidden’s curvature paper to try to motivate young researchers [the vast majority of the audience] to engage with this kind of problem.

The talk is here [slides in English, but I spoke in Portuguese].

Once more, thank you all for the very helpful suggestions!