We are trying to understand the origin of a DNA virus with a rather large genome (>100 kb). We reconstructed the phylogenetic relationships among all available genomes, which were sampled from different geographic locations and at different points in time. The tree shows several important historical splits and population expansions, and we would like to date them.
Because of the large sample size and genome size, we tried two approaches:
- Subsample the sequences and date the historical splits using BEAST.
- Subsample the loci (genes). One interesting observation is that when we build local gene trees, the gene genealogies do not always match the global genome tree (either because of a lower mutation rate or because of historical recombination). In those cases, the historical splits we want to date are not identifiable in the local gene tree, because the gene genealogy has a different topology. This variation in gene genealogies across the genome prevents us from using local gene trees to date the historical events.
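
To make the second problem concrete, here is a minimal sketch (not our actual pipeline) of how one could check whether a split of interest from the genome tree is still present in a given gene tree. The nested-tuple tree representation, the taxon names A to E, and the helper names `leaf_sets` and `has_split` are all made up for illustration; real trees would be parsed from Newick files with a phylogenetics library.

```python
def leaf_sets(tree, out):
    """Return the leaves under `tree`; record every internal node's leaf set in `out`."""
    if isinstance(tree, str):          # a leaf is just a taxon name
        return frozenset([tree])
    leaves = frozenset().union(*(leaf_sets(child, out) for child in tree))
    out.add(leaves)
    return leaves


def has_split(tree, group):
    """True if `group` vs. the remaining taxa is a split (bipartition) of `tree`."""
    clades = set()
    all_leaves = leaf_sets(tree, clades)
    group = frozenset(group)
    # Accept the group or its complement, so the result does not depend on rooting.
    return group in clades or (all_leaves - group) in clades


# Hypothetical split of interest, taken from the genome tree (((A,B),C),(D,E)).
target = {"A", "B", "C"}

# Hypothetical gene trees: gene1 agrees with the genome tree, gene2 does not.
gene_trees = {
    "gene1": ((("A", "B"), "C"), ("D", "E")),
    "gene2": ((("A", "D"), "C"), ("B", "E")),
}

concordant = [name for name, t in gene_trees.items() if has_split(t, target)]
print(concordant)
```

In this toy case only `gene1` retains the split, so only that locus could be used directly to date it; `gene2` illustrates the situation described above.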
Are there ways to get around this problem?