Pruning off long branches


Dear All,

I am building a pipeline to automatically generate gene trees for about 10,000 CDS alignments (all genes from an exome). The genes were sequenced for 150 individuals from multiple species. Some individuals have worse data than others and occasionally contribute very little sequence to an alignment, so they end up on obviously artificially inflated branches. Is anyone aware of a tool to prune those automatically? (I will also use tools to get rid of poor-quality sequence first, but that’s a different topic.)

Many thanks, Krzysztof Kozak


Hi Krzysztof,

Have a look at the ETE toolkit for tree handling (Python-based). You will have to write a simple script that recognises and prunes long branches. Here is a pruning example.
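To make the "recognises long branches" step concrete, here is a minimal sketch in plain Python. It assumes you have already collected the terminal branch length of every tip into a dictionary; the function name `flag_long_tips`, the tip names and the cutoff of 10 times the median are all hypothetical choices for illustration, not something prescribed by ETE.

```python
from statistics import median


def flag_long_tips(tip_lengths, factor=10.0):
    """Return the names of tips whose terminal branch is longer than
    `factor` times the median terminal branch length of the tree.

    `factor` is an arbitrary illustrative default; tune it on your data.
    """
    cutoff = factor * median(tip_lengths.values())
    return sorted(name for name, length in tip_lengths.items() if length > cutoff)


# Hypothetical terminal branch lengths for one gene tree.
tip_lengths = {
    "ind1": 0.010,
    "ind2": 0.012,
    "ind3": 0.009,
    "ind4": 0.011,
    "ind5": 0.450,  # suspiciously long branch
}

long_tips = flag_long_tips(tip_lengths)
keep = sorted(set(tip_lengths) - set(long_tips))
print("flagged:", long_tips)
print("kept:", keep)
```

With ETE (ete3), the dictionary could probably be filled from `leaf.name` and `leaf.dist` while looping over `tree.get_leaves()`, and the `keep` list passed to `tree.prune(keep)`. A relative cutoff like this adapts to fast- and slow-evolving genes, but it will flag most tips if the median branch length is close to zero, so such trees need separate handling.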

You may also want to look at the phylomeDB phylogenetic tree reconstruction pipeline for ideas on how to improve alignments by using multiple aligners and automatically trimming poorly aligned regions. Finally, ways to improve the phylogenetic reconstruction itself are also described there.

Good luck!


I wrote a simple Python script that prunes branches longer than a certain threshold. See the script “” in The tree libraries and are also included in the same repository; they were written by Stephen Smith.
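For readers who just want the general idea, pruning with a fixed threshold can be sketched as follows. This is not the script mentioned above, only a stand-alone illustration: the `Node` class, the function names and the example tree are hypothetical, and it only tests terminal branches against the threshold.

```python
class Node:
    """Minimal rooted tree node: a name, a branch length and children."""

    def __init__(self, name=None, length=0.0, children=None):
        self.name = name
        self.length = length
        self.children = children or []

    def is_tip(self):
        return not self.children


def prune_long_tips(node, threshold):
    """Return a pruned copy of the subtree rooted at `node`.

    Tips with a branch longer than `threshold` are dropped. An internal
    node left with a single child is collapsed, and its branch length is
    added to that child's. Returns None if no tip survives.
    Merged branches are not re-checked against the threshold.
    """
    if node.is_tip():
        if node.length > threshold:
            return None
        return Node(node.name, node.length)
    pruned = (prune_long_tips(child, threshold) for child in node.children)
    kept = [child for child in pruned if child is not None]
    if not kept:
        return None
    if len(kept) == 1:
        only = kept[0]
        only.length += node.length
        return only
    return Node(node.name, node.length, kept)


def tip_names(node):
    """List tip names in left-to-right order."""
    if node.is_tip():
        return [node.name]
    return [name for child in node.children for name in tip_names(child)]


def to_newick(node):
    """Serialise a subtree as Newick text (without the final semicolon)."""
    label = node.name or ""
    if node.children:
        inner = ",".join(to_newick(child) for child in node.children)
        label = "(" + inner + ")" + label
    return f"{label}:{node.length:g}"


# Hypothetical tree: ((A:0.1,B:0.9):0.05,(C:0.1,D:0.12):0.05);
tree = Node(children=[
    Node(length=0.05, children=[Node("A", 0.1), Node("B", 0.9)]),
    Node(length=0.05, children=[Node("C", 0.1), Node("D", 0.12)]),
])

pruned = prune_long_tips(tree, threshold=0.5)
print(tip_names(pruned))
print(to_newick(pruned) + ";")
```

Here tip B is removed and its sister A inherits the length of the collapsed internal branch. In a real pipeline you would read and write the trees with a proper Newick parser rather than building them by hand.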