The SNAD Sequence Renaming tool


#1

The Sequence Name Annotation Designer (SNAD) tool is very useful for renaming sequences, such as those obtained from BLAST output. Note that characters such as “:” and “(” are part of the Newick tree syntax itself, so they are not allowed in (unquoted) sequence names in a Nexus/Newick tree file. For example, records such as

>gi|260677811|gb|GU046734.1|:1-1695 Influenza A virus (A/mallard/Bavaria/35/2006(H5)) segment 4 hemagglutinin (HA) gene, complete cds
ATGGAGAAAATAGTGCTTCTTCTTGCAATAGTCAGTCTTGTTAAAAGTGACCAGATTTGCATTGGTTACCATGCAAACAA

>gi|301322577|gb|HM849027.1|:1-1695 Influenza A virus (A/mallard/PT/28006/2007(H5N3)) segment 4 hemagglutinin (HA) gene, complete cds
ATGGAGAAAATAGTGCTTCTTCTTGCAATAGTCAGTCTTGTTAAAAGTGACCAGATTTGCATTGGTTACCATGCAAACAA

>gi|148532753|gb|EF597262.1|:1-1684 Influenza A virus (A/mallard/Italy/1980/1993(H5N2)) hemagglutinin (HA) gene, partial cds
ATGGAGAAAATAGTGCTTCTTTTTGCAATAGTCAGTCTTGTCAAAAGTGACCAGATTTGCATTGGTTACCATGCAAACAA

etc…

are renamed to

> A/H5N1/mallard/Bavaria/35/2006_GU046734
ATGGAGAAAATAGTGCTTCTTCTTGCAATAGTCAGTCTTGTTAAAAGTGACCAGATTTGCATTGGTTACCATGCAAACAA

> A/H5N3/mallard/Portugal/28006/2007_HM849027
ATGGAGAAAATAGTGCTTCTTCTTGCAATAGTCAGTCTTGTTAAAAGTGACCAGATTTGCATTGGTTACCATGCAAACAA

> A/H5N2/mallard/Italy/1980/1993_EF597262
ATGGAGAAAATAGTGCTTCTTTTTGCAATAGTCAGTCTTGTCAAAAGTGACCAGATTTGCATTGGTTACCATGCAAACAA

etc…
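
For anyone who would rather script this kind of renaming, here is a minimal Python sketch. It is not how SNAD works internally, just an illustration under some assumptions: the header follows the gi|…|gb|ACCESSION.version| pattern shown above, and the strain is written as (A/…(subtype)). The names rename_header and is_tree_safe are made up for this example. The sketch only rearranges what is already in the header, so an incomplete subtype such as “(H5)” or an abbreviation such as “PT” comes out unchanged rather than being filled in as in the example above.

```python
import re

# GenBank accession between "|gb|" and the next "|", without the ".1" version suffix.
ACCESSION_RE = re.compile(r"\|gb\|([A-Z]+\d+)(?:\.\d+)?\|")
# Strain written as "(A/host/place/number/year(subtype))".
STRAIN_RE = re.compile(r"\(([A-C])/([^()]+)\(([^()]+)\)\)")

# Characters that have a meaning in Newick and so cannot appear in unquoted labels.
FORBIDDEN = set("()[]':;,")


def rename_header(header):
    """Build a 'type/subtype/strain_accession' label from a GenBank FASTA header."""
    header = header.lstrip(">").strip()
    accession = ACCESSION_RE.search(header)
    strain = STRAIN_RE.search(header)
    if accession is None or strain is None:
        raise ValueError("header does not match the assumed pattern: " + header)
    virus_type, rest, subtype = strain.groups()
    return f"{virus_type}/{subtype}/{rest}_{accession.group(1)}"


def is_tree_safe(label):
    """Return True if the label has no whitespace or Newick punctuation."""
    return not any(ch in FORBIDDEN or ch.isspace() for ch in label)


header = (">gi|148532753|gb|EF597262.1|:1-1684 Influenza A virus "
          "(A/mallard/Italy/1980/1993(H5N2)) hemagglutinin (HA) gene, partial cds")
label = rename_header(header)
print(label)                # A/H5N2/mallard/Italy/1980/1993_EF597262
print(is_tree_safe(label))  # True
```

Raising an error on headers that do not match keeps odd records from being silently mislabelled; those can then be fixed by hand.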


#2

Again, apologies for the auto-tooting … DendroPy’s interop.genbank module provides this functionality (for GenBank data, at any rate), with several options for customizing the labels so that they are compatible with most phylogenetic software.