Primate complete genomes: Neanderthal, Denisova, Chimpanzees etc


#1

Over the weekend I was reading Neanderthal Man by Svante Pääbo. The book discusses comparative genomics of humans and their close relatives, including the complete genome of Pan paniscus published in Nature in 2012 and the complete genome of an Altai Neanderthal published in 2014. It is clear from the phylogenetic trees and other comparisons in these papers that the complete genomes can be analyzed, for example to compare diversity within the Major Histocompatibility Complex (MHC) with diversity in other regions of the genome.

However, when I attempt to download the MHC region from each of the genomes to do some of my own comparisons, I find that it is not easy to get the data. The genomes are not well assembled and annotated on the public sites, so I cannot just search for the MHC gene region in the Pan troglodytes, Altai Neanderthal, or Pan paniscus genomes.

Does anyone here know if the annotated genomes are available at some other site? Papers discussing the genomes seem to indicate that the authors have access to annotation to determine which genes and gene families are shared between modern humans and Neanderthals.


#2

I have recently been working on these genomes and, to my knowledge, there are no better annotations available. For genes that are not annotated, the only option is to specify your regions by their genome coordinates.
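To illustrate what working by coordinates looks like in practice, here is a minimal Python sketch that pulls a 1-based, inclusive interval out of a FASTA file. The sequence name `chrToy` and the coordinates are made up for the demo and are not real MHC positions in any assembly; the function names `read_fasta` and `extract_region` are my own. It also loads everything into memory, which is fine for a single chromosome but not ideal for a whole genome.

```python
import io


def read_fasta(handle):
    """Parse FASTA text from a file-like object into {name: sequence}."""
    records = {}
    name = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Store the previous record before starting a new one.
            if name is not None:
                records[name] = "".join(chunks)
            # Use the first word of the header as the sequence name.
            name = line[1:].split()[0]
            chunks = []
        else:
            chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def extract_region(records, chrom, start, end):
    """Return chrom:start-end using 1-based, inclusive coordinates."""
    seq = records[chrom]
    if not (1 <= start <= end <= len(seq)):
        raise ValueError(f"{chrom}:{start}-{end} is outside 1-{len(seq)}")
    return seq[start - 1:end]


# Toy data standing in for a real assembly (hypothetical name and coordinates).
toy_fasta = ">chrToy demo sequence\nACGTACGTAC\nGGGTTTAAAC\n"
records = read_fasta(io.StringIO(toy_fasta))
region = extract_region(records, "chrToy", 9, 13)
print(region)
```

For a real assembly you would pass an open file instead of the `io.StringIO` object, and you would first need the coordinates of the region in that particular assembly, for example from an alignment to the annotated human region. For whole genomes, an indexed lookup such as `samtools faidx` is likely a better fit than reading the entire file.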

In general, for chimps I found that panTro3 is the best annotated and assembled genome (I would work with that one). The Pan paniscus genome is not well annotated, and I don’t know of any better resource for the Neanderthal genome.

It would definitely be great, and worthwhile, to have the genomes of all the hominids well annotated. If anyone has more information, I would also be happy to hear about other potential resources.


#3

Going just a bit deeper, beyond human/Neanderthal/Denisova to the great apes, there is another recent paper that has a lot of good data in this area. The Prado-Martinez et al. paper discusses the sequencing of dozens of great ape genomes (again, none of them readily accessible as finished and annotated genomes in GenBank).

At the other end of this spectrum, some recent papers analyze genomes (or GWAS SNP data) from hundreds of humans, and these are beginning to test, for example, whether previous assumptions about changes in diet affecting selection pressure on genes related to diabetes hold true.