Wednesday, May 21, 2014

The Pseudogene Hall of Fame

For "budget reasons" (supposedly), the Joint Genome Institute now requires all users to be registered and approved before they can use the (taxpayer-funded) https://img.jgi.doe.gov site. Fortunately, my registration was approved and I can use the excellent online genomics tools there, one of which produced the following table of Top Ten Organisms by Pseudogene Count.

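Organism             Genes    Pseudogenes
Mouse                60745    11398
Rat                  38115     9178
Human                38612     7186
(bacterium)           6325     4011
Arabidopsis          31392     3818
(bacterium)           5379     1670
(bacterium)           4760     1589
(bacterium)           5775     1523
(bacterium)           5016     1507
Leprosy bacterium     2750     1086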

Again: Don't expect the above links to work if you're not a registered JGI user. (I don't know if they will work for you or not.) This list was automagically generated by the Department of Energy's Joint Genome Institute, and I thought you might get the same kick out of it that I got. It's eye-opening to see the ratio of pseudogenes to "normal" genes in these organisms. Isn't it?
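If you want to see those ratios as actual numbers, here is a minimal Python sketch that divides each pseudogene count by the corresponding gene count, in table order. Two caveats: the variable and function names are my own, and I'm assuming the "Genes" column can simply be used as the denominator (the table doesn't say whether that figure already includes the pseudogenes).

```python
# (genes, pseudogenes) pairs, in the same order as the table above.
COUNTS = [
    (60745, 11398),
    (38115, 9178),
    (38612, 7186),
    (6325, 4011),
    (31392, 3818),
    (5379, 1670),
    (4760, 1589),
    (5775, 1523),
    (5016, 1507),
    (2750, 1086),
]


def pseudogene_ratio(genes: int, pseudogenes: int) -> float:
    """Pseudogenes as a fraction of the listed gene count.

    Assumption: the listed gene count is a sensible denominator as-is.
    """
    return pseudogenes / genes


ratios = [pseudogene_ratio(g, p) for g, p in COUNTS]

# Print one line per table row: rank, genes, pseudogenes, ratio as a percent.
for rank, ((genes, pseudo), ratio) in enumerate(zip(COUNTS, ratios), start=1):
    print(f"{rank:>2}  {genes:>6}  {pseudo:>6}  {ratio:6.1%}")

# Row (1-based) with the highest pseudogene-to-gene ratio.
highest_ratio_rank = max(range(len(ratios)), key=ratios.__getitem__) + 1
```

Sorting by ratio instead of by raw pseudogene count reshuffles the list considerably, which is part of what makes the numbers interesting.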

Taking the top three spots are mouse, rat, and human. (Note: These counts should be taken with caution, as some have estimated the number of pseudogenes in the human genome to be much higher than the 7,186 shown here.) All the other spots in the chart are bacteria, except for Arabidopsis (a leafy plant). The leprosy bacterium, which I've written about before, comes in tenth place.

If you're not familiar with the concept of pseudogenes, you might want to look at this post. Basically we're talking about genes that are thought to be disabled and no longer functional in the normal sense, although they may well be functional in some as-yet-unappreciated sense. (Otherwise, evolutionary theory says they should have been eliminated from most genomes eons ago.)

Personally, I believe pseudogenes are as much a feature of DNA as regular genes; certainly in higher life forms, they occur in great numbers. The vast majority of bacterial genomes in public databases are shown as having no pseudogenes. I find that (how shall I say?) not at all credible. Some day it will be obvious that almost every genome harbors pseudogenes; we simply lack smart enough software to detect them all right now.