Showing posts with label bifunctional thymidylate synthase. Show all posts

Sunday, March 30, 2014

The primordial nature of phage genes

Recent ecological studies have shown that bacteriophage (viruses that attack bacteria) are numerically the most abundant biological entities on the planet. The estimated 10^30 viruses (mostly phage) in the oceans, if stretched end to end, would span farther than the nearest 60 galaxies. These viruses are thought to cause the turnover, by virus-related death, of 20% of the ocean's biomass per day. In shotgun sequencing of marine samples, the majority of phage gene sequences are invariably found to be novel (not corresponding to any other known gene sequences). Hence, the bulk of genetic diversity on the planet may well be tied up in viral/phage "dark matter."
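The "end to end" figure is easy to sanity-check with a little arithmetic. The sketch below is only a back-of-the-envelope estimate: the particle count comes from the text above, but the 100 nm particle length is my own round-number assumption (real phage vary in size), and the function name is made up for illustration.

```python
# Back-of-the-envelope check of the "stretched end to end" claim.
VIRUS_COUNT = 1e30                  # estimate quoted in the post
VIRION_LENGTH_M = 100e-9            # ASSUMPTION: ~100 nm per particle (not from the post)
METERS_PER_LIGHT_YEAR = 9.4607e15   # standard conversion factor


def chain_length_light_years(count=VIRUS_COUNT, length_m=VIRION_LENGTH_M):
    """Length of `count` particles laid end to end, expressed in light-years."""
    return count * length_m / METERS_PER_LIGHT_YEAR


if __name__ == "__main__":
    print(f"{chain_length_light_years():.2e} light-years")
```

Under that assumption the chain comes out at roughly ten million light-years. A smaller or larger assumed particle size scales the answer proportionally.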

One of the most-studied bacteriophage classes is the so-called "T-even" (T2, T4, T6, etc.) class of phages, of which the poster child, arguably, is T4. These are phages that attack enteric bacteria (E. coli and its relatives), hence are commonly found in sewage.
T4 phage morphology.

T4 is interesting from a number of standpoints, not least of which is its distinctive head/tail morphology (see diagram). In phylogenetic studies, T4 typically shows up in basal positions on trees, meaning it is presumably ancient. Increasingly, viruses and phages are considered to be of primordial origin, possibly predating cellular life. Certainly, any theory on the origin of life has to come to grips with the fact that the major biomolecules (proteins, nucleic acids, lipids) had to exist, in some form, prior to the appearance of the first cell. Some experts suggest nucleic acids and proteins may have interacted with each other in a so-called Virus World scenario (see the excellent paper by Koonin) wherein microscopic hydrothermal pore systems (in mineral formations at the ocean floor) provided for sequestration of prebiotic processes in physical compartments that could be invaded by "selfish replicators."

Primordial interaction of proteins with nucleic acids (and their precursors) presumably gave rise to a number of artifacts that survive today, such as ribosomes (which contain over 50 small proteins in tight association with RNA), tRNA (RNA covalently bound to an amino acid), adenine-containing cofactors (e.g. SAMe, NADPH), and viruses (capsid and other proteins bound to RNA or DNA). Conceptually, one can think of protein/nucleic-acid complexes as having diverged, at the Darwinian Threshold, along two lines: toward ribosomal life, or toward the viral world.
Bacterial cell covered with T4 virions.

The genes for certain viral capsid proteins (with colorful names like Jelly Roll Capsid) are among a number of "viral hallmark genes" that show no homology to any genes from the cellular world. Presumably, some of these genes are of truly primordial ancestry. We have a valuable clue to the origin of at least some of these genes in the case of phage T4. A number of tantalizing reports from the 1970s (see here, here, and here) suggest that the enzymes dihydrofolate reductase and thymidylate synthase (both encoded by T4 DNA) are, in fact, components of the virion baseplate and/or tail structure of T4. Hence, at least in some cases, it's conceivable that virion structural proteins began as enzymes.

What's particularly intriguing about the T4 enzymes is that T4's thymidylate synthase (ThyA), which is phylogenetically ancient, is encoded in the phage DNA immediately downstream of the gene for dihydrofolate reductase, with no intervening "junk DNA." Why is this significant? In many organisms (as I explained in an earlier post), these two enzymes occur in a single large bifunctional enzyme that's proposed to be the result of a gene fusion event. In organisms that have the double enzyme, the reductase occurs at the beginning (the N-proximal end) of the protein.

Just for fun, I took the protein sequence for T4 dihydrofolate reductase and fused it (in Notepad) with the sequence for T4 thymidylate synthase, then did a BLAST search of the fusion sequence against all the protein sequences at UniProt.org. The naturally occurring bifunctional ThyA/dihydrofolate reductase enzymes from peach, balsam, rice, castor bean, and clementine (Citrus clementina) all showed up as hits, with E-values of 10^-67 or better.
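The Notepad step amounts to plain string concatenation, reductase first (matching the N-proximal position it has in natural bifunctional enzymes). Here is a minimal sketch of the same thing in Python. The two sequences are short made-up placeholders, not the real T4 proteins, and the function names and header text are hypothetical; the actual sequences would need to be pasted in before submitting anything to BLAST.

```python
# Build an in-silico "fusion" protein by concatenating two sequences,
# then format it as FASTA for pasting into a BLAST form.

# PLACEHOLDERS ONLY: invented residues, not the real T4 sequences.
DHFR_PLACEHOLDER = "MIKLVFRYSP EGHYA"
THYA_PLACEHOLDER = "mkqyqdlikd\nilenge"


def clean(seq: str) -> str:
    """Strip all whitespace and upper-case a raw protein sequence."""
    return "".join(seq.split()).upper()


def fuse(n_terminal: str, c_terminal: str) -> str:
    """Concatenate two sequences, N-terminal part first."""
    return clean(n_terminal) + clean(c_terminal)


def to_fasta(header: str, seq: str, width: int = 60) -> str:
    """Return a FASTA record with the sequence wrapped at `width` residues."""
    lines = [seq[i:i + width] for i in range(0, len(seq), width)]
    return ">" + header + "\n" + "\n".join(lines) + "\n"


if __name__ == "__main__":
    fusion = fuse(DHFR_PLACEHOLDER, THYA_PLACEHOLDER)
    print(to_fasta("hypothetical_T4_DHFR_ThyA_fusion", fusion))
```

The cleaning step matters more than it looks: sequences copied from a web page often carry line breaks and spacing that would otherwise end up inside the fused string.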

This doesn't prove that the bifunctional enzymes of the peach, etc. came from T4 phage, of course, but it is consistent with the general idea that the bifunctional ThyA/dihydrofolate reductases of algae and protists could (at least in theory) have started out as phage gene fusion products.

Let's put it this way: Weirder things have been known to happen.

Saturday, March 29, 2014

Virus genes don't always come from the host

Usually, when a virus contains a certain kind of gene, and the host contains the same kind of gene, it's assumed the virus got its copy from the host. This is not a terribly safe assumption, however. In some cases it's demonstrably wrong.

For a striking example of how wrong this assumption can be, you need look no further than the tiny Chlorella alga that can be found living symbiotically inside the fresh-water ciliate Paramecium. When it's not living inside Paramecium, Chlorella is subject to infection by PBCV-1 (the Paramecium bursaria Chlorella virus).

Tiny green Chlorella cells can be seen here growing inside Paramecium. The full Paramecium cell is shown in the inset at lower left. (Photo by Charles Krebs.)

Both Chlorella and PBCV-1 have a gene for an enzyme called thymidylate synthase, which is the enzyme that produces thymidine monophosphate (dTMP, or just TMP), a precursor molecule for making DNA. Ordinarily, one would assume that the virus picked up the gene for this enzyme from its host at some point in the past. But there's a problem.

The only thymidylate synthase gene in Chlorella's genome codes for a protein with 508 amino acids. The PBCV-1 virus version of this gene codes for a much shorter protein with only 216 amino acids. It turns out there's a perfectly good explanation for the size difference. Like other small algae (such as Micromonas, Ostreococcus, and Bathycoccus) and certain protozoans as well, Chlorella has evolved a bifunctional enzyme. In Chlorella, the same enzyme acts as both a thymidylate synthase and as a dihydrofolate reductase. In most higher organisms, two different enzymes carry out these functions. Organisms that have the dual-function enzyme are presumed to have developed this capability through a gene fusion event sometime in the (most likely distant) past.

It turns out the PBCV-1 virus synthase not only isn't bifunctional, it carries out its thymidylate reaction by an entirely different mechanism than that used by the host enzyme. The host enzyme employs folate (but no flavins) as a cofactor, whereas the PBCV-1 enzyme is strictly dependent on flavin adenine dinucleotide (FAD), as verified experimentally by Graziani et al. in 2006. We now know that many bacteria use the FAD version of this enzyme (often called ThyX, as distinguished from ThyA, the folate-only enzyme). And the FAD users all have relatively small thymidylate synthases, of about 200 to 300 amino acids.

The above scenario isn't exclusive to Chlorella and PBCV-1. It turns out that certain other small algae (Micromonas, Ostreococcus, and Bathycoccus, all of which happen to be salt-water algae) have a bifunctional thymidylate synthase, yet they are subject to infection by viruses that use the much smaller, mono-functional flavin-binding enzyme.

In all these cases, the virus uses an entirely different style of enzyme than the host to carry out TMP production. There is essentially zero chance that the virus derived its enzyme from the host (or vice versa), because the reaction mechanisms of ThyA and ThyX are radically different. (For more detail on this, see the excellent review article at http://www.ncbi.nlm.nih.gov/books/NBK6401/.) These aren't orthologues; these aren't paralogues; these are entirely different enzymes.

So where did the virus get its thymidylate synthase from, if not the host?

If you take the protein sequence for the PBCV-1 thymidylate synthase and run a BLAST search at UniProt.org, the best non-viral hits (in the range of 58% identities, 77% similarities, E-value 10^-69) are for the thymidylate synthases of Prochlorococcus marinus and other cyanobacteria, with cyanophages also scoring high. This makes a great deal of sense, because the photosynthetic Prochlorococcus and its relatives are thought to be some of the most ancient bacteria on earth (possibly going back 3.8 billion years). They're thought to be the ancestors of chloroplasts. At one point, they were almost certainly the predominant life form in the oceans. Since phycodnaviruses (of which PBCV-1 is a member) are thought to be quite ancient, it's entirely possible they got their thymidylate synthase from cyanobacteria. That's certainly what the protein-sequence evidence suggests.
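For anyone wondering how an "identities" percentage like the one above is arrived at, here is a minimal sketch. It assumes the two sequences have already been aligned (gaps written as "-") and simply counts matching columns over the alignment length. The toy sequences are invented, the function name is hypothetical, and this is not BLAST itself: the "similarities" figure additionally depends on a substitution matrix, which is left out here.

```python
# Percent identity between two PRE-ALIGNED sequences (gaps shown as '-').
# Illustrative only; BLAST finds the alignment itself and also scores
# conservative substitutions, which this sketch does not attempt.

def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Identical, non-gap columns as a percentage of alignment length."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for x, y in zip(aligned_a, aligned_b)
        if x == y and x != "-"
    )
    return 100.0 * matches / len(aligned_a)


if __name__ == "__main__":
    # Toy alignment (made up): 4 identical columns out of 6.
    print(f"{percent_identity('MKT-LV', 'MKSALV'):.1f}% identity")
```

In the toy case, four of the six alignment columns match, so the result is about 66.7%.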

I'll go with the evidence.