Showing posts with label Mycobacterium. Show all posts

Friday, June 13, 2014

Thermometer Genes

Heat shock proteins are an interesting class of proteins that provide "damage control" for enzymes when temperatures rise to the point where proteins start to unfold and refold improperly. Protein 3-dimensional structure is critical to proper enzyme function, and it doesn't take much thermal jostling to mess up a protein's structure. Therefore it's not surprising cells have their own miniature repair factories for refolding heat-misfolded proteins.

Collectively, heat shock proteins are part of a group of proteins known as chaperones, some of which are bona fide refoldases and others of which aid proteins in other ways. (For example, the ClpB protein rescues proteins from an aggregated state.)

GroEL is a so-called type I chaperonin involved in protein folding, assembly, and transport. Like many heat shock proteins, it's over-expressed at high temperatures and plays a critical role in growth and survival at non-permissive temperatures. Because of its importance in many cellular processes, GroEL is ubiquitous in bacteria, with most species having a single GroEL gene, but with about 30% of genomes having two or more GroEL copies.
GroEL mRNA of Mycobacterium intracellulare can fold into the advanced low-energy secondary structure shown here.
The question of how cells up-regulate heat shock proteins during times of thermal stress is still largely open, although we know in some cases specific transcription regulator proteins are involved (but then the question becomes: how do the regulators know to up-regulate in times of heat stress?). The answer might not be that difficult. The messenger RNAs encoding GroEL and other heat shock proteins contain a great deal of secondary structure (that is to say, the RNA folds back on itself to form thermally sensitive structures). Recently, Wan et al. surveyed RNA thermal sensitivity in yeast and found thousands of so-called "mRNA thermometers": RNA molecules that unfold in response to heat. About three quarters of yeast RNA is thermo-stable at 37 degrees C, while around 55% of RNAs are unfolded at 55 degrees. In the folded state, RNA probably requires the help of helicases or other "helpers" to unfold, but at a high enough temperature, the molecules unfold by themselves and become eligible for translation by ribosomes.

To investigate the possible role of mRNA secondary structure in GroEL regulation, I wrote scripts that check a gene for all occurrences of length-9 (so-called "9-mer") nucleotide sequences that have a corresponding reverse-complement sequence in the same gene. When I checked the GroEL gene of Mycobacterium tuberculosis (Erdman strain), I found 14 pairs of complementary 9-mers, representing regions of the gene that could, in theory, cause secondary structure to form in mRNA. A check of the sister organism M. intracellulare (whose GroEL gene is 80% identical to the M. tuberculosis version) showed 22 such complementary pairs.
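
The scripts themselves aren't reproduced in this post, but the 9-mer search is easy to sketch. The Python below is a minimal illustration, not the original code: the function names and the toy sequence are made up for the example, and the rule for what counts as one "pair" (here, a 9-mer followed later in the gene by a non-overlapping copy of its reverse complement) is an assumption.

```python
# Illustrative sketch only; names and the pair-counting rule are assumptions.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string (A<->T, C<->G)."""
    return seq.translate(COMPLEMENT)[::-1]


def find_complementary_kmers(gene, k=9):
    """Return sorted (i, j) start positions where the k-mer at j is the
    reverse complement of the k-mer at i and the two do not overlap."""
    gene = gene.upper()
    # Index every k-mer by the positions where it starts.
    positions = {}
    for i in range(len(gene) - k + 1):
        positions.setdefault(gene[i:i + k], []).append(i)
    pairs = []
    for kmer, starts in positions.items():
        partner_starts = positions.get(reverse_complement(kmer), [])
        for i in starts:
            for j in partner_starts:
                if i + k <= j:  # partner lies downstream, no overlap
                    pairs.append((i, j))
    return sorted(pairs)


# Toy example: a 9-mer, a 5-base spacer, then its reverse complement.
toy_gene = "ATGCCGTAA" + "GGGGG" + "TTACGGCAT"
print(find_complementary_kmers(toy_gene))
```

For the toy sequence this reports a single pair, starting at positions 0 and 14. On a real gene, overlapping hits along one long stem will show up as several adjacent pairs, so the raw count depends on how such runs are merged.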

Interestingly, the mutational differences between GroEL in M. tuberculosis and M. intracellulare do not appear to be randomly distributed along the gene. In M. tuberculosis, mutations occur at a rate of 0.12698 substitutions per site inside 9-mer regions (putative stems) versus a rate of 0.19722 for non-9-mer regions, indicating that (perhaps) selection pressure is different for self-complementing regions than for other regions. I found much the same thing in M. intracellulare, where the mutation rate was 0.16414 inside 9-mers and 0.19347 elsewhere.
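
One way to get per-region rates like these is to walk two aligned gene sequences and tally differences separately for sites inside and outside the 9-mer regions. The sketch below assumes the two genes are already aligned to equal length (gaps written as "-") and that the 9-mer pairs are supplied as start positions in alignment coordinates; the function names and toy inputs are hypothetical, and this is not necessarily how the numbers above were produced.

```python
# Illustrative sketch; assumes pre-aligned sequences and known 9-mer starts.
def stem_positions(pairs, k=9):
    """Expand (i, j) k-mer start pairs into the set of covered sites."""
    covered = set()
    for i, j in pairs:
        covered.update(range(i, i + k))
        covered.update(range(j, j + k))
    return covered


def substitution_rates(gene_a, gene_b, stem_sites):
    """Return substitutions per site for 'stem' and 'other' regions.

    Columns containing a gap are skipped. A region with no usable
    sites gets a rate of None.
    """
    if len(gene_a) != len(gene_b):
        raise ValueError("sequences must be aligned to the same length")
    counts = {"stem": [0, 0], "other": [0, 0]}  # [differences, sites]
    for pos, (a, b) in enumerate(zip(gene_a.upper(), gene_b.upper())):
        if a == "-" or b == "-":
            continue
        region = "stem" if pos in stem_sites else "other"
        counts[region][1] += 1
        if a != b:
            counts[region][0] += 1
    return {
        region: (diffs / sites if sites else None)
        for region, (diffs, sites) in counts.items()
    }


# Toy example with 2-mers so the arithmetic is easy to follow.
sites = stem_positions([(0, 6)], k=2)  # covers positions 0, 1, 6, 7
print(substitution_rates("AAAAAAAAAA", "ACAAAAAAGA", sites))
```

In the toy case one of the four stem sites differs (0.25) and one of the six remaining sites differs (about 0.167).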

Tending to confirm that selection pressure is different for the "secondary structure" regions versus other regions is the (surprising) finding that in M. tuberculosis complementary regions, the ratio of non-synonymous to synonymous mutations (Kn/Ks) is 0.526, versus 0.950 for other regions. In M. intracellulare, likewise, Kn/Ks is less in complementing regions (0.635) than in non-complementing regions (0.975).
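
The first step in any Kn/Ks-style comparison is deciding whether a given codon change alters the amino acid. The sketch below shows only that step, under several assumptions: the two coding sequences are aligned in frame with no gaps, the standard codon assignments apply (bacterial translation table 11 uses the same assignments), and codons that differ at more than one position are skipped. It produces raw counts of synonymous and non-synonymous changes per region, not ratios normalized by the number of synonymous and non-synonymous sites, so it is not a reconstruction of the figures quoted above. All names are hypothetical.

```python
# Illustrative sketch; raw counts only, not normalized Kn/Ks.
from itertools import product

BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
# Standard genetic code, codons enumerated in TCAG order.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_codon_changes(cds_a, cds_b, stem_sites=frozenset()):
    """Tally synonymous / non-synonymous single-base codon changes,
    split by whether the changed base lies in a putative stem."""
    tallies = {
        "stem": {"syn": 0, "nonsyn": 0},
        "other": {"syn": 0, "nonsyn": 0},
    }
    usable = min(len(cds_a), len(cds_b))
    for start in range(0, usable - 2, 3):
        codon_a = cds_a[start:start + 3].upper()
        codon_b = cds_b[start:start + 3].upper()
        if codon_a not in CODON_TABLE or codon_b not in CODON_TABLE:
            continue  # gap or ambiguous base
        diffs = [p for p in range(3) if codon_a[p] != codon_b[p]]
        if len(diffs) != 1:
            continue  # identical codons and multi-base changes are skipped
        region = "stem" if start + diffs[0] in stem_sites else "other"
        same_aa = CODON_TABLE[codon_a] == CODON_TABLE[codon_b]
        tallies[region]["syn" if same_aa else "nonsyn"] += 1
    return tallies


# Toy example: GCT->GCC is silent (Ala), AAA->GAA changes Lys to Glu.
print(classify_codon_changes("ATGGCTAAA", "ATGGCCGAA", stem_sites={5}))
```

Turning counts like these into a proper ratio requires dividing each by the number of sites at which that kind of change is possible, which is what dedicated methods such as Nei-Gojobori do.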

To check whether these observations apply only to Mycobacterium or might be more widely applicable, I took a look at GroEL genes in Clostridium acetobutylicum strain ATCC 824 and Clostridium lentocellum strain DSM 5427. The Clostridia are phylogenetically quite distant from Mycobacteria (as confirmed by the fact that their GroEL genes share only 50% nucleotide sequence identity). A total of 462 mutations separated the two Clostridial genes. But again, the mutations segregated non-randomly according to whether they occurred in putative regions of complementarity (secondary structure) as opposed to non-complementing regions. In C. acetobutylicum the mutation rate in complementing regions was 0.23015 substitutions per site (29/126 bases) versus 0.28618 (433/1513 bases) for non-complementing regions, while in C. lentocellum the rates were 0.19047 (24/126) substitutions per site vs. 0.28949 (438/1513). For C. acetobutylicum the Kn/Ks ratios were 0.277 in 9-mers and 1.466 otherwise. For C. lentocellum, Kn/Ks was 0.538 in 9-mers and 1.415 outside 9-mers, tending to confirm that selection pressures are different in stems than in loops.
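
The Clostridium rates are simple quotients of the counts given in parentheses, so they can be checked directly. The short Python snippet below only re-does that arithmetic from the numbers in the paragraph above; the variable and function names are just for illustration.

```python
# Re-derive the per-site rates from the counts quoted in the text:
# (substitutions, sites) for each region of each species.
COUNTS = {
    "C. acetobutylicum": {
        "complementing": (29, 126),
        "non-complementing": (433, 1513),
    },
    "C. lentocellum": {
        "complementing": (24, 126),
        "non-complementing": (438, 1513),
    },
}


def per_site_rates(counts):
    """Convert (substitutions, sites) pairs into substitutions per site."""
    return {
        species: {
            region: subs / sites for region, (subs, sites) in regions.items()
        }
        for species, regions in counts.items()
    }


for species, regions in COUNTS.items():
    total = sum(subs for subs, _ in regions.values())
    for region, (subs, sites) in regions.items():
        print(f"{species}: {region}: {subs}/{sites} = {subs / sites:.5f}")
    print(f"{species}: total substitutions = {total}")
```

Both species total 462 substitutions, matching the figure above. The printed rates are rounded to five places, whereas the values in the text appear to be truncated, so the last digit can differ by one.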

Bottom line, the data are consistent with a scenario in which secondary structure of GroEL mRNA (and/or ssDNA) plays a role in heat-activation of the gene, such that when temperatures exceed the melting point of secondary structures, the gene is eligible for transcription and/or translation. The gene is, in effect, its own thermometer.


Thursday, April 17, 2014

The Pathogen's Playbook

When comparing pathogenic bacteria with non-pathogenic species of the same genus or family, we often find a common pattern. In the pathogen:
  • The genome is often reduced in size (particularly in endosymbionts, but also in others).
  • The genome is often shifted in the direction of higher A+T content (lower G+C content).
  • Many pseudogenes are present.
  • Often, the pathogen is a slow-grower in pure culture (if it can be cultured at all).
  • The pathogen has special nutritional needs.
An extreme case that illustrates all of these points is Mycobacterium leprae, the leprosy bacterium. It has fewer genes than its cousin, M. tuberculosis (which in turn has fewer genes than non-pathogenic Mycobacteria); its genomic G+C content is 8% lower than most other Mycobacteria; it contains over 1100 pseudogenes; it has a doubling time of two weeks; and it cannot be grown in pure culture (presumably because of fastidious nutritional requirements).

M. tuberculosis can be grown in the laboratory, but it, and its M. avium-group cousins, are very slow growers, taking anywhere from four days to two weeks to develop colonies on solid media.

It seems likely that some pathogens (certainly members of the Mycobacteria, but also the tiny Tenericutes, e.g. Mycoplasma, among many others) have evolved slow growth as a survival strategy. Certainly, organisms that have evolved an intracellular parasitic lifestyle need to be careful not to out-grow the host, if the relationship is to be a long one.

All of the factors listed above suggest a certain scenario, a "pathogen's playbook," if you will, which can be summarized as follows:
  1. The organism invades a warm-blooded host.
  2. Phagocytes (white blood cells) ingest the organism.
  3. The phagocytes undergo a respiratory burst, flooding the microbe(s) with peroxides, hypochlorites, nitric oxide, and other noxious oxidants.
  4. The flood of reactive oxygen species triggers an SOS response in the microbe.
  5. The microbe's DNA undergoes massive damage.
  6. Any surviving microbial cells are now pathogenic.
The SOS response is known to trigger mutagenicity. In Mycobacterium, for example, peroxides (as well as UV light) can induce up-regulation of dnaE, an error-prone polymerase. Since Mycobacteria are known to lack a MutS mismatch repair system, SOS-induced errors in DNA replication will almost certainly include uncorrected frameshift errors leading to the creation of pseudogenes. But that's a good thing, if you're a Mycobacterium interested in forming a long-term relationship with a host cell. The loss of certain genes (as long as they're not essential!) will likely slow your metabolism and make you dependent on host nutrients. Truly non-essential pseudogenes will simply be jettisoned over time, reducing the footprint of the remaining genome. Any pseudogenes that survive will likely have done so because they're now playing an essential gene-silencing role.

Let's expand on that last part. Take the dnaE gene, for example. M. leprae has two copies of this gene, only one of which is functional. Suppose both copies were functional at the time of the massive pseudogenization event that converted so many of M. leprae's genes to pseudogenes 9 to 20 million years ago. After the pseudogenization event (probably a phagocytic respiratory burst), one copy of dnaE became a pseudogene. But continued transcription of the pseudogene in the forward direction means the pseudo-mRNA competes with the "normal" dnaE transcript for ribosomal attention. Transcription of the antisense strand of the disabled gene would, of course, create a messenger RNA product that could silence the normal transcript by double-stranded interaction. Either way, once the pseudogenization event is over, dnaE expression is attenuated, as it should be once pathogenicity has been established.

Is it realistic to think M. leprae transcribes antisense strands of its pseudogenes? Given that E. coli has been found to contain ~1000 antisense transcripts, and given that we know M. leprae transcribes many of its pseudogenes, I think the answer has to be yes.

So the pattern is: infection, respiratory burst, massive mutation, silencing of many genes, and (oh by the way) creation of many brand-new gene products, some of them no doubt quite toxic to the host, as the result of gene truncation and pseudogene expression.

Saturday, April 12, 2014

The Most Deadly Pathogen of All Time

Few bacterial groups have had as great an impact on humankind as the genus Mycobacterium, which encompasses the causative agents of (among other ailments) leprosy, tuberculosis, and Crohn's Disease in humans, and Johne's Disease in farm animals. Leprosy is known from antiquity and continues to strike 200,000 or more people each year worldwide. Tuberculosis, which affects (subclinically) one in three persons worldwide, continues to kill well over a million people a year and has caused a billion deaths in the last two centuries, more than all the wars and genocides of history combined.

The association of M. avium subspecies paratuberculosis (MAP) with Crohn's Disease is still considered controversial by some, but if in fact Koch's criteria have already been met, MAP adds millions more to the toll of human misery caused by Mycobacterial infection.

Colonies of Mycobacterium have a characteristically waxy consistency. Shown here: colonies of M. tuberculosis.

What are these bacteria? Where did they come from? How have they managed to be so successful in causing death and disease?

The prefix "myco" means fungal, but these are not fungi we're talking about. Mycobacteria are soil- and water-borne bacteria that produce an extraordinarily complex cell wall containing not only the usual (for bacteria) peptidoglycans but also:
  • Arabinogalactan
  • Mycolic acids
  • Lipoarabinomannan
  • Extractable lipids including glycolipids, phenolic glycolipids (PGL), glycopeptidolipids (GPL), waxes, acylated trehaloses, and sulfolipids
In contrast to most cell-wall fatty acids (which contain carbon-carbon double bonds susceptible to oxidation), mycolic acids are cyclopropanated and resistant to oxidation, not to mention extremely hydrophobic. The Mycobacterial cell wall thus presents a formidable physical barrier to antibiotics, and it was with considerable dismay that physicians realized, early on, that penicillin would have no benefit in treating tuberculosis. When an antibiotic that could attack M. tuberculosis was finally discovered (streptomycin), it resulted in a 1952 Nobel Prize for Ukrainian American Selman Waksman (although in reality the discovery was made by a post-doc in Waksman's lab, Albert Schatz).

The Mycobacterial cell wall is famously complex, but it also has the curious habit of disappearing entirely, under nutrient-starvation conditions. Like many other bacteria, Mycobacteria can, under certain conditions, shed their cell walls and take on a so-called L-form morphology, in which cells (bounded only by a thin and osmotically vulnerable cell membrane containing just 7% of the usual amount of peptidoglycan) exist as protoplasts which are nonetheless able to reproduce and thrive, producing distinctive colonies on solid media and giving rise, in vivo, to tiny spherules that are often confused with Russell bodies in cancer biopsies. The medical significance of the mysterious L-forms is still debated, after more than 100 years.

The very small red filaments here are cells of Mycobacterium avium living inside lymph-node macrophages in an immunocompromised individual.

One thing most Mycobacterial species have in common is slow growth. Cultures of M. tuberculosis and MAP often require weeks to develop, and M. leprae (which can't be grown in pure culture at all; it can be lab-grown only in the footpads of mice or armadillos) has the longest known generation time of any bacterium, at two weeks.

Ironically, pathogenic strains of Mycobacterium seem to have evolved slow growth as a survival strategy. (This certainly makes them hard to treat with antibiotics. Most antibiotics are effective only in disrupting the growth of actively growing cells.) The lack of DNA mismatch repair enzyme systems (MutS, MutL, and MutH) may be an outcome of the fact that slow DNA replication in these organisms, in and of itself, ensures reasonably high-fidelity replication. On the other hand, lack of a mismatch repair system could be why pseudogenes (genes inactivated due to frameshifts or other errors) abound in Mycobacterial species. M. leprae famously has over 1000 pseudogenes; M. smegmatis strain JS623 harbors over 200 pseudogenes; M. canettii (strain CIPT 140010059) and M. rhodesiae (strain NBB3) both have over 100. (For a good review of Mycobacterial DNA repair systems, see this 2011 paper.)

Unlike Yersinia pestis, the plague organism, which may be less than 20,000 years old (very young in bacterial species time), M. tuberculosis, as a species, appears to be at least 3 million years old, although this number should probably be considered a minimum age, subject to upward revision. (The species was thought to be only 35,000 years old as late as 2002, before a more detailed genetic analysis established the 3-million-year estimate of its age. The numbers should be viewed with caution, however, since they're based on mutation-rate assumptions derived from data for E. coli.)

The question of how M. tuberculosis has managed to achieve its distinctive pathogenic profile is a matter of active ongoing research, and likely will be for a long time. A recent review article reminds us: "The [complete genome] sequence of the pathogen Mycobacterium tuberculosis strain H37Rv has been available for over a decade, but the biology of the pathogen remains poorly understood."

Miscellaneous Links
List of famous T.B. victims—Brontë family, Balzac, Kafka, Thoreau, Kant, Chekhov, Orwell, Schrödinger, Vivien Leigh, Arline Feynman (wife of the famous physicist), the list goes on.
Tuberculosis in Literature and the Arts
The T.B. Blues (Jimmie Rodgers, 1931) This song, famously covered by Leon Redbone (among others), was written by Rodgers after he contracted the disease at age 27. He died eight years later.
World Health Organization TB Stats (landing page)
The Tuberculosis Systems Biology Program