Friday, April 26, 2013

Science on the Desktop

For decades, I've been hoping I'd live long enough to see a day when serious science could be done on the desktop by dedicated amateurs. Amateur astronomers know what I'm talking about. You can't do much particle physics on the desktop, and there are no affordable desktop electron microscopes (yet), but if comparative genomics is your thing? Get ready to rock and roll, my friend.

Over the weekend I discovered http://genomevolution.org and promptly went nuts. Let me take you on a tour of what's possible.

First I should explain that my background is in microbiology, and I've always had a soft spot in my heart (not literally) for organisms with ultra-tiny genomes: things like Chlamydia trachomatis, the sexually transmitted parasite. It's technically a bacterium, but you can't grow it in a dish. It requires a host cell in which to live.

It turns out there are many of these itty-bitty obligate endosymbionts (at least a dozen major families are known), and because of their small size and obligate intracellular lifestyle, they have a lot in common with mitochondria. Which is to say, like mitochondria, they're about a micron in size, they divide on their own, they have circular DNA, and they provide services to the host in exchange for living quarters.

When you look at one of these little creatures under the microscope (whether it's Chlamydia or Ehrlichia or Anaplasma or what have you), you see pretty much the same thing. (See photo.) Namely, a tiny bacterium living in cytoplasm, mimicking a mitochondrion.

When Lynn Margulis wrote her classic 1967 paper suggesting that mitochondria were once tiny bacterial endosymbionts, it seemed laughable at the time, and her ideas were widely criticized (in fact her paper was "rejected by about fifteen journals," she once recalled). Now it's taught in school, of course. But we have a long way to go before we understand how mitochondria work. And we really, really need to know how they work, because for one thing, mitochondria seem to be deeply involved in orchestrating apoptosis (programmed cell death) and various kinds of signal transduction, and until we understand how all that works, we're going to be hindered in understanding cancer.

When I discovered the tools at http://genomevolution.org, one of the first things I did, on a what-the-hell basis, was compare the genomes of two small endosymbionts, Wolbachia pipientis and Neorickettsia sennetsu. The former lives in insects; the latter lives in flatworms (flukes) that infect fish, bats, birds, horses, and probably lots else. Note that for a horse to get Potomac horse fever, the Neorickettsia first has to infect a tiny flatworm; then the flatworm has to be ingested by a dragonfly, caddisfly, or mayfly; then the horse has to eat (or maybe be bitten by, although only infection-by-ingestion has been demonstrated) the worm-infected fly. This parasite-of-a-parasite chain of events is fascinating in its own right, but it also suggests (to me) that parasites enable each other through shared strategies at the biochemical level. I might as well spoil some suspense here by revealing that there's yet another layer of parasitism (and biochemical enablement) going on in this picture, involving viruses. But we're getting ahead of ourselves.

I mentioned Wolbachia a second ago. Wolbachia is a fascinating little critter: it's found in the reproductive tracts of anywhere from 20% to 70% of all insects (plus an undetermined number of spiders, mites, crustaceans, and nematodes), yet it generally causes no disease, and in fact it appears many insects are unable to survive without it. Wolbachia is unusual in that the extracellular phase of its lifecycle (the part where it spreads from one host to another) isn't known; no one has observed it. What's more (and this part is incredible), Wolbachia has adapted to a stem-cell niche: it lives in the cells that give rise to insect egg cells. Thus every egg laid by an infected mother carries Wolbachia, and all of her progeny are born infected (though only the daughters pass it on). In this sense, the genetics of Wolbachia follow mitochondrial genetics (whereby the mother passes on the organelle and its genome).

I quickly found, via Sunday afternoon desktop genomics, that Wolbachia and Neorickettsia (and other endosymbionts: Anaplasma, Ehrlichia, etc.) have many genes in common—hundreds, in fact. And when I say "genes in common," I mean that the genes often show better than 50% sequence identity at the DNA level.

It's important to put some context on this. These little organisms have DNA that encodes only about 1,000 genes. (By comparison, E. coli has around 4,400 genes.) Endosymbionts lack genes for many common metabolic pathways. They cannot biosynthesize amino acids, for example; instead they rely on the host to provide such nutrients ready-made. If 400 to 500 of an endosymbiont's 1,000 genes are shared across major endosymbiont families, that's a huge percentage. It suggests there's a set of core genes, numbering in the low hundreds, that encapsulates the basic "strategy" of endosymbiosis.
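
If you want to play along at home, the bookkeeping behind that "shared core genes" estimate is nothing fancy. Below is a minimal Python sketch of the idea, using invented gene labels as stand-ins; tools like the ones at genomevolution.org establish "same gene" by sequence alignment rather than by matching labels, but the counting at the end works out the same way.

```python
# Toy sketch of the shared-gene arithmetic. The labels below are invented
# placeholders (imagine orthology identifiers assigned to each gene); real
# tools decide "same gene" via alignment scores, not exact label matches.
wolbachia_genes = {"g_atpA", "g_coxB", "g_nuoF", "g_secY", "g_ftsZ", "g_ankX"}
neorickettsia_genes = {"g_atpA", "g_coxB", "g_nuoF", "g_secY", "g_virB4"}

shared = wolbachia_genes & neorickettsia_genes
fraction = len(shared) / len(wolbachia_genes)
print(f"{len(shared)} shared genes, {fraction:.0%} of the Wolbachia set")
```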

A little more context: Mitochondria have their own DNA and look a lot like endosymbionts. But here's the thing: Mitochondrial DNA is tiny (only about 16,500 base pairs in humans, versus a million or more for an endosymbiont). It turns out that 97% of the "stuff" that makes up a mitochondrion is encoded in the nucleus of the host. If you include these nuclear genes, mitochondria actually rely on about 1,000 genes total, of which only 3% are in the organelle's own DNA. Lynn Margulis would say that what happened is this: the endosymbiont ancestor of today's mitochondrion originally had DNA of about a million base pairs (roughly 1,000 genes), but some time after taking up residency in the host cell, the invader's DNA mostly migrated to the host nucleus.
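
Here's a quick sanity check on that 3% figure, using the human numbers I'm fairly sure of (37 genes in the mitochondrial genome itself) and an assumed mid-range estimate for the nuclear-encoded mitochondrial proteome (commonly put at somewhere between 1,000 and 1,500 proteins):

```python
# Back-of-the-envelope check of the 97% / 3% split. The nuclear-encoded count
# is an assumed mid-range estimate, not a measured value.
mtdna_genes = 37            # 13 proteins + 22 tRNAs + 2 rRNAs in human mtDNA
nuclear_encoded = 1200      # assumed; estimates run roughly 1,000-1,500
total = mtdna_genes + nuclear_encoded
print(f"Fraction still encoded in the organelle: {mtdna_genes / total:.1%}")  # ~3%
```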

Why did symbiont-to-host DNA migration stop at 97%? Why not 100%? If we look at that remaining 3%, we find genes coding for tRNAs and ribosomal RNAs (the mitochondrion's own, bacteria-style protein-making machinery), plus genes for subunits of enormous, complex transmembrane enzyme systems: cytochrome c oxidase and NADH dehydrogenase. (The former is the endpoint of oxidative respiration; the latter the entry point.) Obviously it must be advantageous for these genes to remain physically close to the organelle.

But why even have an organelle (a physical compartment)? One might ask why it's necessary to have a mitochondrial parasite swimming around in the cytoplasm at all, when most of its genes are part of the host's DNA. The answer is that the stuff that goes on inside the confines of the mitochondrion needs to be contained, because it's violently toxic stuff involving superoxide radicals, redox reactions, "proton pumps," and Fenton chemistry (transition-metal peroxide reactions). A containment structure is definitely called for, to segregate this toxic chemistry from the rest of the cell.

We might ask how the DNA of the proteobacterial ancestor of today's mitochondria wound up in the host nucleus in the first place. Let's consider the possibilities. The symbiont's DNA may have transferred to the host all at once, or it might have migrated piecemeal, over time. Or both. Is it realistic that huge amounts of endosymbiont DNA could have migrated to the host nucleus all at once? Yes. It's been suggested that vacuolar phagocytosis drove invader DNA to the nucleus in a big gulp. Evidence? Wolbachia inhabits the vacuolar space.

But export of genes and gene products to the host might have occurred piecemeal as well. A little desktop exploration provides some clues. If you use GenomeView or any number of other online tools to explore the DNA of Wolbachia, several things pop out at you. First, many Wolbachia genes are mitochondria-like: they encode things like cytochrome c oxidase, cytochrome b, NADH dehydrogenase, succinyl-CoA synthetase, Fenton-chemistry enzymes, and a slew of oxidases and reductases (including a nitroreductase). Wolbachia is clearly engaged in providing what might be called redox-detox services for the host—the same value proposition that mitochondria offer. This makes sense, because if Wolbachia cells were a net drag on the respiratory potential of the host cell's mitochondria (if they couldn't at least hold their own), the host would die.

The second thing that jumps out at you when you look at the Wolbachia genome is the abundance of genes devoted to export processes: membrane proteins, permeases, type I, II, and IV secretion systems, ABC transporters, etc., plus at least 60 ankyrin-repeat-domain genes—all powerful evidence of specializations aimed at exporting genes and gene products to the host. But the most stunning "smoking gun" of all is the presence, in Wolbachia DNA, of five reverse-transcriptase genes, plus genes for resolvases, recombinases, transposases, DNA polymerases, RNA polymerases, and phage integrases. In essence, there's a complete suite of retrovirus-like machinery, exactly the kind of toolkit needed to move foreign DNA into a host genome.

An example of one of the 113 phage-derived genes in Wolbachia (lower gene array). In this case, the gene matches a phage gene found in Candidatus Hamiltonella (upper gene array). The two versions exhibit 59% DNA sequence similarity, despite widely differing GC ratios. See text for discussion.

But wait. There's more. The third thing that jumps right out at you when you start looking at the Wolbachia genome is the presence of (are you ready?) no fewer than 113 genes for phage-related proteins, including major and minor capsid and HK97-style prohead proteins, plus tail proteins, baseplate, tail tube, tail tape-measure, and sheath proteins; late control gene D; phage DNA methylases; and so on. (For non-biologists: a phage, short for bacteriophage, is a virus that attacks bacteria.)
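
If you'd rather count for yourself than take my 113 on faith, here's roughly how you'd do it on the desktop. This sketch assumes you've downloaded a GenBank-format Wolbachia genome record from NCBI (the filename below is a placeholder), and it relies on crude keyword matching against the "product" annotation, so treat the resulting number as a ballpark rather than a definitive tally.

```python
# Count CDS features whose product annotation looks phage-related.
# "wolbachia_wmel.gb" is a placeholder filename for a GenBank record you
# would download yourself; keyword matching is crude and approximate.
from Bio import SeqIO

KEYWORDS = ("phage", "capsid", "prohead", "tail", "baseplate", "portal",
            "terminase", "integrase")

count = 0
for record in SeqIO.parse("wolbachia_wmel.gb", "genbank"):
    for feature in record.features:
        if feature.type != "CDS":
            continue
        product = feature.qualifiers.get("product", [""])[0].lower()
        if any(word in product for word in KEYWORDS):
            count += 1

print(f"CDS features with phage-like annotations: {count}")
```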

In the above screenshot, I'm comparing Wolbachia DNA (lower strip) to DNA from another insect-infecting endosymbiont, Candidatus Hamiltonella, which is known to carry an intact virus (phage) in its DNA. Many phage proteins in Wolbachia have corresponding matches in the Candidatus genome. In this case, we're looking at a gene (the gold-colored stretch pointed at by red arrows) that is 1,440 nucleotides long, with a 59% sequence match across the two genomes. The match percentage is remarkably high given that the Candidatus version of this gene has a 51.7% GC content while the Wolbachia version has a 40.6% GC content. Also, note that Wolbachia itself has an overall GC content of 34.2%. The fact that Wolbachia's putative phage genes run significantly higher in GC content than its non-phage genes is good evidence that the genes really are phage-derived.
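
Both of the numbers in that comparison (percent identity and GC content) are easy to compute yourself once you have an aligned pair of sequences in hand. Here's a plain-Python sketch, run on short made-up stand-ins rather than the actual 1,440-nucleotide gene:

```python
# Percent identity and GC content from a pre-aligned pair of sequences
# (gaps written as '-'). The sequences are short invented stand-ins.
def percent_identity(aligned_a, aligned_b):
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    return sum(a == b for a, b in pairs) / len(pairs)

def gc_content(seq):
    bases = [b for b in seq.upper() if b in "ACGT"]
    return sum(b in "GC" for b in bases) / len(bases)

seq_a = "ATG-GCTTACGGTGAAGAACTT"
seq_b = "ATGAGCCTACGGCGAAGAGCTT"
print(f"identity: {percent_identity(seq_a, seq_b):.0%}")
print(f"GC content: {gc_content(seq_a):.1%} vs {gc_content(seq_b):.1%}")
```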

It's 100% clear that viral DNA has made its way into the DNA of Wolbachia (either recently or long ago), and it's reasonable to hypothesize that Wolbachia has repurposed the retrovirus-like phage genes for packaging and exporting Wolbachia DNA to the host nucleus.

Okay, so maybe you have to be a biologist for any of this stuff to make your hairs stand on end. To me, it's a dream come true to be able to do this kind of detective work on a Sunday afternoon while sitting on the living-room couch, using nothing more than a decrepit five-year-old Dell laptop with a wireless connection. The notion that you can do comparative genomics and proteomics while watching an Ancient Aliens rerun on TV is (for me) totally cerebrum-blowing. It makes me wonder what's just around the corner.

Monday, April 22, 2013

When Vitamins Turn Deadly

The other day I did something I swore I'd never do: I paid $31.50 to Elsevier for a copy of a scientific paper. I spent a good 30 minutes looking for a free version of the paper online first, of course, using every sneaky trick I know. Then I debated with myself for 30 minutes, saying things like "You are not seriously going to pay those crooks $31.50, are you?" And answering: "No, of course not." "But you have to." "I know." "So do it." "I can't. I'd rather set my face on fire." And on and on.

After an hour I realized, of course, that the time I'd wasted not buying the paper had been worth much more than the paper itself. So I bought the paper. Just this once.

The article in question is called "Stopping the Active Intervention: CARET," by Bowen et al., Controlled Clinical Trials 24 (2003) 39–50. As you might have guessed, I'm going to spend the rest of today's blog talking about it so you don't have to buy it yourself.

The paper I bought talks about the circumstances surrounding the halting of an infamous cancer-prevention study called CARET, otherwise known as the Carotene and Retinol Efficacy Trial. It was a large National Cancer Institute trial that didn't go well.

CARET was actually one of two large NCI vitamin studies launched in the 1980s that didn't go well. The other was the Alpha-Tocopherol and Beta-Carotene (ATBC) Lung Cancer Prevention Study. CARET was carried out in the U.S.; ATBC took place in Finland.

I'm not going to spend a lot of time talking about the studies here (for that, see my latest blog on this subject at Big Think). Suffice it to say that both studies began enrolling participants in 1985, the Finns at a rate of about 9,000 a year, the Americans at 2,000 a year. The ATBC study reached its enrollment goal of just under 30,000 people in three years. CARET went on enrolling for nine years, peaking at 18,314 people in September 1994.

It's customary in large trials to enroll people gradually—not just for practical reasons (it's infeasible to sign up 30,000 people at once) but also because if a treatment proves to be deadly, you want to be able to halt the study before huge numbers of people have been put at risk. Sadly, both ATBC and CARET put large groups at risk. Both should have been stopped early. Only CARET was, and even then its halting came unnecessarily late.

The purpose of ATBC and CARET was to validate the usefulness of antioxidant supplements (Vitamins A and E, chiefly) as cancer-preventive agents. For ATBC, the study population consisted of male Finnish smokers aged 50 to 69. For CARET it was current and former smokers, male and female, aged 50 to 69, plus asbestos-exposed workers (all men) aged 45 to 74. (The asbestos workers made up 22% of the 18,314 study participants.) The experimental design in Finland was a 2x2 factorial design in which a quarter of the participants got 20 mg/day of beta-carotene, a quarter got 50 mg/day of alpha-tocopherol (Vitamin E), a quarter got both, and a quarter got placebo. Thus, treatment groups outnumbered placebo 3:1. In the U.S., CARET began with a 2x2 design, but the design was changed early on so that the bulk of the study population got either placebo or a combination of 30 mg/day of beta-carotene and 25,000 IU/day of retinol (thus a 1:1 ratio of treated to untreated participants).
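
For anyone who finds the 2x2 layout hard to picture, here's a small sketch of the ATBC-style allocation (doses as quoted above). The randomization shown is deliberately simplified; the real trial's blocking and stratification are ignored here.

```python
# Simplified 2x2 factorial allocation: each participant is independently
# randomized to beta-carotene (yes/no) and alpha-tocopherol (yes/no), so
# roughly a quarter land in each cell and only a quarter get pure placebo.
import random

random.seed(1)
cells = {}
for _ in range(29_000):                      # roughly ATBC-sized enrollment
    bc = random.random() < 0.5               # 20 mg/day beta-carotene vs. none
    at = random.random() < 0.5               # 50 mg/day alpha-tocopherol vs. none
    key = ("beta-carotene" if bc else "no BC", "alpha-tocopherol" if at else "no AT")
    cells[key] = cells.get(key, 0) + 1

for cell, n in sorted(cells.items()):
    print(cell, n)
```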

According to the ATBC writeup that eventually appeared in The Journal of the National Cancer Institute, "An independent Data and Safety Monitoring Board convened twice annually to monitor trial progress and to study unblinded data that were relevant to intervention safety and efficacy." Which makes it all the harder to understand why the ATBC trial wasn't halted early when the beta-carotene groups separated from placebo in the wrong direction. It was evident in Year 5 of the eight-year study that people in the beta-carotene groups were developing lung cancer at a heightened rate. In fact, that ended up being the key finding of the study: participants taking beta-carotene developed 18% more lung cancer.

Presumably, the Finns didn't halt their intervention early for the simple reason that the study's stopping criterion (whatever it happened to be; all large studies have them) wasn't met. Perhaps the divergence of treated and untreated group performance was deemed statistically non-significant. We'll never know for sure.

The ATBC results appeared in the April 14, 1994 issue of The New England Journal of Medicine. In "Stopping the Active Intervention: CARET" ($31.50 from Elsevier), we learn from the lead investigator and his coauthors:
On March 25, 1994, the NCI directed investigators of NCI-funded randomized trials involving beta-carotene to inform their data and safety monitoring boards of the ATBC trial findings, to review their data in light of these findings and to develop plans to inform their participants of the ATBC results and incorporate the findings in their consent forms.
This is an interesting statement in a couple of ways. First, it doesn't say exactly who at NCI "directed investigators" to review their data. The implication is that somebody higher up (than the lead investigator) gave an order. Whoever that person was, he or she knew the unblinded Finnish results ahead of publication. Secondly, it says safety boards were directed to "review their data." But in the CARET study (unlike ATBC), the safety board did not have access to unblinded data. How can you review data that's still blinded? And if you did so, what purpose would it serve? You'd look at the data and see that one group of people, under one set of codes, was doing worse than another group under another set of codes. You might very reasonably assume that the group doing worse was the placebo group. But you wouldn't know for sure.

This is where it gets interesting, because in "Stopping the Active Intervention: CARET" ($31.50 from Elsevier) we learn that
On March 29 and April 1 [1994], CARET’s Principal Investigator convened telephone conference calls with the SEMC during which the committee was unblinded to intervention group assignment.  
Note the phrase "the committee was unblinded." In actuality, there was no unblinding over the phone. In a separate writeup in Data Monitoring in Clinical Trials: A Case Studies Approach (Springer, 2006), page 222, former CARET Safety Committee members Anthony B. Miller, Julie Buring, and O. Dale Williams explain that in August 1994—four months after the phone call—"we unanimously agreed that we should be unblinded as to the nature of the regimens given to the coded groups." The unblinding actually happened in August, after statisticians had compiled their interim report for the Safety Committee, not in March.

The account given by the lead investigators in "Stopping the Active Intervention: CARET" makes it sound as if urgent action was undertaken in March 1994 to begin a stand-down of the experiment. That's not what happened. Not by a long shot.

When the Safety Committee got its first look at unblinded data (in August 1994), there was no doubt that CARET's study population was experiencing the same elevated cancer rates seen by the Finns. The Safety Committee took a vote on whether to stop the CARET trial—and found itself divided. Two members of the five-person committee were in favor of stopping CARET; the other three thought it would be better to continue. The rationale for continuing in spite of elevated cancer in the treated group (here I quote from the account given in Data Monitoring in Clinical Trials) was:
  • The statistical significance of the difference had not crossed the O’Brien–Fleming boundary (i.e., this could still be a chance finding).
  • The effect was surprisingly rapid and must mean if real that preexisting (but undiagnosed) lung cancers had had their growth accelerated by the regimen.
  • We knew of no mechanism of the action of beta-carotene that could have induced such an effect.
  • There were other chemoprevention trials using beta carotene ongoing, to stop CARET now would have an undesirable adverse effect on these trials.
  • We owed it to science to be absolutely certain of the adverse effect before stopping the trial.
The two members who favored a stop cited these reasons:
  • This was the second trial to show an adverse effect of beta-carotene chemoprevention; it was extremely unlikely to be due to chance.
  • We owed it to the participants to prevent possible further harm to them. It was perhaps particularly unfortunate that the adverse effect appeared to be present in asbestos workers as well as current smokers.
  • The adverse effects appeared not to be restricted to lung cancer; there appeared to be an adverse effect on cardiovascular disease as well.
Incredibly, a decision was made to continue CARET, on the absurd assumption that once more data had accumulated, the bad trend line would prove to have been nothing more than a statistical fluke. This despite the fact that the Finns had just shown, in a much larger study population, a definite association between beta-carotene supplementation and lung cancer in high-risk individuals; despite the fact that CARET's own data closely replicated those results; and despite the fact that CARET was administering a 50% higher dose of beta-carotene than the Finns had used (possibly putting its participants at even greater risk).
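
For readers unfamiliar with the O'Brien-Fleming boundary invoked in the committee's first bullet point, the idea is that early interim looks demand an extreme result before a trial gets stopped, with the bar relaxing toward the ordinary significance threshold at the final analysis. The sketch below shows only the shape of such a boundary; it uses the unadjusted two-sided 5% critical value as the final bar and ignores the multiplicity correction a real monitoring plan would apply, so the specific numbers are illustrative, not CARET's.

```python
# Rough illustration of an O'Brien-Fleming-type boundary: the z-score needed
# to stop at look k of K scales like sqrt(K/k). The final critical value here
# is the plain 1.96, ignoring the alpha-spending adjustment a real plan uses.
from scipy.stats import norm

K = 5                                  # assumed number of planned analyses
z_final = norm.ppf(1 - 0.05 / 2)       # ~1.96
for k in range(1, K + 1):
    boundary = z_final * (K / k) ** 0.5
    print(f"look {k}: stop only if |z| > {boundary:.2f}")
```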

In September 1995, a year after the first interim analysis, the Safety Committee got a look at updated results. They were even worse than before.

Many memos and meetings and conference calls and cross-country flights by NCI personnel later, a decision was finally reached in mid-December 1995 to pull the plug on CARET. The decision was approved on December 18. Then everyone adjourned for Christmas holidays. Then on January 12, 1996, letters went out (yes, by snail mail) to all CARET participants, informing them of the decision to end the study (and the reasons for the decision).
CARET found increases in all-cause mortality, cardiovascular mortality, and (not shown here) lung cancer mortality in high-risk individuals who took beta-carotene and retinol. (N Engl J Med 1996;334:1150-5.)

When CARET's results were published in May 1996 in The New England Journal of Medicine, the medical world was aghast to learn that administration of beta-carotene and retinol to high-risk individuals actually increased the rate of lung cancer by 28% compared to placebo. Cardiovascular mortality, likewise, went up 26%.

The Finnish ATBC study had found an increase in lung cancer of 18%, based on a beta-carotene dose of 20 mg/day. In the CARET study, participants had taken a 50% higher dose of beta-carotene (30 mg/day). They got roughly 50% more excess cancer (a 28% increase versus 18%).
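
That "roughly 50%" is just back-of-the-envelope arithmetic on the published relative increases, treating them as exact point estimates (which, given their confidence intervals, they are not):

```python
# Dose ratio vs. excess-cancer ratio, taking the published figures at face value.
atbc_dose, caret_dose = 20, 30             # mg/day of beta-carotene
atbc_excess, caret_excess = 0.18, 0.28     # relative increase in lung cancer
print(f"dose ratio:        {caret_dose / atbc_dose:.2f}")       # 1.50
print(f"excess-risk ratio: {caret_excess / atbc_excess:.2f}")   # ~1.56
```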

Hindsight is an evil game, and it's easy to bash NCI for not pulling the plug on CARET earlier than it did without knowing all the particulars. But frankly, what's to know? The Finns published their results in April 1994. Anyone could have looked at those results, then looked at the CARET experimental protocol (and study population), and immediately seen the red flags. Even without knowing the unblinded results, it would have been prudent to halt (or at least begin winding down) the study in April 1994, three and a half years ahead of the planned stop date. Instead, a befuddled Safety Committee (and a bureaucracy-entrenched National Cancer Institute) waited until January 12, 1996 to send letters by first-class mail to people who were unnecessarily dying. In the irritatingly self-congratulatory account of this travesty in "Stopping the Active Intervention: CARET," we're constantly reminded that the study was terminated 21 months early, as if that's something to be proud of. The fact is, the study was terminated at least two years later than it should have been, owing to negligent disregard of the Finnish results (which NCI higher-ups knew about ahead of publication). The Finns, too, should have stopped early, as soon as it became apparent that the treatment groups had diverged (in the wrong direction) from the placebo group, which is to say, in Year 5 of the eight-year study.

To argue that it was okay to wait an extra two years to stop the CARET study because the formal stopping criteria had not been met is a phony argument. When CARET finally was stopped, the stopping criterion had still not been met.

As I said before, I've provided more detail on ATBC and CARET (and other deadly vitamin trials) in a separate post at Big Think. Please read that post before deciding whether to adjust your daily vitamin routine. In the end, you might want to consider scaling back your use of Vitamins E and A if you're at high risk of cancer (and maybe even if you're not at high risk). The evidence says these supplements are harmful for at least some people. I'll post more on the subject here over the next few days.

Please share this story with your social media contacts if you found it helpful. And please see the companion post at http://bigthink.com/devil-in-the-data/the-dark-side-of-antioxidants. Thanks!

Saturday, April 20, 2013

Is Aging Caused by Oxidative Stress?

Everybody wants to know what causes aging, and what can be done about it. As it turns out, we know a lot about aging. But we know very little about how to prevent it.

Of all the theories of aging that have been proposed, the most thoroughly researched, by far, is the Free Radical Theory of Aging, now more properly called the Oxidative Stress Theory of Aging. It's been nearly 60 years since Denham Harman first proposed (in 1956) that senescence is driven by the buildup of oxidative damage to DNA and other biological macromolecules. Since then, research has confirmed repeatedly that as tissues age, they do, in fact, accumulate a wide variety of oxidative damage.

It's because of the Oxidative Stress Theory of Aging that you see so many foods these days labeled "Rich in Antioxidants." Supposedly, antioxidants (like Vitamin E and beta carotene) confer resistance to heart disease and other ailments. In the U.S., food (and supplement) makers are forbidden by law from making specific therapeutic claims about antioxidants. But this restriction is actually a boon for marketers. The nameless benefits that accrue to antioxidants, whatever they are, get conjured in the buyer's mind with far greater force than anything a label or an ad could say, the same way the monster in a horror movie becomes scarier (in the viewer's imagination) when it's never actually shown on screen.

In 2010, the Federal Trade Commission made Kellogg's stop using the "Immunity" claim (tied to antioxidants) on its cereal packaging. The packaging you see here was discontinued. Notice that the "25%" badge seems to imply that there is a "Daily Value" for antioxidants. There is no such thing.

The bogeyman, in this case, goes by the name "reactive oxygen species" (ROS), which covers a lot of ground, chemically. Originally, ROS meant free radicals—breakdown products of peroxides. Which is a perfect bogeyman, because free radicals are so fleeting their concentrations in living cells can't even be measured. (They're so chemically reactive that they last for only a nanosecond or so.) These days, ROS can also refer to superoxides, aldehydes (e.g., formaldehyde), and/or anything else that's reactive and contains oxygen. (Again, a conveniently vague category of "scary stuff.")

The way bogeyman research works in science is, when a new mediator of tissue damage is first identified (for example: nitric oxide, superoxide anion, prostaglandins, leukotrienes, interleukin-6, interleukin-8, tumor necrosis factor alpha) researchers rush to measure it in a host of human and/or animal diseases. Soon, the molecule in question is "implicated" in the pathogenesis of various diseases.

The problem is, finding that XYZ bogey-molecule is implicated in this or that disease is not the same thing as finding that it causes the disease. Free radicals have been implicated in over a hundred diseases (Gutteridge, JMC, "Free radicals in disease processes: a compilation of cause and consequence," Free Radic Res Commun 1993;19:141-58). They're the cause of none of them.

Until recently, it's been impossible to show a cause-and-effect relationship between oxidative stress and aging. We know the two are linked; what we haven't known is whether the link is causal.

Recent work with genetically modified mice may have finally provided some much-needed clarification on the role of oxidative stress in aging. I'm referring to work done by Viviana Pérez and her colleagues at the University of Texas Health Science Center in San Antonio, published in 2009. Pérez looked into the life-extending (or life-shortening) effects of various mutations involving oxidative enzymes in mice, the idea being that if you knock out genes involved in antioxidant defense or oxidative-damage repair, mice should age prematurely (if they live at all), whereas if you amplify or upregulate those genes, mice should show fewer signs of aging (and maybe live longer).

The Pérez results take a while to explain, but they're worth looking at carefully, because they shed much-needed light on the question of whether aging is actually caused by oxidative damage, or is merely correlated with it.

When the Pérez team looked at genetically modified mice that lacked glutathione peroxidase 1 (Gpx1, a major scavenger of intracellular peroxides), they found, surprisingly, that the mice lived to normal age and showed no pathology. However, mice deficient in glutathione peroxidase 4 (genetic marker Gpx4) were embryonic-lethal. The latter finding tends to support Free Radical (Oxidative Stress) Theory.

An enzyme called methionine sulfoxide reductase-A (MsrA) repairs oxidized methionine residues in proteins and may also function as a general antioxidant. Pérez et al. found that mice null for MsrA lived a normal lifespan even though they showed some additional sensitivity to oxidative stress.

Thioredoxin 2 (Trx2) plays an important role in repairing oxidation of cysteine residues in proteins. It turns out Trx2-null mice are embryonic-lethal, but Trx2 +/- (heterozygous) mice, with just one copy of the gene instead of the normal two, had 16% longer maximum lifespan.

There are two major superoxide dismutases that break down superoxides in cells: CuZnSOD and MnSOD (genetic markers SOD1 and SOD2). Pérez et al. found that mice lacking the former suffer a 30% reduction in mean and maximum lifespan. Mice lacking the latter die within days of birth.

To recap so far: Mice null for SOD2 or Gpx4 are non-viable, while those null for SOD1 have 30% shorter lives. These results tend to support Free Radical Theory. But knockout mice lacking MsrA or Trx2 live normal lifespans, which contradicts Free Radical Theory.

How are we to interpret these results? One problem with knockout studies is that if a certain chemical reaction doesn't occur (because the responsible enzyme system is taken away—"knocked out" genetically), it's difficult to know whether resulting harm to the host is due to a buildup of unreacted precursor molecules, or (rather) the absence of crucial end-products. The end-products of the reaction might be vital to downstream metabolism. It might not simply be that the precursors to the reaction are toxic. After all, if hydrogen peroxide (the end-product of superoxide dismutases) is an important signalling molecule, as recent work seems to indicate, you would expect abnormalities in SOD1 or SOD2 to be harmful indeed—for reasons having nothing to do with aging.

Bottom line, the finding that mice lacking SOD1, SOD2, or Gpx4 are unhealthy is not sufficient to vindicate the Oxidative Stress Theory.

The ultimate test for Oxidative Stress Theory would be to see whether mice show fewer signs of aging (e.g., less DNA damage with age)—and actually live longer—when enzymes involved in combating oxidative stress are increased (over-expressed). The Pérez team tried exactly this approach.

As mentioned before, there are two major superoxide dismutases that break down superoxides in cells: CuZnSOD and MnSOD (genetic markers SOD1 and SOD2). When mice were made to over-express SOD1 (so that they had two to five times the normal activity of the CuZnSOD enzyme), the mice were indeed more resistant to oxidative stress as measured by standard tests involving tolerance of paraquat and diquat. But the mice lived no longer than ordinary mice.

The same was observed for mice that over-expressed SOD2.

When Pérez et al. created mice that over-expressed catalase (the enzyme that degrades hydrogen peroxide to water and oxygen), they found the mice were less prone to DNA damage—but lived no longer than normal.

In mice with upregulated glutathione peroxidase 4 (Gpx4), enhanced protection against various kinds of oxidative stress was demonstrated. But the mice lived no longer than normal wild-type animals.

The Pérez group also tried over-expressing more than one antioxidative gene at once. No combination produced any lifespan extension.

To recap: mice do not live longer when they over-express antioxidant enzymes (singly or in combinations), even though they show heightened protection against DNA damage, lipid damage, and other typical signatures of oxidative stress.

Pérez et al. concluded:
We believe the fact that the lifespan was not altered in the majority [of] the knockout/transgenic mice is strong evidence against oxidative stress/damage playing a major role in the molecular mechanism of aging in mice.
It's hard to disagree with that conclusion. Some of the genetic manipulations Pérez et al. tried were inspired by fruit-fly experiments that gave much more encouraging results. But mammals are not fruit flies. And it seems unlikely to me that the results reported by Pérez et al. are some kind of fluke, limited to mice. (It seems unlikely that entirely different results would be found in humans.) Altogether, Pérez et al. tried 18 different genetic manipulations. Not one extended the life of mice.

To me, it means we can put the oxidative-stress bogeyman to bed now, and go on to worry about other things. Whatever's keeping us from living to be 120, it's not oxidative stress.



For more on this subject, see my recent post at Big Think.

Thursday, April 18, 2013

Hydrogen Peroxide and Scientific Dogma

The catalase test is simple: Add a drop of hydrogen peroxide to a sample of bacteria on a microscope slide and see if it fizzes. Anaerobes (even aerotolerant ones, such as Streptococcus pyogenes) won't fizz because they lack catalase. Aerobic bacteria produce oxygen bubbles, as on the right.
I remember the first time I was introduced (as a young bacteriology student) to the catalase test. You scrape a colony of bacteria off the surface of an agar dish, rub it onto a microscope slide, then take an ordinary eyedropper filled with 3% hydrogen peroxide and drop a big fat drop of liquid onto the little slimy smudge of bacteria. If the smudge begins to fizz vigorously, like Alka Seltzer, the bacteria are catalase-positive. If no fizzing happens, they're catalase-negative.

The fizzing happens because of an enzyme called catalase that promotes the conversion of hydrogen peroxide to water and molecular oxygen:

2 H2O2 → 2 H2O + O2

For the last hundred years, every student of bacteriology has been taught that the reason aerobic bacteria (and indeed all aerobic life forms, up to and including humans) have catalase is that hydrogen peroxide is severely toxic and must be gotten rid of, lest it form highly reactive hydroxyl radicals:

H2O2 → •OH + •OH

The •OH radicals, being extremely reactive chemically, will attack just about anything: DNA, RNA, proteins, lipids, mucopolysaccharides, what have you. Hydroxyl radicals are toxic. Catalase provides a way of neutralizing peroxides so that no radicals can form.

Anaerobic organisms, the story goes, lack catalase because they live in oxygen-poor environments where things like peroxides don't form. So-called strict anaerobes are actually killed by exposure to air. The reason they die when they come in contact with air (supposedly) is that in the presence of oxygen they experience an endogenous buildup of peroxides that (in the absence of catalase) go on to form toxic free radicals that, in turn, eventually cause damage to DNA, proteins, lipids, and other macromolecules.

That's the official dogma on peroxides, free radicals, and catalase.

The only trouble is, as with so much other dogma in this world, it's completely wrong.

And it would all be harmless prattle if it just applied to bacteria. But unfortunately, this bit of assumption-laden dogma about peroxides and free radicals leads to some rather fanciful notions about the role of "oxidative stress" in ordinary metabolism. The notion that peroxides and free radicals (so-called Reactive Oxygen Species) are harmful has led to the spending of billions of research dollars on "oxidative stress" and ways to stave off "oxidative damage" to DNA, proteins, etc. It has led to billions of dollars of misleading advertising around foods "rich in antioxidants." (And it has actually led to clinical trials of antioxidants like beta carotene and Vitamin E in which people died needlessly, something I'll discuss in more detail in a future post. For now, you might want to refer to this paper.)

Suppose that, rather than accepting the catalase-as-detoxifier/peroxides-as-evil theory at face value, we ask some fundamental questions. Such as:

  • What's the evidence that anaerobes exposed to air actually die of peroxide poisoning? 
  • If peroxides are poisonous, why are there no anaerobes that have catalase? (If there's survival value for aerobes to have catalase, surely there's even more survival value for an anaerobe to have it?)
  • If catalase exists to protect aerobic cells from peroxide poisoning, then we should expect catalase-knockout mutations to be lethal, or at least gravely deleterious, yes?

Let's review some basic facts. First, hydrogen peroxide is not toxic at low concentrations. It is, in the words of one researcher, "poorly reactive: it does not oxidize most biological molecules including lipids, DNA, and proteins" (Halliwell et al., "Hydrogen peroxide: Ubiquitous in cell culture and in vivo?", IUBMB Life, 50: 251–257, 2000, PDF here). Concentrated hydrogen peroxide is toxic (it's a disinfectant), but at the dilute concentrations found in living cells, hydrogen peroxide isn't doing anything harmful to DNA, proteins, or lipids. The situation is analogous to that of hydrochloric acid. At high concentrations, HCl will eat through skin. Put a couple liters into an 80,000-liter swimming pool, though, and you can drink the stuff. So it is with peroxide.
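
For the skeptical, here's the arithmetic behind the swimming-pool analogy, assuming ordinary concentrated HCl at about 12 molar. The resulting pH lands in soft-drink territory, which is the point of the analogy; it's an illustration of dilution, not a recommendation.

```python
# Dilution arithmetic for the swimming-pool analogy (assumes ~12 M concentrated HCl).
import math

moles_hcl = 2 * 12.0                       # two liters of ~12 M HCl
pool_liters = 80_000
molarity = moles_hcl / pool_liters         # mol/L after mixing
ph = -math.log10(molarity)
print(f"~{molarity * 1000:.1f} mM HCl, pH ≈ {ph:.1f}")   # roughly soft-drink acidity
```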

Secondly, peroxides are ubiquitous in living systems (again see the Halliwell paper). In higher life forms H2O2 is produced in vivo by monoamine oxidase, xanthine oxidases, various dismutases, and other enzymes, under homeostatic control. There's substantial evidence that hydrogen peroxide is a widely used signalling molecule (see references 21 to 26 in the Halliwell paper) and recent work has shown a role for hydrogen peroxide in reparative neovascularization. Recruitment of immune cells to wounds likewise appears to require hydrogen peroxide.

Far from being toxic, hydrogen peroxide is an important biomolecule, essential to ordinary metabolic processes. There is zero evidence that hydrogen peroxide does anything harmful in living tissues, at the concentrations normally found in vivo.

If hydrogen peroxide were toxic, we'd expect that a mutation that knocks out catalase would mean certain death for the host organism. After all, with no catalase to break down H2O2 to oxygen and water, hydrogen peroxide would simply accumulate until reaching toxic levels. But it turns out, naturally occurring catalase-negative mutants of Staphylococcus aureus have been reported (laboratory-created catalase-negative mutant strains of E. coli and other organisms are known as well). Catalase-knockout mice have been created, and they develop normally. Humans lacking normal catalase were first identified in the early 1950s when a Japanese doctor found that pouring hydrogen peroxide onto a patient's infected gums caused no foaming. The catalase-negative condition in humans is known as acatalasemia or acatalasia. It results in no pathology except an increased tendency toward periodontal infection.

Catalase does serve an important function in aerobic organisms (having nothing to do with detoxification). With catalase, the oxygen in hydrogen peroxide can be scavenged for (re)use in respiration. This is important because the aerobic breakdown of a single glucose molecule (which consumes six molecules of oxygen) yields on the order of 38 ATP molecules. (ATP, adenosine triphosphate, is the high-energy molecule that powers the chemical machinery of cells. Without ATP, metabolism grinds to a halt.) By comparison, anaerobic breakdown of glucose (which is to say, fermentative breakdown) yields only 2 ATP molecules per sugar molecule. It's very much in the cell's interest to recycle the oxygen from hydrogen peroxide rather than let it go to waste. Catalase makes that reuse possible.

Anaerobic bacteria obtain energy solely from fermentation. They have no use for oxygen. Therefore they have no use for catalase. A catalase gene would simply be extra genetic baggage for an anaerobe. It would confer no survival value.

It's astonishing to me that the catalase myth (the version of the myth that says catalase exists to detoxify hydrogen peroxide so as to keep the cell from dying of free-radical-induced damage) has survived for well over a hundred years without anyone questioning it. I see the myth repeated all over the Internet as if it's Gospel. Never do I see any substantiating research referenced in support of it. It's just propagated from one unquestioning drone to another.

Why? Why do myths like this take hold in science? Why do otherwise intelligent scientists cling to them and perpetuate them, regurgitating them in textbooks and handing them down to new generations of students?

I think the answer is, because it makes a good story, and as human beings we value a good story more than we value checking out the story to see if it's true (providing it's a suitably satisfying story).

Before there was science, stories were all the human race had as a way of trying to understand the universe. If someone told a good enough story, and  the story provided a satisfying-enough explanation of something, the story endured. Some of humankind's most cherished stories have survived for thousands of years. They survive, in many cases, even if they're not verifiably true.

With science, the theory is the unit of storytelling. If a theory seems to fit the facts, it's accepted. If, on closer inspection, a story is found not to fit the facts, it will still be accepted by many people, if it's a satisfying enough story. 

The ancient Greeks knew the earth was not flat. (According to Diogenes Laertius, "Pythagoras was the first who called the earth round; though Theophrastus attributes this to Parmenides, and Zeno to Hesiod.") Nevertheless it took centuries for flat-earth theory to fall into disrepute, and even to this day there are people who find flat-earth theory satisfying.

So it is with scientific theories. Good stories (and that's all scientific theories are: stories) have staying power. Even when they're wrong.


For more on the Oxidative Stress Theory of Aging (and why it's wrong), see my blog at Big Think.


Tuesday, April 09, 2013

Spontaneous Recovery from Depression

Emil Kraepelin (1856-1926), who coined the term "manic depressive," found that in contrast to patients suffering from dementia praecox (schizophrenia), those suffering manic depression had a relatively good prognosis, with 60% to 70% of patients suffering only one attack and attacks lasting, on average, seven months.

Modern drug trials for antidepressants seldom take into account the fact that people with depression often get better on their own. The typical randomized controlled trial (RCT) has a placebo arm and a treatment arm, but no non-placebo/non-treatment arm (otherwise known as a wait-list arm). It's commonly assumed that people who get better on placebo, in drug trials, are experiencing the placebo effect, when in reality a certain number of people would have gotten better on their own, even without the placebo. Hence, the placebo effect is almost certainly overstated.

But do people really get better on their own? In the Netherlands, researchers looked at the progress of 250 patients who had reported an episode of major depression. Two thirds of the patients were female and for 43%, it was a recurrent episode. Some patients sought treatment at the primary-care level; others sought mental-health-system care; others sought no care. The researchers found that the overwhelming majority of patients recovered (defined as "no or minimal depressive symptoms in a 3-month period"), regardless of the level of treatment.

A 2002 study in the Netherlands found that people with depression tend to get better regardless of level of care. See text for discussion.

The median duration of major-depressive episodes was 3.0 months for those who had no professional care, 4.5 months for those who sought primary care, and 6.0 months for those who entered the mental health care system. (The paper doesn't say what percentage of patients who sought care took meds, but for this discussion it doesn't matter. The point is, most people do get better, one way or another.) The differences in median episode duration may reflect severity (no data were given on this). The people who recovered quickly on their own may have done so because they were less depressed. It stands to reason that those who sought help at the mental-health-system level were probably more depressed, hence took longer to recover.

In any case, the point is that today, as in Kraepelin's time, many depressed patients recover, with or without medical intervention, because that's the nature of the illness. It comes and it goes.

You can find the above study online at http://bjp.rcpsych.org/content/181/3/208.full. The full reference is Spijker et al., "Duration of major depressive episodes in the general population: Results from the Netherlands Mental Health Survey and Incidence Study (NEMESIS)," The British Journal of Psychiatry (2002) 181: 208-213.


Monday, April 08, 2013

When Psychotherapy Goes Wrong

When talk therapy works, it can work wonders. When it doesn't work (which is a fair amount of the time), all bets are off, because anything can happen.

While psychiatric drugs are required to undergo efficacy testing before being released to the public, no efficacy requirements are placed on non-drug therapies. Talk therapy is just assumed to work for most people, and new therapies are routinely rolled out willy-nilly on live patients with no oversight by anyone and no guarantee of safety, much less efficacy.

That's not to say some therapies haven't been subjected to controlled testing. In recent years, Cognitive Behavioral Therapy (CBT), in particular, has been the subject of hundreds of published trials. There's reason to believe, however, that many of the CBT trials are biased and that (as with drug trials that don't come out the way the researchers wanted) unflattering trials simply go unpublished.

Cuijpers et al. in 2010 found substantial reason to suspect publication bias in 175 talk-therapy trials. Notice the asymmetry in this funnel plot.

In a 2010 paper in The British Journal of Psychiatry ("Efficacy of Cognitive Behavioral Therapy and Other Psychological Treatments for Adult Depression," 196:173-178, non-paywall-protected version here) Cuijpers et al. looked at published trials that made 175 comparisons between particular talk therapies and a control group. The funnel plot for these 175 trials (see graphic) strongly suggests publication bias. The majority of the trials (92) involved CBT.
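
If you've never met a funnel plot, here's a toy simulation (not the Cuijpers data) showing what the telltale asymmetry looks like when small trials with unimpressive results quietly go unpublished. Every number in it is made up for illustration.

```python
# Simulated funnel plot: each point is one "trial" (effect size vs. standard
# error). The crude file-drawer rule below suppresses small trials with weak
# results, producing the asymmetry that suggests publication bias.
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)
true_effect = 0.4
se = rng.uniform(0.05, 0.45, 175)               # one standard error per trial
effects = rng.normal(true_effect, se)           # observed effect sizes
published = (effects / se > 1.0) | (se < 0.15)  # keep precise trials or "promising" ones

plt.scatter(effects[published], se[published], s=12)
plt.axvline(true_effect, linestyle="--")
plt.gca().invert_yaxis()                        # convention: most precise trials at the top
plt.xlabel("Effect size (e.g., Cohen's d)")
plt.ylabel("Standard error")
plt.title("Funnel plot (simulated): asymmetry suggests publication bias")
plt.show()
```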

With drug trials, we usually think in terms of a med either helping or not helping. In reality, patients tend to follow one of three trajectories: the drug helps, it does nothing, or it actually makes the condition worse. (These trajectories exist for placebos as well.) Individual trajectories typically aren't reported in the literature (unless they culminate in adverse events, like suicide). Instead, they're lumped together into an overall score that shows yea-many-points average improvement on the Hamilton scale (or whatever), for the treatment arm as a whole.

It's difficult to say how often therapy goes off the rails. But a meta-analysis of 475 talk-therapy outcome studies (reported in Smith, Glass, and Miller, The benefits of psychotherapy, Baltimore: Johns Hopkins University Press, 1980) found that 9% of the time, effect sizes were negative, meaning patients got worse in therapy. Shapiro & Shapiro found much the same thing in "Meta-analysis of comparative therapy outcome studies: A replication and refinement," Psychological Bulletin, 1982, 92, 581–604.

In a 2006 paper, Charles M. Boisvert and David Faust ("Practicing Psychologists’ Knowledge of General Psychotherapy Research Findings: Implications for Science–Practice Relations," Faculty Publications. Paper 42, available here) tracked down the 25 most highly cited researchers in the Handbook of Psychotherapy and Behavior Change (4th ed.; Bergin & Garfield, 1994) and asked each one to agree or disagree with a number of assertions, including the statement: "Approximately 10% of clients get worse as a result of therapy." The average response to that statement was 5.67 on a scale of 1 to 7, where 1 meant "I'm extremely certain that the assertion is incorrect" and 7 meant "I'm extremely certain that the assertion is correct." (For comparison's sake, the statement "Therapy is helpful to the majority of clients" scored just 6.33.)

While 10% seems to be an accepted ballpark figure for iatrogenic talk-therapy outcomes, the true number could be as high as 30% (see this paper).

If we count incorrect diagnosis as a form of patient harm, two of the most harmful diagnostic tools in psychiatry are the Rorschach Test (the inkblot test) and the Thematic Apperception Test. Around 70% of normal individuals who take the Rorschach Test score as if they're seriously disturbed (Professor James M. Wood, Univ. of Texas, quoted here). In 2000, an extremely thorough meta-analysis found that the Rorschach Test, the Thematic Apperception Test, and human figure drawing have so many problems, not just with repeatability but with basic validity, that they basically shouldn't be used any more.

In the much-cited "Psychological Treatments That Cause Harm," Perspectives on Psychological Science, March 2007 2(1):53-70 (PDF here), Emory University's Scott Lilienfeld identifies a number of different types of therapy that have been shown to have the potential to hurt patients more than they help them. Potentially harmful therapies identified by Lilienfeld include the following:

Therapy: adverse outcome(s) (source of evidence)
  • Critical incident stress debriefing: heightened risk for post-traumatic stress symptoms (RCTs)
  • "Scared Straight" interventions: exacerbation of conduct problems (RCTs)
  • "Facilitated Communication": false accusations of child abuse against family members (low base rate events in replicated case reports)
  • Attachment therapies (e.g., rebirthing): death and serious injury to children (low base rate events in replicated case reports)
  • Recovered-memory techniques: production of false memories of trauma (low base rate events in replicated case reports)
  • DID-oriented therapy: induction of "alter" personalities (low base rate events in replicated case reports)
  • Grief counseling for bereavement: increases in depressive symptoms (meta-analysis)
  • Expressive-experiential therapies: exacerbation of painful emotions (RCTs)
  • Boot-camp interventions for conduct disorder: exacerbation of conduct problems (meta-analysis)
  • DARE programs: increased intake of alcohol and other substances (RCTs)

DID means Dissociative Identity Disorder; DARE refers to Drug Abuse Resistance Education.

Lilienfeld is quick to point out that even a therapy that's not bringing about actual deterioration in a patient's condition can be harmful if it delays seeking out a more effective therapy. For example, it's well known that behavioral therapies tend to be more effective than nonbehavioral therapies for obsessive-compulsive disorder, generalized anxiety disorder, and phobias. Some patients have spent years trying to cure a phobia, only to find that by switching therapists (and using a more appropriate therapy) the phobia becomes manageable after one visit.

Lilienfeld and others have noted that negative outcomes tend to be far more frequent in treatment programs aimed at adolescents than those aimed at adults, especially in group-oriented programs, where social effects can overwhelm the therapy. Many examples of shockingly negative outcomes in youth programs can be found in the literature (start by reading this excellent paper by Rhule). Perhaps the most celebrated disaster (in the U.S., at least) is the Scared Straight program, which has repeatedly been shown to be counterproductive (actually increasing the odds of kids going to prison) yet has been rolled out to dozens of U.S. cities and continues to be the basis of popular TV shows. (See this meta-analysis.)


Additional Reading
See this unusual blog post that stirred an amazing 472 commenters to give their experiences with therapy gone awry. In the comment trail you'll find scores of fascinating outbound links to YouTube videos, books, forum discussions, and other resources.

If you read only one academic paper on this subject, be sure it's Scott Lilienfeld's seminal 2007 paper, "Psychological Treatments That Cause Harm."

If you've been abused in therapy: http://www.therapyabuse.org.

Bates, Yvonne, editor (2006) Shouldn’t I Be Feeling Better By Now? Client Views of Therapy. Houndmills, Basingstoke, Hampshire, UK: Palgrave Macmillan. A book that presents a variety of views from a variety of kinds of patients.

Sunday, April 07, 2013

What Doctors Don't Know about the Drugs They Prescribe

In this video, Dr. Ben Goldacre explains to a TED audience why publication bias has brought medical research to a crisis point that must be addressed immediately. If you enjoy the video, please share with others via social media and also visit alltrials.net to sign the AllTrials petition.

Saturday, April 06, 2013

The Mother of All Depression Studies: STAR*D

Given the amount of public funding ($35 million) that went into the six-year STAR*D study, it's utterly astounding that more people haven't heard of it. The Sequenced Treatment Alternatives to Relieve Depression study was easily the single largest study of its kind, ever, involving over 4,000 patients at more than 40 treatment centers.

Well over 100 scientific papers came out of the study (so it's not like you can assimilate the whole thing by reading any one article), and most are behind a paywall, which is outrageous when you consider that the entire study was funded by U.S. taxpayer dollars. Robert Whitaker, author of Mad in America and Anatomy of an Epidemic (superb books, by the way) has put one of the summary papers up on his web site. Be warned, however, that the STAR*D findings are subject to many interpretations. See this paper by H. Edmund Pigott for added perspective. Also see these STAR*D documents.

Some of the STAR*D researchers (e.g., M. Fava, S.R. Wisniewski, A.J. Rush, and M.H. Trivedi) went on to publish upwards of 15 papers a year (sometimes 20 a year) on the STAR*D results. How do you write that many papers? Short answer: with professional help. At least 35 STAR*D papers were, in fact, ghostwritten by freelance tech writer Jon Kilner, who lists the papers as "writing samples" on his web site.

The STAR*D study was unique in a number of ways. First, it was an uncontrolled study (which of course puts serious limitations on its validity). Secondly, it involved real-world patients in clinical practice, not paid volunteers. Thirdly, it tested eleven distinct "next step" treatment options, mostly involving drugs but also including talk-therapy arms. Fourthly, the primary outcome measure was remission (a fairly stringent measure of progress).

The idea of the study was to offer patients suffering from major depression intensive treatment with a well-tolerated first-line SSRI (namely Celexa) to see how many patients would go into remission with drug treatment alone; then offer those who didn't improve on Celexa a second line of treatment (involving various drug and talk-therapy options); then offer those who didn't respond to the second line of treatment a third type of treatment; and finally, offer a fourth type of treatment to anyone left.

Participants were given free medical care throughout the study. All drugs were provided at no cost by the respective manufacturers (including Viagra for those who wanted it) and patients were offered $25 as an incentive for completing follow-up assessments.

STAR*D study design. While 4,041 patients were initially enrolled, only 3,671 participants actually entered the first level of the study, which involved taking citalopram (Celexa), an SSRI chosen for its supposed high efficacy and low level of side effects. Patients who failed to get better in Level 1 could progress to Level 2, then Level 3 and finally Level 4. Attrition was a huge problem, despite free drugs, free medical care, and free Viagra.

The overall study design (and the numbers of participants at each level) can be seen in the accompanying flow chart. Note that every participant got Celexa in the first level of the study. Any patient who wasn't happy with his or her current treatment option was allowed to progress to the next level of the study at any time, and the choice of treatment option was up to the patient. (All meds were open-label.) This proved to be disastrous for the cognitive therapy (CT) arms of the study, which saw only 101 people make it through to the end. (Not all of the 147 patients who started CT finished.) So few CT patients finished, and so many of them started taking other drugs, that no scientific papers about the talk-therapy part of the STAR*D study were ever published.

Participant attrition was a big problem for the study. Around half (48%) of the study population dropped out early, despite various inducements to stay.

What were the study's main findings? STAR*D researchers reported that for the "intent to treat" group(s), remission rates were 32.9% for Level 1 of the study, 30.6% for Level 2, 13.6% for Level 3, and 14.7% for Level 4. These numbers are likely inflated (read Pigott's analysis to learn why), but even assuming they're right, they point to very disappointing effectiveness of conventional treatments for depression. This is especially true when you consider that studies on the natural course of depression show that half of all patients recover spontaneously within three months (see this study) to twelve months (this study), whether treated or not.
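
To see how sequential per-level numbers like these can be made to sound better than they are, here's a back-of-the-envelope Python sketch (my own illustration, not a calculation from any STAR*D paper) that chains the four reported remission rates under the rosy assumption that every non-remitter moves on to the next level and nobody drops out:

# Per-level remission rates as reported by the STAR*D researchers.
level_rates = [0.329, 0.306, 0.136, 0.147]

# Rosy assumption (for illustration only): every non-remitter proceeds
# to the next level, and nobody drops out along the way.
still_depressed = 1.0
for rate in level_rates:
    still_depressed *= (1.0 - rate)

print("Theoretical cumulative remission: {:.1%}".format(1.0 - still_depressed))
# Prints roughly 65.7% -- a far sunnier number than reality, given that
# about half the participants dropped out and most remitters relapsed.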

Another major finding of STAR*D was that no one treatment strategy (no one drug or combination of drugs) stood out as doing much better than any other. Switching from Celexa to Wellbutrin (bupropion), for example, led 26% of patients either to remit or to get better on their assessment survey. Patients who were non-responsive to an SSRI (Celexa) and switched over to an SNRI (Effexor) got better 25% of the time. But 27% also got better when switched from the original SSRI (Celexa) to a different SSRI (Zoloft).

Another finding was that very few of the patients who scored a remission stayed in remission. The overwhelming majority of patients who got better eventually relapsed or dropped out. This fact was glossed over in all of the original STAR*D studies, then brought up by H. Edmund Pigott and others, who analyzed the data behind Figure 3 of this paper. It turns out that of 1,518 patients who remitted and stayed in the study, only 108 patients (7.1%) failed to relapse. This is a stunning indictment of the effectiveness of available treatment options for depression.

Space prohibits a thorough critique of the STAR*D trial here, but suffice it to say there were numerous problems not only with the study design itself (e.g., the study was uncontrolled, and patients were allowed to take anxiolytics, sleep aids, and other psychoactive drugs during their "treatment," confounding the results) but also with the way the results were spin-doctored in the literature. If you're interested in the gory details, you can read an excellent critique here and also here. There's also an interesting paper behind Springer's paywall, mentioned in this Psychology Today blog, and a worthwhile Psychology Today post here.

Personally, I don't think you have to do much critiquing of the STAR*D results to see how dismal the findings were. After six years of work and $35 million spent, NIMH only confirmed what all of us already knew, which is that modern antidepressants aren't terribly effective and it doesn't much matter which one you use, because they all perform equally poorly.

But don't worry. Big Pharma will come up with something new and improved Real Soon Now.

Friday, April 05, 2013

BRAIN: A Penny for Your Thoughts

President Obama's kickoff of the BRAIN initiative was a major news item the other day. Widely lauded as the kind of program that can keep America at the forefront of science, Brain Research through Advancing Innovative Neurotechnologies (or, if you prefer, Big Ridiculous Acronyms Inspired by Nonsense) was compared to the Apollo moon program, which "gave us CAT scans" (the President said), and to the Human Genome Project, which altered economic reality as we know it by returning $140 for every dollar invested.

The President hailed BRAIN as "a bold new research effort to revolutionize our understanding of the human mind and uncover new ways to treat, prevent, and cure brain disorders like Alzheimer’s, schizophrenia, autism, epilepsy, and traumatic brain injury."
 
And for all this great knowledge, we're going to pay just $100 million (first year).

Imagine: For just $100 million, we're going to find new ways to treat and cure schizophrenia, autism, epilepsy, traumatic brain injury, and (oh yeah, I almost forgot) Alzheimer's.

When I first heard the President mention the $100 million figure, I couldn't help thinking of Dr. Evil's demand for money in Austin Powers: International Man of Mystery:
Dr. Evil: Shit. Oh hell, let's just do what we always do. Hijack some nuclear weapons and hold the world hostage. Yeah? Good! Gentlemen, it has come to my attention that a breakaway Russian Republic called Kreplachistan will be transferring a nuclear warhead to the United Nations in a few days. Here's the plan. We get the warhead and we hold the world ransom for... ONE MILLION DOLLARS!
 

Number Two: Don't you think we should ask for *more* than a million dollars? A million dollars isn't exactly a lot of money these days. Virtucon alone makes over 9 billion dollars a year!
 
Dr. Evil: Really? That's a lot of money.

Did no one tell Mr. Obama, prior to the BRAIN press conference, that the National Institute of Mental Health already spends $1.4 billion a year, of which $1.1 billion goes to research?

Does Obama not realize that the pharmaceutical industry spends $68 billion a year on research and development, much of it on brain research?

Did no one bother to tell the President that between government and the pharmaceutical industry, well over a trillion dollars has been spent on brain research in the last 20 years, and we're still not even close to understanding (much less curing) schizophrenia, autism, epilepsy, etc.?

Is our President not aware that the U.S. government spending another $100 million on brain research is like Bill Gates giving another fifty bucks to Ronald McDonald House?

Tomorrow, I want to talk, in depth, about an example of what real-world brain research (conducted with federal funding) costs. I'm going to talk about the STAR*D study, the most extensive (and expensive) single study ever conducted on the treatment of depression.

Thursday, April 04, 2013

Myth: Antidepressants Take Weeks to Work

One of the more popular myths about antidepressants is that they take weeks to work. You'll find this "fact" stated axiomatically (which is to say without supporting citations) in many scientific papers, popular articles (web and print), and package inserts. "Everyone knows" SSRIs take weeks to work. It has to be true, because everybody says it is. Right?

Wrong. In 2006, Archives of General Psychiatry (now JAMA Psychiatry) published a paper by Matthew J. Taylor et al. called "Early Onset of Selective Serotonin Reuptake Inhibitor Antidepressant Action: Systematic Review and Meta-analysis" (full copy here) that set out to answer this very question (the question of whether it takes weeks for antidepressants to work). Taylor and his colleagues looked at 20 reports involving 28 separate trials of antidepressants. The drugs studied in the 28 trials included fluoxetine (Prozac), paroxetine (Paxil), citalopram (Celexa), escitalopram (Lexapro), fluvoxamine (Luvox), and sertraline (Zoloft). The result:
This analysis supports the hypothesis that SSRIs begin to have observable beneficial effects in depression during the first week of treatment. The early treatment effect was seen on the primary outcome of differences in depressive symptom rating scale scores [ . . .]
All 28 trials assessed patients weekly, using either the Hamilton Depression Rating Scale (HDRS) or the Montgomery-Asberg Depression Rating Scale (MADRS). All found improvement in Week 1, and in fact the greatest weekly improvement typically occurred in Week 1, with each subsequent week adding less and less improvement.

A meta-analysis of 47 studies by Posternak and Zimmerman, published in the Journal of Clinical Psychiatry, 2005 Feb;66(2):148-58 found that by the end of Week 2, patients are experiencing 60% of whatever total effect they're ever going to see.


This set of treatment response curves (from a study investigating combined use of olanzapine and fluoxetine, which is to say Zyprexa and Prozac) is typical, in that the biggest single-week gain is measured in Week 1.
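
To get a feel for that diminishing-returns shape, here's a toy Python sketch (my own illustration; the curve is a made-up fit, not data from Taylor or Posternak) that models the cumulative effect as an exponential approach to its maximum, calibrated to the 60%-by-Week-2 figure:

import math

# Toy model: cumulative fraction of the eventual treatment effect at week t
# is f(t) = 1 - exp(-k*t). Calibrate k so that f(2) = 0.60, per the
# Posternak and Zimmerman estimate (illustrative only).
k = -math.log(1.0 - 0.60) / 2.0   # about 0.458 per week

previous = 0.0
for week in range(1, 7):
    cumulative = 1.0 - math.exp(-k * week)
    print("Week {}: cumulative {:.0%}, gain this week {:.0%}".format(
        week, cumulative, cumulative - previous))
    previous = cumulative
# Week 1 shows the largest single-week gain (about 37% of the total
# effect); every later week adds less, mirroring the shape of the
# published response curves.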
 
Interestingly, there's some evidence that you may be able to speed up the onset of effects even further, simply by taking aspirin along with your SSRI. Mendlewicz et al., writing in International Clinical Psychopharmacology, July 2006, 21(4):227-231, reported:
Participants were treated openly during 4 weeks with 160 mg/day ASA in addition to their current antidepressant treatment. The combination SSRI-ASA was associated with a response rate of 52.4%. Remission was achieved in 43% of the total sample and 82% of the responder sample. In the responder group, a significant improvement was observed within week 1 (mean Hamilton Depression Rating Scale-21 items at day 0=29.3±4.5, at day 7=14.0±4.1). [emphasis added]
ASA is acetylsalicylic acid -- plain aspirin.

For more on the counterfactual nature of the idea that antidepressants take weeks to act, see the paper by Alex J. Mitchell in The British Journal of Psychiatry (2006) 188: 105-106 (full copy here) and also the report by Parker et al. in Aust N Z J Psychiatry February 2000 vol. 34 no. 1, 65-70 (full copy here).

Tuesday, April 02, 2013

Does Antidepressant Dose Matter?

One of the great myths about antidepressants is that higher doses are more effective than lower doses. That's not how these drugs work, though. Most studies show a flat dose-response "curve" with SSRIs. More isn't better.

This is an important point to understand, because antidepressants quite often don't work for a given patient on the first try, and then it's necessary to do some "adjusting," which means either increasing the dose, switching to another drug, or adding a second med. Fredman et al. surveyed 432 attendees at a psychopharmacology course about what they would do for a patient who doesn't respond to an SSRI; most chose raising the dose as the best next-step option. But that's the wrong answer.

In "Dose-response relationship of recent antidepressants in the short-term treatment of depression," Dialogues Clin Neurosci. 2005 September; 7(3): 249–262, (full article here) Patricia Berney of Unité de Psychopharmacologie Clinique, Hôpitaux Universitaires de Genève, Chêne-Bourg, Switzerland, examines a large number of clinical trials to discover the extent to which dosage matters in antidepressant therapy. It's worth going through her discussion drug-by-drug, because if you're taking one of these drugs and your doctor suggests a higher dose, you may want to consider whether it's going to be worth the time and expense. Usually it's not.

Citalopram (Celexa)
"The short-term studies with citalopram did not show significant differences In terms of clinical efficacy across a dose range of 20 to 60 mg/day. Even a dose of 10 mg/day was effective compared with placebo. The results of the maintenance study by Montgomery et al. and the meta-analysis by the same authors support these findings. Therefore, for the majority of patients, there Is no advantage of increasing the dose of citalopram above 20 mg/day."

Escitalopram (Lexapro)
This drug is the S-stereoisomer of citalopram (Celexa), meaning that it's structurally no different from half of what's in citalopram. (Citalopram is a so-called racemic mixture of left- and right-handed versions of the same molecule.) According to Berney: "The only fixed-dose-response study with escitalopram indicates that 10 mg/day was equally as effective as 20 mg/day."

Fluoxetine (Prozac)
"The studies with fluoxetine did not show significant differences in terms of clinical efficacy across a dose range of 20 to 60 mg/day. Even a dose of 5 mg/day was effective compared with placebo. Therefore, for the majority of patients, there is no advantage of increasing the dose of fluoxetine above 20 mg/day. It might even be the case that the higher dose of 60 mg/day is less effective in major depressive disorder." The latter refers to a study by Wernicke: "Fluoxetine 20 and 40 mg/day were statistically superior to 60 mg/day." One study (Dunlop) involving 372 patients found: "Fluoxetine 20, 40, and 60 mg/day each produced an improvement that was no different from placebo on change in the HAMD total score in 355 ITT patients with LOCF at the end of 6 weeks." (ITT = Intent to Treat, LOCF = Last Observation Carried Forward.) The result with completer-case analysis, which is to say just considering the 214 patients who finished the study, was that changes in the HAMD total score "were similar to those of the ITT-LOCF analysis."

Fluvoxamine (Luvox)
"In this fixed-dose study on a large sample, only fluvoxamine 100 mg/day showed a significant therapeutic benefit over placebo at end-point analysis (on LOCF) on modified HAMD 13 items final score at the end of 6 weeks at fixed dose. Significant differences were not seen between fluvoxamine 25, 50, or 150 mg/day or placebo. On the HAMD 13-items responder analysis, the differences were significant for fluvoxamine 100 and 150 mg/day compared with placebo, but not between these two dosages on visual inspection of the figures in the publication on completer cases analysis."

Paroxetine (Paxil)
"In the publication by Dunner and Dunbar, there is a short description of a study involving 460 patients. The paroxetine 10 mg/day dose was no more effective than placebo, even on the HAMD depressed mood item. The authors reported also on a pooled analysis from a worldwide database, involving 1091 patients who remained on a fixed dose of paroxetine or placebo for at least 4 weeks, which showed no differences in terms of clinical efficacy across a dose range of 20 to 40 mg/day paroxetine. Therefore, for the majority of patients, there is no advantage in increasing the dose of paroxetine above 20 mg/day."

Note: It's also worth looking at a more recent study by Ruhé et al., "Evidence why paroxetine dose escalation is not effective in major depressive disorder: a randomized controlled trial with assessment of serotonin transporter occupancy," Neuropsychopharmacology 2009 Mar;34(4):999-1010, in which unipolar depressed patients on Paxil or placebo (double-blind) were scanned for SERT occupancy by single-photon emission-computed tomography (SPECT). The overall finding: "Paroxetine dose escalation in depressed patients has no clinical benefit over placebo dose escalation."

Sertraline (Zoloft)
"In the study by Fabre and Putman, sertraline 50 mg/day, but not 100 and 200 mg/day, was more effective than placebo at end-point analysis on change on the HAMD 17 items total score on ITT-LOCF at 6 weeks. There was no statistical analysis performed between the different doses, but inspection of the data in the publication suggests no differences."

Milnacipran (Savella)
This SNRI is approved for treating depression outside the U.S., but inside the U.S. it is approved only for fibromyalgia. Four studies were analyzed. They showed "flat dose-response relationship between 100 and 300 mg/day," with 50 mg/day being less effective than placebo.

Venlafaxine (Effexor)
This SNRI (inhibiting reuptake of both serotonin and norepinephrine) is a Top Ten antidepressant in the U.S. Patricia Berney's finding from reading the clinical trials:
In the venlafaxine studies, doses varied between 25 and 375 mg/day. A positive dose-response curve was only demonstrated with trend analysis. However, the difference between the higher dose range and placebo was not pronounced. Better efficacy could be obtained with a dose of venlafaxine above 75 mg/day in terms of remission rate. In a review concerning all aspects of antidepressant use, Preskorn mentioned an ascending then descending dose-response curve for venlafaxine in an evaluation comparing 7 dose levels between 25 and 375 mg/day with placebo, coming from fixed and flexible-dose studies. However, the major difference in terms of mean HAMD score change, ie, 2 points, was between a group of patients receiving 175 mg/day and another receiving 182 mg/day, hardly a different dose! This suggests a calculation artifact rather than a pharmacological dose-response curve.

For the majority of patients, a dose of venlafaxine 75 mg/day should be adequate.

Reboxetine (Edronax, Norebox, Prolift, Solvex, Davedax, Vestra)
A norepinephrine reuptake inhibitor, reboxetine is approved for depression outside the U.S. Said Berney: "Despite availability of several short clinical trials, we cannot comment on the dose-response relationship for reboxetine."

Duloxetine (Cymbalta)
"No positive dose-response relationship has been found for 40 to 120 mg/day."


The common thread here is obvious. Antidepressants, when they work at all, tend to work about as well whether you take a large dose or a small dose. The dose-response curve for most of these drugs is a straight horizontal line. That's the main takeaway not only from Berney's meta-analysis but also from a separate meta-analysis by Baker et al., "Evidence that the SSRI dose response in treating major depression should be reassessed," Depression and Anxiety (2003), 17(1):1-9. It's also the conclusion reached in yet another meta-analysis, by Hansen et al., Med Decision Making, January/February 2009, 29(1):91-103. (Said the Hansen group: "Dose was not a statistically significant predictor of categorical HAM-D response. Among comparative trials with nonequivalent doses, trends favored higher dose categories but generally were not statistically significant.")

The lack of a dose-response relationship above a certain minimum effective dose doesn't necessarily mean the drugs are doing nothing. It means that once you've achieved an in vivo concentration of the drug sufficient to saturate whatever transporter protein(s) or receptors the drug in question targets, there's nothing left for the drug to attach to. Thus any excess goes unused.
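
To make the saturation argument concrete, here's a small Python sketch (a generic illustration; the binding constant is an invented placeholder, not a real SSRI value) using the standard one-site binding relationship, occupancy = C / (C + Kd):

# One-site binding model: fraction of target sites occupied at a given
# drug concentration. KD is an arbitrary placeholder, not a real value.
KD = 1.0   # concentration at which half the sites are occupied

def occupancy(concentration, kd=KD):
    return concentration / (concentration + kd)

for dose in [5, 10, 20, 40]:   # pretend relative dose levels
    print("relative dose {:>2}: {:.1%} of sites occupied".format(
        dose, occupancy(dose)))
# relative dose  5: 83.3%
# relative dose 10: 90.9%
# relative dose 20: 95.2%
# relative dose 40: 97.6%
# Doubling the dose past near-saturation buys only a couple of percentage
# points of extra occupancy -- one plausible reason the clinical
# dose-response curve looks flat.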

It's important to note, though, that these drugs do tend to show a strong dose-response relationship when it comes to side effects. (One 2010 meta-analysis found that across 9 different studies, "Higher doses of SSRIs were associated with significantly higher proportion of dropouts due to side-effects.")

Bottom line: Upping your dose, on any of the major antidepressants listed above, isn't a good strategy. It may well worsen side effects, and it's unlikely to improve your therapeutic outcome. If a particular drug isn't working for you, a far better strategy than upping the dose is to try a different drug and/or add talk therapy to the mix (if you're not already doing it).

Likewise: If you are seeing good therapeutic effect from a drug but side effects are causing problems, you should ask your doctor or nurse practitioner about dialing back the dosage to the minimum therapeutic dose, so as to keep side effects from swamping the therapeutic effect.