Saturday, January 10, 2015

Where Are the Schizophrenia Genes?

Schizophrenia is another one of those conditions (like alcoholism) that we're constantly being told is highly heritable, based on the results of twin studies and various other findings (from studies that pre-date the DNA-sequencing era). Studies "proving" the heritability of schizophrenia are, in general, rarely questioned, or even reviewed critically, because a genetic basis for schizophrenia makes so much intuitive sense and agrees so well with ordinary experience. Everybody knows someone who "has schizophrenia in their family," and it very much does seem to travel in families.

The fact that something travels in families is not automatically indicative of a genetic connection, but most people aren't ready to believe such a thing since it goes against common sense. Nevertheless, in the past few decades, a great deal of work (much of it quite compelling) has gone into the study of things like intergenerational trauma, with researchers finding, for example, that the course of PTSD in modern-day Israeli military veterans is quite different for soldiers whose parents or grandparents were Holocaust survivors. (I talk about intergenerational trauma in Of Two Minds. In fact, there's a whole chapter on trauma.)

Social workers and criminologists have known for years that childhood physical and sexual abuse tend to be intergenerational. Yet no one seriously suggests there's a "child abuse gene" or a "child molestation gene." A lot of things that aren't purely genetic are trans-generational; familial. Schizophrenia itself may be in this category. The links between childhood trauma and schizophrenia are quite strong and well known. Indeed, the literature shows that the associations between childhood trauma and schizophrenia are actually as strong as or stronger than the associations between childhood trauma and other disorders that we know have a high correspondence to childhood trauma (including depression, PTSD, anxiety disorders, dissociative disorders, eating disorders, personality disorders, substance abuse, and sexual dysfunction, among others).

Still, the search for "schizophrenia genes" goes on, and every so often we hear claims for this or that group of scientists having discovered some new genetic feature that confers high risk for schizophrenia. The latest of these reports comes from the Schizophrenia Working Group of the Psychiatric Genomics Consortium, which published a letter in Nature in July 2014 called "Biological insights from 108 schizophrenia-associated genetic loci." The report trumpets the finding of "108 conservatively defined loci that meet genome-wide significance, 83 of which have not been previously reported." This is a genome-wide association study (GWAS), subject to the usual GWAS caveats.

Of the 108 loci, 75% were associated with protein-coding genes, a very high number for a GWAS, and quite a few of those genes have physiologically relevant functions.

On the surface, it sounds promising.

Where things start to fall down is in determining how much heritability the 108 loci can account for. The authors said: "Assuming a liability-threshold model, a lifetime risk of 1%, independent SNP effects, and adjusting for case-control ascertainment, RPS" (risk profile scores) "now explain about 7% of variation on the liability scale to schizophrenia across the samples, about half of which (3.4%) is explained by genome-wide significant loci." To parse that: The authors take the lifetime risk of schizophrenia to be 1%. If we build a risk score from all of the markers that show even a modest association with the disease (not just the ones that cleared the genome-wide significance bar), we can explain about 7% of the variation in liability to schizophrenia. But if we limit ourselves to the 108 loci that met criteria for genome-wide significance (a stringent test intended to reduce the risk of false positives), we can explain only a little more than 3%.
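To get a feel for what "7% of variation on the liability scale" buys you, here is a rough sketch of the liability-threshold model in Python. The 1% lifetime risk and the 7% and 3.4% figures come from the quote above. Everything else (the function name, the idea of a standardized risk score, the assumption that liability is normally distributed with unit variance) is my own simplification for illustration, not the consortium's actual method.

```python
from statistics import NormalDist

# Assumption for illustration: total liability follows a standard normal distribution.
STD_NORMAL = NormalDist()


def risk_given_score(score_z, r2, lifetime_risk=0.01):
    """Chance of crossing the liability threshold for a person whose
    standardized risk score is score_z, when the score explains a
    fraction r2 of the variance in liability."""
    # Threshold chosen so that lifetime_risk of the population lies above it.
    threshold = STD_NORMAL.inv_cdf(1.0 - lifetime_risk)
    # Part of the liability predicted by the score, and the spread of the rest.
    explained = (r2 ** 0.5) * score_z
    residual_sd = (1.0 - r2) ** 0.5
    return 1.0 - STD_NORMAL.cdf((threshold - explained) / residual_sd)


# Compare the full risk score (7%) with the genome-wide significant loci alone (3.4%).
for r2 in (0.07, 0.034):
    for z in (0.0, 2.0):
        print(f"r2={r2:.3f}  score={z:+.1f} SD  risk={risk_given_score(z, r2):.4f}")
```

Under these assumptions, someone whose score sits two standard deviations above the population average still comes out with a lifetime risk of only a few percent, which is another way of seeing how little of the picture the markers capture.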

So once again, it's a mixed bag: a genome-wide association study (GWAS) that's filled with interesting findings (some of which may prove to be important), but that fails, quite miserably, to explain heritability.

Many people have commented on the possible reasons why genome-wide association studies have been hit-or-miss in their ability to explain heritability, why they so often find SNPs (single nucleotide polymorphisms) that don't map to genes, why the results often don't replicate well, etc. I think if we're honest, we have to admit that GWAS isn't always the right tool for the job. It's a great tool, but you have to remember how it works. GWAS takes fairly common (i.e., occurring in 1% or more of the population) single nucleotide polymorphisms (which you can think of as point mutations), drawn from databases that have been specially built to catalog the most common polymorphisms, and asks whether any of them turn up more often in cases than in controls. In a GWAS, you aren't doing deep sequencing of people's genes; you're looking for markers that cover a rather tiny percentage of the genome. These markers occur commonly, which almost by definition prevents them from being very serious defects, because serious genetic defects are rendered rare by evolution (natural selection removes them from the gene pool over time). Therefore, in a technique that, by design, looks mainly at commonly occurring SNPs, which are overwhelmingly neutral (from an evolutionary standpoint), we should not be surprised to find that the markers that get flagged in GWAS (as genome-wide significant) are essentially neutral mutations without much phenotypic effect.
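For readers who want to see the basic move, here is a bare-bones sketch of the kind of test that gets run on each marker: count how often a variant shows up in cases versus controls, and ask whether the difference is bigger than chance. The allele counts are invented, the function name is mine, and the 5e-8 cutoff is the conventional genome-wide significance threshold rather than a number taken from this post. Real studies typically use regression models with corrections for ancestry and other covariates, so treat this as a cartoon of the idea, not the consortium's pipeline.

```python
import math

# Conventional genome-wide significance cutoff (an assumption; not stated in the post).
GENOME_WIDE_ALPHA = 5e-8


def allelic_test(case_alt, case_ref, ctrl_alt, ctrl_ref):
    """Pearson chi-square test (1 degree of freedom) on a 2x2 table of
    allele counts. Returns (odds_ratio, chi2, p_value)."""
    table = [[case_alt, case_ref], [ctrl_alt, ctrl_ref]]
    total = case_alt + case_ref + ctrl_alt + ctrl_ref
    row_sums = [case_alt + case_ref, ctrl_alt + ctrl_ref]
    col_sums = [case_alt + ctrl_alt, case_ref + ctrl_ref]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / total
            chi2 += (table[i][j] - expected) ** 2 / expected
    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    odds_ratio = (case_alt * ctrl_ref) / (case_ref * ctrl_alt)
    return odds_ratio, chi2, p_value


# Hypothetical marker: 33% frequency in cases, 30% in controls,
# with 1,000 cases and 1,000 controls (2,000 alleles in each group).
odds_ratio, chi2, p_value = allelic_test(660, 1340, 600, 1400)
print(f"odds ratio={odds_ratio:.2f}  chi2={chi2:.2f}  p={p_value:.3g}")
print("genome-wide significant:", p_value < GENOME_WIDE_ALPHA)
```

With these made-up counts, a common variant with a small effect falls far short of the genome-wide cutoff in a study of 2,000 people. Small effects of this kind only become detectable when the sample grows much larger.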

Unfortunately, we're at an awkward point in the history of science, because technology has given us some powerful DNA analysis tools (and the computers needed to crunch the data), but we're not quite yet at the point where detailed, deep genetic sequencing of whole human genomes is economically feasible for a study that includes, say, a thousand cases and a thousand controls. (Right now, it costs about $10,000 per genome to do the necessary sequencing.) When a human genome can be reliably sequenced for under $2,000, we'll have the tools to do really serious, deep genetic studies of hard-to-pin-down diseases. Whether we'll have the computer power needed to do the necessary statistical analysis on the scale of large (N > 1,000) case-control studies is another matter; remember that in GWAS we're dealing with perhaps 500,000 data points per genome whereas in deep sequencing we're generating billions of data points per genome. It'll probably be doable using Amazon's AWS/EC2 infrastructure (or Google's compute cloud). Another reason to invest in Amazon, probably.
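The back-of-the-envelope arithmetic is easy to check. The sketch below uses the figures from the paragraph above (2,000 subjects, 500,000 data points per genome for GWAS, $10,000 and $2,000 per genome). The one number I've added is 3 billion data points per sequenced genome, which assumes roughly one data point per base pair; the variable and function names are just for illustration.

```python
SUBJECTS = 2_000  # a thousand cases plus a thousand controls

GWAS_POINTS_PER_GENOME = 500_000
# Assumption: about 3 billion base pairs per genome, one data point each.
WGS_POINTS_PER_GENOME = 3_000_000_000


def study_size(points_per_genome, subjects=SUBJECTS):
    """Total raw data points across everyone in the study."""
    return points_per_genome * subjects


def sequencing_cost(cost_per_genome, subjects=SUBJECTS):
    """Total sequencing bill in dollars for the whole study."""
    return cost_per_genome * subjects


print(f"GWAS data points:        {study_size(GWAS_POINTS_PER_GENOME):,}")
print(f"Sequencing data points:  {study_size(WGS_POINTS_PER_GENOME):,}")
print(f"Cost at $10,000/genome:  ${sequencing_cost(10_000):,}")
print(f"Cost at $2,000/genome:   ${sequencing_cost(2_000):,}")
```

On these numbers the sequencing study involves 6,000 times as many raw data points as the GWAS, and the sequencing bill drops from $20 million to $4 million once the per-genome price falls to $2,000.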

I'm in the process of writing a 100,000-word book on mental illness. It's evidence-based; part science, part memoir, with over 200 footnotes (so you can refer to the scientific literature yourself). To follow the progress of the book, check back here often, and also consider adding your name to our mailing list. Thanks!