Friday, March 22, 2013

The "Good Data" Problem in Science

In science today, there's a huge problem: Way too many people are publishing way too many results that are way too positive.

This problem isn't imaginary or speculative. It's rampant and well documented. Literature surveys have repeatedly uncovered a statistically unlikely preponderance of "statistically significant" positive findings in medicine [see Reference 1 below], biology [2], ecology [3], psychology [4], economics [5], and sociology [6].

Positive findings (findings that support a hypothesis) are more likely to be submitted for publication, and more likely to be accepted for publication [7][8]. They're also published more quickly than negative ones [9]. In addition, only 63% of results presented initially in abstracts are eventually (within 9 years) published, and those that are published tend to be positive [10]. There's some evidence, too, that higher-prestige journals (or at least ones that get cited more often) publish research reporting greater "effect sizes" [11].

Why is this a problem? When only positive results are reported, knowledge is being hidden. A false picture of reality is presented (e.g., a false picture of the effectiveness of particular drugs or treatment options). Lack of informed decision-making in health care is a serious issue, since lives are at stake. But even when lives are not at stake, theories go awry when a hypothesis is verified in one experiment yet refuted in another experiment that goes unpublished because "nothing was found."

Another reason under-reporting of negative findings is a pressing problem is that meta-analyses and systematic literature reviews, which are an increasingly common and important tool for understanding the Big Problems in medicine, psychology, sociology, etc., suffer from a classic GIGO conundrum: The meta-analyses are only as good as the available input. If the input is fundamentally flawed in some way, odds are good that the meta-analysis will also be flawed. 
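To make the GIGO point concrete, here is a minimal simulation sketch. Everything in it is an illustrative assumption rather than data from any of the studies cited here: a small true effect of 0.1, 30 subjects per group, 2,000 simulated studies, and a crude "journal" that accepts only positive results with z > 1.96. The pooled estimate is a plain unweighted average, which is a simplification of what a real meta-analysis does.

```python
import math
import random
import statistics


def simulate_study(rng, true_effect, n):
    """Simulate one two-group study; return (estimated effect, z statistic)."""
    treated = [rng.gauss(true_effect, 1.0) for _ in range(n)]
    control = [rng.gauss(0.0, 1.0) for _ in range(n)]
    diff = statistics.fmean(treated) - statistics.fmean(control)
    # Standard error of the difference between two independent means.
    se = math.sqrt(statistics.variance(treated) / n
                   + statistics.variance(control) / n)
    return diff, diff / se


def pooled_effects(true_effect=0.1, n=30, studies=2000, seed=0):
    """Compare the average effect across all studies with the average
    across only the studies a positive-results-only journal would accept.

    All parameter values are hypothetical, chosen for illustration.
    """
    rng = random.Random(seed)
    results = [simulate_study(rng, true_effect, n) for _ in range(studies)]
    all_effects = [d for d, _ in results]
    # Assumed publication filter: "significant" positive findings only.
    published = [d for d, z in results if z > 1.96]
    return (statistics.fmean(all_effects),
            statistics.fmean(published),
            len(published))


everything, published_only, k = pooled_effects()
print(f"true effect:                  0.100")
print(f"average over all studies:     {everything:.3f}")
print(f"average over {k} 'published':  {published_only:.3f}")
```

Under these assumptions, the average over all studies should land close to the true effect, while the average over the "published" subset comes out several times larger. A reviewer who sees only the published subset has no way to tell the difference from the numbers alone.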

In some cases, scientists (or drug companies) simply fail to submit their own negative findings for publication. But research journals are also to blame for not accepting papers that report negative results. Cornell University's Daryl Bem famously raised eyebrows in 2011 when he published a paper in The Journal of Personality and Social Psychology showing that precognition exists and can be measured in the laboratory. Of course, in reality, precognition (at least of the type tested by Bem) doesn't exist, and the various teams of researchers who tried to replicate Bem's results were unable to do so. When one of those teams submitted a paper to The Journal of Personality and Social Psychology, it was flatly rejected. "We don't publish replication studies," the team was told. The researchers in question then submitted their work to Science Brevia, where it also got turned down. Then they tried Psychological Science. Again, rejection. Eventually, a different team (Galak et al.) succeeded in getting its Bem-replication results (which were negative) published in The Journal of Personality and Social Psychology, but only after the journal came under huge criticism from the research community. (Find more on this fracas here and here and here.)

Aside from the dual problems of researchers not submitting negative findings for publication and journals routinely rejecting such submissions, there's a third aspect to the problem, which can be called outcome reporting bias: putting the best possible spin on things. This takes various forms: changing the chosen outcome measure after all the data are in (to make the data look better via a different criterion of success; one of many criticisms of the $35 million STAR*D study of depression treatments), "cherry-picking" trials or data points (which should probably be called pea-picking in honor of Gregor Mendel, who pioneered the technique), and the more insidious phenomenon of HARKing, or Hypothesizing After the Results are Known [12]. All of these often occur alongside selective citation of concordant studies [13].

No one solution will suffice to fix this mess. What we do know is that the journals, the professional associations, government agencies like the FDA, and scientists themselves have tried in various ways (and failed in various ways) to address the issue. And so, it's still an issue.

A deterioration of ethical standards in science is the last thing any of us needs right now. Personally, I don't want to see the government step in with laws and penalties. Instead, I'd rather see professional bodies work together on:

1. A set of governing ethical concepts (an explicit, detailed Statement of Ethical Principles) that everyone in science can (voluntarily) pledge to abide by. Journals, drug companies, and researchers need to be able to point to a set of explicit guidelines and be able to go on record as supporting those specific guidelines.

2. Public procedures, maintained by the journals, by which disputes involving the publishability of results, or the adequacy of published results, can be aired and resolved.

In the long run, I think Open Access online journals will completely redefine academic publishing, and somewhere along the way, such journals will evolve a system of transparency that will be the death of dishonest or half-honest research. The old-fashioned Elsevier model will simply fade away, and with it, most of its problems.

Over the short run, though, we have a legitimate emergency in the area of pharma research. Drug companies have compromised public safety with their well-documented, time-honored, ongoing practice of withholding unfavorable results. The ultimate answer is something like what's described at AllTrials.net. It's that, or class-action hell forever.


References



  • 1. Kyzas PA, Denaxa-Kyza D, Ioannidis JPA (2007) Almost all articles on cancer prognostic markers report statistically significant results. European Journal of Cancer 43: 2559–2579.
  • 2. Csada RD, James PC, Espie RHM (1996) The “file drawer problem” of non-significant results: Does it apply to biological research? Oikos 76: 591–593.
  • 3. Jennions MD, Moller AP (2002) Publication bias in ecology and evolution: An empirical assessment using the ‘trim and fill’ method. Biological Reviews 77: 211–222.
  • 4. Sterling TD, Rosenbaum WL, Weinkam JJ (1995) Publication decisions revisited - The effect of the outcome of statistical tests on the decision to publish and vice-versa. American Statistician 49: 108–112.
  • 5. Mookerjee R (2006) A meta-analysis of the export growth hypothesis. Economics Letters 91: 395–401.
  • 6. Gerber AS, Malhotra N (2008) Publication bias in empirical sociological research - Do arbitrary significance levels distort published results? Sociological Methods & Research 37: 3–30.
  • 7. Song F, Eastwood AJ, Gilbody S, Duley L, Sutton AJ (2000) Publication and related biases. Health Technology Assessment 4.
  • 8. Dwan K, Altman DG, Arnaiz JA, Bloom J, Chan A-W, et al. (2008) Systematic review of the empirical evidence of study publication bias and outcome reporting bias. PLoS ONE 3: e3081.
  • 9. Hopewell S, Clarke M, Stewart L, Tierney J (2007) Time to publication for results of clinical trials (Review). Cochrane Database of Systematic Reviews.
  • 10. Scherer RW, Langenberg P, von Elm E (2007) Full publication of results initially presented in abstracts. Cochrane Database of Systematic Reviews.
  • 11. Murtaugh PA (2002) Journal quality, effect size, and publication bias in meta-analysis. Ecology 83: 1162–1166.
  • 12. Kerr NL (1998) HARKing: Hypothesizing After the Results are Known. Personality and Social Psychology Review 2: 196–217.
  • 13. Etter JF, Stapleton J (2009) Citations to trials of nicotine replacement therapy were biased toward positive results and high-impact-factor journals. Journal of Clinical Epidemiology 62: 831–837.
69 comments:

    1. Anonymous, 4:32 PM

      I could have sworn I left a comment here.

      Anyhow, in response to the previously left comment in reply to my original comment:
      I respectfully disagree that publishing failed reactions would add any betterment to scientific progress. In fact, I would argue it would slow progress down. Too much time and effort would be spent writing up failed experiments.

      Also, I must emphasize a point I made in my previous comment:
      Publishing dishonest data and not publishing failed reactions are two different things. Not sure to which you are referring by:
      "deterioration of ethical standards in science is the last thing any of us needs right now"
      I argue it is not unethical to withhold (or not bother) publishing failed experiments. However, the former is indeed unethical.
