Is There Something Wrong With the Scientific Method?

A recurring theme on this blog is the need for empowered, engaged patients to understand what they read about science. It’s true when researching treatments for one’s condition, it’s true when considering government policy proposals, it’s true when reading advice based on statistics. If you take any journal article at face value, you may be severely misled; you need to think critically.

Sometimes there’s corruption (e.g. the fraudulent vaccine/autism data reported this month, or “Dr. Reuben regrets this happened”), sometimes articles are retracted due to errors (see the new Retraction Watch blog), sometimes scientists simply can’t reproduce a result that looked good in the early trials.

But an article a month ago in the New Yorker sent a chill down my spine tonight. (I wish I could remember which Twitter friend cited it.) It’ll chill you, too, if you believe the scientific method leads to certainty. This sums it up:

Many results that are rigorously proved and accepted start shrinking in later studies.

This is disturbing. The whole idea of science is that once you’ve established a truth, it stays put: you don’t combine hydrogen and oxygen in a particular way and sometimes you get water, and other times chocolate cake.

Reliable findings are how we’re able to shoot a rocket and have it land on the moon, or step on the gas and make a car move (predictably), or flick a switch and turn on the lights. Things that were true yesterday don’t just become untrue. Right??

Bad news: sometimes the most rigorous published findings erode over time. That’s what the New Yorker article is about.

I won’t try to summarize everything in the article here; if you want to understand research and certainty, read it. (It’s longish, but great writing.) I’ll just paste in some quotes. All emphasis is added, and my comments are in [brackets].

  • All sorts of well-established, multiply confirmed findings have started to look increasingly uncertain. It’s as if our facts were losing their truth: claims that have been enshrined in textbooks are suddenly unprovable. … In the field of medicine, the phenomenon seems extremely widespread, affecting not only antipsychotics but also therapies ranging from cardiac stents to Vitamin E and antidepressants: Davis has a forthcoming analysis demonstrating that the efficacy of antidepressants has gone down as much as threefold in recent decades.
  • “This is a very sensitive issue for scientists,” [Schooler] says. “You know, we’re supposed to be dealing with hard facts, the stuff that’s supposed to stand the test of time. But when you see these trends you become a little more skeptical of things.”
  • [One factor is] publication bias, or the tendency of scientists and scientific journals to prefer positive data over null results, which is what happens when no effect is found. The bias was first identified by the statistician Theodore Sterling, in 1959, after he noticed that ninety-seven per cent of all published psychological studies with statistically significant data found the effect they were looking for.
    • [The point here is that naturally, all you see published are successful studies. Lots of useful information can come from failed studies, but they never get published.]
    • [The problem is that anything can happen once, at random. That’s why it’s important that a result be replicable (repeatable by another scientist): like that light switch, if someone else tries it, you better get the same result. But the article points out that most published results are never tested by another researcher.]
  • In recent years, publication bias has mostly been seen as a problem for clinical trials, since pharmaceutical companies are less interested in publishing results that aren’t favorable. But it’s becoming increasingly clear that publication bias also produces major distortions in fields without large corporate incentives, such as psychology and ecology.
  • [But publication bias] remains an incomplete explanation. For one thing, it fails to account for the initial prevalence of positive results among studies that never even get submitted to journals. [By this point, this article was driving me nuts.]
  • [Re another cause of this problem,] In a recent review article, Palmer summarized the impact of selective reporting on his field: “We cannot escape the troubling conclusion that some—perhaps many—cherished generalities are at best exaggerated in their biological significance and at worst a collective illusion nurtured by strong a-priori beliefs often repeated.”
  • [We had two posts here and here in October about an Atlantic article by Dr. John Ioannidis, who is quoted in this article:] “…even after a claim has been systematically disproven”—he cites, for instance, the early work on hormone replacement therapy, or claims involving various vitamins—“you still see some stubborn researchers citing the first few studies that show a strong effect. They really want to believe that it’s true.”
  • The current “obsession” with replicability distracts from the real problem, which is faulty design [of studies].
    • In a forthcoming paper, Schooler recommends the establishment of an open-source database, in which researchers are required to outline their planned investigations [before they do them] and document all their results. [Including those that fail!]
    • [Note: Pew Research publishes all its raw data, for other researchers to scrutinize or use in other ways.]
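
The publication-bias mechanism in those quotes is easy to see with a toy simulation (a minimal sketch; the effect size, sample size, and significance threshold below are illustrative assumptions, not figures from the article). If journals only print statistically significant results, the effect sizes that get published systematically overstate the true effect, which is exactly the kind of inflation that later, larger studies then appear to “shrink.”

```python
import random
import statistics
from math import sqrt

random.seed(42)

TRUE_EFFECT = 0.2   # a small real effect, in standard-deviation units (assumed)
N = 25              # subjects per study (assumed)
STUDIES = 5000      # how many labs run the same experiment

nd = statistics.NormalDist()
published = []

for _ in range(STUDIES):
    # one study: N measurements of an outcome whose true mean is TRUE_EFFECT
    sample = [random.gauss(TRUE_EFFECT, 1.0) for _ in range(N)]
    mean = statistics.fmean(sample)
    # two-sided z-test against zero (sigma known = 1 for simplicity)
    z = mean / (1.0 / sqrt(N))
    p = 2 * (1 - nd.cdf(abs(z)))
    if p < 0.05:                 # journals prefer "positive" (significant) results
        published.append(mean)

print(f"true effect:                   {TRUE_EFFECT}")
print(f"mean published effect:         {statistics.fmean(published):.2f}")
print(f"fraction of studies published: {len(published) / STUDIES:.0%}")
```

Because only the lucky, oversized estimates clear the significance bar, the published literature reports an average effect well above the true one; replications then regress toward the smaller real value, and the finding seems to “erode.”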

The corker that caps it off is John Crabbe, an Oregon neuroscientist, who designed an exquisite experiment: genetically identical mice sent to three different labs with incredibly uniform conditions. Read the article for details. When these mice were injected with cocaine, the reactions of the three groups were radically different. Same biology, same circumstances, yet a seven-times-greater effect in one of the groups.

What?? (There’s more; read it.)

If you’re a researcher and this has happened, and it’s time to “publish or perish,” what do you do? What is reality?

The article winds down:

The disturbing implication of the Crabbe study is that a lot of extraordinary scientific data are nothing but noise. The hyperactivity of those coked-up Edmonton mice wasn’t an interesting new fact—it was a meaningless outlier, a by-product of invisible variables we don’t understand.

Implications for e-patients

Wiser people than I will have more to say, but here are my initial takeaways.

  • Don’t presume anything you read is absolute certainty. There may be hope where people say there’s none; there may not be hope where people say there is. Question, question, question.
  • Be a responsible, informed partner in medical decision making.
    • Don’t expect your physician to have perfect knowledge. How can s/he, when the “gold standard” research available may be flaky?
    • Do expect your physician to know about this issue, and to have an open mind. The two of you may be in a glorious exploration of uncertainty. Make the most of it.
  • Expect your health journalists to know about this, too. Health News Review wrote about this article last week, and I imagine Retraction Watch will too. How about the science writers for your favorite news outlet? I can’t imagine reporting on any finding without understanding this. Write to them!

Mind you, all is not lost. Reliability goes way up when you can reproduce a result. (Rockets, light bulbs, chocolate cake.) But from this moment forward, I’m considering every new scientific finding to be nothing more than a first draft, awaiting replication by someone else.

Dave deBronkart, better known as “e-Patient Dave,” is one of the leading spokespeople for the e-patient movement. A high-tech executive and online community leader for many years, he was diagnosed in 2007 with Stage IV kidney cancer, with a median survival of just 24 weeks. He used the internet in every way possible to partner with his care team and beat this unbeatable disease. Today he is well. In 2008 he discovered the e-patient movement and began studying, blogging, and speaking at conferences, and in 2009 he was elected founding co-chair of the new Society for Participatory Medicine. In 2010 he released his first book: “Laugh, Sing, and Eat Like a Pig: How an empowered patient beat Stage IV cancer (and what healthcare can learn from it).” He blogs frequently at e-patients.net.

17 replies

  1. There is NO SUCH THING as a common, universal, singular “method” that all scientists must use or follow. No such thing exists. Science textbooks promote GARBAGE. Little, if any, science is done that way. We all learned it completely wrong in 6th grade…..

  2. http://www.theatlantic.com/magazine/archive/2010/11/lies-damned-lies-and-medical-science/8269/
    Fantastic companion article in the current Atlantic on the same topic. Highlights work of John Ioannidis, who is making a career out of challenging the durability of medical study results. He basically concludes that there is nothing wrong with the scientific method per se, but society harbors unrealistic expectations about the magnitude and pace of its results. Somewhere in the piece he responds, when asked what the average person should do upon reading about some new study: “Ignore it.”

  3. Tina, I have no idea where you got your attitude, but you’re barking up the wrong straw man.
    If you’ll read my posts about my testimony to the meaningful use hearings, particularly those that cite Ross Koppel’s concerns about safety (especially my final testimony), you’ll see that I specifically talk about the value of patients serving as a second set of eyes for errors in the record – as Regina did.
    I know you have more things to do than subscribe to everything I do, but please also see slides 52-55 of my talk at AHRQ’s 2010 meeting of grantees and contractors. I was irritated at the reported attitude of certain EHR vendors, and I titled my talk as an in-your-face retort: “Over My Dead Body: Why reliable systems matter to patients.” I imagine you’ll agree with those slides.
    Again, I invite outreach for person-to-person contact. Spitting contests in blog comments take too much time and energy – I won’t pursue this further here, especially since it’s not what this post was about. Please just know that I don’t want crappy tools shoved down anyone’s throat; I want GOOD MODERN PROCESSES that GET THE JOB DONE.
    End of topic, pour moi ici…

  4. This is Regina Holliday, data access evangelist, artist and widow. I thought I should chirp in here as our data access story got wrapped around Dave’s article reviewing measurable outcomes and the scientific method in peer reviewed journals. It does seem quite unrelated on first glance, but it really is not.
    If you listened to many of my speeches including two in front of Secretary Sebelius, you might have noticed my conclusion:
    “I eventually got a copy of Fred’s record and – despite its many errors– it became a virtual bible that we used to guide Fred’s care for the last 56 days of his life. There was not a day I didn’t reference it, and that information extended Fred’s life and helped create a fragile peace within our hearts; for there is no greater sorrow than watching your loved one suffer while you feel helpless because you don’t have the information to know what’s going on, what he needs or how to help.
    That is why I am working so hard for clarity and transparency of electronic medical records. I may not be an expert in medicine, but I am an expert on my husband. With access to his medical record, I could explain his treatment options and help ease his mind.”
    When I mention error in the record, most of those errors are actually “evidence” of errors of communication within the medical team. Oh yes, there were the occasional typos, but the record was honest and clear and told us exactly what was occurring in relation to my husband’s medical care.
    So here we can return to scientific and evidence based medicine, for my husband was diagnosed far too late for recovery and our only hope existed within the world of palliative care and extension of life with quality. The Electronic Medical Record never lied to us, it never told us we would get a surgery or chemotherapy that would not arrive. Access to the record helped us get to a point of peace within our family’s tragedy. Just like the doctors mentioned in “Letting Go” by Atul Gawande, Fred’s medical team decided something must be done for this “unfortunate 39 year old male,” a father with young children. So, instead of being a part of a realistic conversation of the extent of Fred’s disease, we were left without information as to the extent and severity of his condition and left alone to wait with our hearts filled with empty promises.
    I often speak of the realities of Meaningful Use and data access using the analogy of the Protestant Reformation. The EMR can be the Gutenberg Bible to a patient. There may be “good priests” and “bad priests” just as there are “good doctors” and “bad doctors,” but as long as we can read the word ourselves we can be part of the decision-making process.
    Thank you, Regina

  5. David,
    Please provide the scientific evidence, if any, proving safety and efficacy of HIT systems as devices to control and distribute all care of hospitalized patients.

  6. Propensity cannot spell, but her points are valid.
    Having attended, I recall Regina’s comments in front of Secretary Sebelius at the roll out of the meaningful use rule in July 2010. Point of information, she stated that the doctors would not talk to her, that his care was run by an EHR, and that he died and that she could have saved him for a few months if she could see his EHR.
    Why wouldn’t the doctors talk with her? If EHRs are so good, why did she need to tell the doctors what to do? Are you familiar with the first-person report by a computer design architect who wrote of the data model that nearly killed him (in California)?

  7. “As for patients, it’s ludicrous to expect them to be intelligent consumers of healthcare. Most of them are just not smart enough to understand the data. A huge percentage of patients lack the reading skills to follow the instructions printed on their damn medicine bottles, they simply are not going to be able to say “hey doc, I thought the n in that BMJ study on beta blockers was too small, what’s your take?” At their best, most people cannot understand this crap–and sick people are, by definition, not at their best.”
    +++Great points especially after the posts on consumer directed HDHPs. If doctors can’t play doctor can consumers?

  8. The scientific method is not supposed to be immutable. At all. I think this mis-interpretation comes from the whole “science vs religion” debates, but science on its own changes quite frequently and profoundly as we learn more. For instance, the Newtonian physics that was used in ballistics was completely uprooted by quantum mechanics–though, generally, it’s still a good enough approximation of reality at our scale to put a man on the moon.
    What we are seeing in science as applied to medicine is a whole bunch of things. First of all, our working models, unlike Newtonian physics, are not a great approximation of reality (especially in psychology/neuro, this should be pretty obvious. After all, we spend 1/3 of our lives asleep and haven’t the foggiest notion why. This suggests there is something VERY big missing from our model of basic human functioning). Second, we have done a really lousy job with our experiments. There is a lot of conflict of interest, a serious set of biases, and a real problem with information dissemination and control of important variables. Third, there is a huge gap between results in the lab and in real life.
    All this strongly suggests we need to re-think how we do medical research. It suggests we should use new technology to allow us to do more in vivo experimentation with treatment protocols, and less “gold standard” nonsense. The gold standard appears to be fool’s gold, and we would learn a lot more about how drugs actually work by figuring out efficient ways to analyze real-world data collected in the course of treatment than setting up this weird false double-blind test. Second, research dollars should be directed more towards investigating fundamental clinical questions and doing less research on new drugs, etc. Really, it’s ridiculous we have no clue why people need to sleep. It’s crazy that we have no clue why the placebo effect is so powerful. There are a lot of really obvious research questions not being asked, because we waste our money on these stupid randomized double blind studies.
    As for patients, it’s ludicrous to expect them to be intelligent consumers of healthcare. Most of them are just not smart enough to understand the data. A huge percentage of patients lack the reading skills to follow the instructions printed on their damn medicine bottles, they simply are not going to be able to say “hey doc, I thought the n in that BMJ study on beta blockers was too small, what’s your take?” At their best, most people cannot understand this crap–and sick people are, by definition, not at their best.
    Which is why it’s appalling we don’t teach doctors how to interpret studies. We waste their time in school on utter nonsense like memorizing all the muscles of the eyeball, but don’t teach them advanced statistics, or analysis of data sets, which would at least give them the tools to sense if something is bunk or if it’s real. It’s criminal!

  9. Hi Propensity – we haven’t met, I think.
    Some sloppy thinking here, and I’d welcome a chance to sort it out, since you seem passionate and outspoken here. What kind of work do you do? Feel free to drop me a note to discuss – my contact info is on epatientdave.com. (Really – I’m happy to have a good discussion, and I see no reason for reasonable people to be anonymous.)
    I presume when you say “NATTA” you mean nada, Spanish for “nothing.” Not sure what natta is.
    And I’m glad you feel I’m to be walked-on-air (exulted), though I think you meant exalted.
    You mention Regina’s husband dying from HIT associated neglect. I think your head’s up your butt on that one – do your homework – he died mainly because he wasn’t able to get treatment due to lack of insurance until it was almost too late. I suggest you read her advocacy timeline, starting with her first-ever post, eight weeks before he died. He was too far gone to be saved long before she tried to get his medical record.

  10. Something wrong with the scientific method? NO, it is the determination of the most probable.

  11. +++”But from this moment forward, I’m considering every new scientific finding to be nothing more than a first draft, awaiting replication by someone else.”
    Funny you should say that when you are promoting HIT instruments of care, especially CPOE gear, that have no proof of safety and efficacy (ZERO, ZILCH, NATTA). Your electronic libraries are OK but often promote errors.
    I guess medical care managed by CPOE and HIT is so bad now that the sick patient needs to boot up the computer while lying in a hospital bed filled with excrement and see if he/she is dying while the nurses are clicking on care boxes on their HIT forms, and doctors are trying to wade through the pop-up CDS screens impeding an order for an aspirin on the CPOE machine.
    Dave, you are to be exulted for saving patients’ lives from the ills of HIT devices. Look at what happened to the e-patient artist’s spouse who died from HIT associated neglect. She could have saved his life (if she only had access) from the neglect associated with the HIT systems you are promoting! Wonderful.

  12. The scientific method is a mechanism to ‘optimise’ the search for ‘truth’. Indeed, the primary focus of a particular study is to support or refute a hypothesis. The definition for hypothesis states: “a proposed explanation for an observable phenomenon.”
    Note the word proposed. Over time, the ‘erosion’ of a study’s finding may simply reflect the advancement of knowledge, or the results of additional studies attending to the same hypothesis. Appropriate research is indeed a continued quest to identify new or expanded information that by definition may dispel previous information. Those studies that do not erode under such scrutiny are likely stronger in their observations and perhaps closer to the ‘truth’.
    The scientific methodology is the cookbook to evaluate observations in as ‘unbiased’ of a way as possible. It is up to the critical evaluation of peers, the public and interested parties to determine the validity of the approach to answering the scientific question.

  13. There is nothing wrong with the scientific method except our blind trust in it. The scientific method can put blinders on our vision so the questions asked are too narrow. Hence the excellent movement to engage the informed patient.
    Dr. Andrew Wakefield’s report of a connection between vaccination and autism was BAD science. Now we know it was fraudulent! (www.hubslist.org) Why did the peer review process fail so miserably and for so long? Why does it take legal action and an investigative journalist to uncover a corruption of the scientific method when fellow scientists should have skewered this a long time ago?

  14. Dave, excellent post and excellent points. The sad truth is that many physicians are just as ill-equipped to interpret these studies as the patients (well, some patients). And, your last statement:
    I’m considering every new scientific finding to be nothing more than a first draft, awaiting replication by someone else.
    has also always been true – only docs don’t always keep that in mind, either. Some may say it is avarice, but I call it the ‘bandwagon effect’.