Will Getting More Granular Help Doctors Make Better Decisions?

I’ve been thinking a lot about “big data” and how it is going to affect the practice of medicine.  It’s not really my area of expertise, but here are a few thoughts on the tricky intersection of data mining and medicine.

First, some background: these days it’s rare to find companies that don’t use data-mining and predictive models to make business decisions. For example, financial firms regularly use analytic models to figure out if an applicant for credit will default; health insurance firms can predict downstream medical utilization based on historic healthcare visits; and the IRS can spot tax fraud by looking for fraudulent patterns in tax returns. The predictive analytic vendors are seeing an explosion of growth: Forbes recently noted that big data hardware/software and services will grow at a compound annual growth rate of 30% through 2018.

Big data isn’t rocket surgery. The key to each of these models is pattern recognition: correlating a particular variable with another and linking variables to a future result. More and better data typically leads to better predictions.

It seems that the unstated, implicit belief in the world of big data is that when you add more variables and get deeper into the weeds, interpretation improves and predictions become more accurate.

This is a belief system, or what some authors have termed the religion of big data. Gil Press wrote a wonderful critique on “The Government-Academia Complex and the Big Data Religion” in Forbes:

Bigger is better and data possesses unreasonable effectiveness. The more data you have, the more unexpected insights will rise from it, and the more previously unseen patterns will emerge. This is the religion of big data. As a believer, you see ethics and laws in a different light than the non-believers. You also believe that you are part of a new scientific movement which does away with annoying things such as making hypotheses and the assumptions behind traditional statistical techniques. No need to ask questions, just collect lots of data and let it speak.

I’m hesitant to say this (since doctors are always convinced that medicine is somehow an exceptional industry), but I’m not convinced that more and better computer models will necessarily lead to better diagnoses or an improved day-to-day practice of medicine.

That’s not to say that big data won’t revolutionize healthcare.  I’m not referring to things like the personalization of genomic medicine, where data analysis will be essential, or to computerized clinical aids such as Isabel, which has cracked all of the complex cases that Dr. Lisa Sanders published in her Diagnosis column in the New York Times, beating many physicians.

But, there are many things that data will never do well.  For certain things, physician heuristics may lead to better decisions than any predictive model.

Heuristics are shortcuts, based on experience and training, that allow doctors to solve problems quickly.  They are pattern maps that physicians are trained to recognize. But heuristics have a reputation for leading to imperfect answers: Wikipedia notes that heuristics lead to solutions that “(are) not guaranteed to be optimal, but good enough for a given set of goals…. (they) ease the cognitive load of making a decision.”  Humans use them because we simply can’t process information in sequential binary fashion the way computers do.

It would be a mistake to call heuristics a sad substitute for big data.  Some cognitive scientists have made the argument, and I think they’re right, that heuristics aren’t simply a shortcut for coming to good-enough answers. For the right kinds of problems, heuristically generated answers are often better than those generated by computers.

How can this be?

[Figure: cartoon from Randall Munroe’s What If?]

I often think of this cartoon from Randall Munroe’s superb recent book, What If? Serious Scientific Answers to Absurd Hypothetical Questions.  In trying to compare human and computer thinking, he rightly notes that each excels at different things.  In this cartoon, for example, humans can quickly determine what they think happened.  Most people can tell you that the kid knocked over the vase and the cat is checking it out, without going through millions of alternate scenarios.  Munroe notes that most computers would struggle to quickly come to the same conclusion.

So, from the perspective of an emergency doctor, here are the three leading problems with the applied use of complex analytics in the clinical setting:

  • 1. The garbage in, garbage out problem.  In short, humans regularly obfuscate their medical stories and misattribute causality. You need humans to guide the patient narrative and ignore red herrings.
  • 2. If we want to be able to diagnose, screen and manage an ER full of runny-nosed kids with fevers, we simply can’t afford the time it takes for computers to sequentially process millions of data points. The challenge is at once simple and nuanced: allowing 99% of uncomplicated colds to go home while catching the one case of meningitis. It’s not something that a computer does well: it’s a question of balancing sensitivity (finding all true cases of meningitis among a sea of colds) and specificity (excluding meningitis correctly), and doctors seem to do better than computers when hundreds of cases need to be seen a day.
  • 3. There is a problem with excess information, where too much data actually opacifies the answer you’re looking for. Statisticians call this “overfitting” the data. What they mean is that as you add more and more variables to an equation or regression model, the model starts fitting the random error around each data point as well, mistaking “noise” for signal. The more variables, the more noise gets built into the model.
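A small simulation can make the overfitting point concrete. This is only a sketch with made-up data: the "true" straight-line relationship, the noise level, the random seed, and every function name below are assumptions chosen for illustration, not anything drawn from a clinical dataset. It compares a two-parameter straight line with a polynomial flexible enough to pass through every training point.

```python
import random

def truth(x):
    """Assumed 'true' relationship for the simulation (hypothetical)."""
    return 2.0 * x + 1.0

def fit_line(xs, ys):
    """Ordinary least-squares straight line; returns a predictor function."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x

def fit_exact_polynomial(xs, ys):
    """Lagrange interpolation: a polynomial that hits every training point."""
    def predict(x):
        total = 0.0
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = yi
            for j, xj in enumerate(xs):
                if j != i:
                    term *= (x - xj) / (xi - xj)
            total += term
        return total
    return predict

def mse(model, xs, ys):
    """Mean squared error of a predictor on the given points."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Ten noisy training observations of the assumed straight line.
train_x = [i / 9 for i in range(10)]
train_y = [truth(x) + rng.gauss(0, 0.3) for x in train_x]

# Held-out points halfway between the training points, scored against the truth.
test_x = [(i + 0.5) / 9 for i in range(9)]
test_y = [truth(x) for x in test_x]

line_model = fit_line(train_x, train_y)
exact_model = fit_exact_polynomial(train_x, train_y)

print("training error, line:       ", mse(line_model, train_x, train_y))
print("training error, polynomial: ", mse(exact_model, train_x, train_y))
print("held-out error, line:       ", mse(line_model, test_x, test_y))
print("held-out error, polynomial: ", mse(exact_model, test_x, test_y))
```

The flexible polynomial reproduces the training data perfectly because it has also memorized the noise, which is why it tends to do worse than the plain line on points it has not seen.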

The paradox is that ignoring information often leads to simpler and ultimately better decisions.

Here’s a great example from the Journal of Family Practice that I found in a super review article from a group at the Center for Adaptive Behavior and Cognition, Max Planck Institute for Human Development in Germany.

In 1997, Drs. Green and Mehr, at the family practice service of a rural hospital in Michigan, tried to introduce a complex algorithm (the Heart Disease Predictive Instrument, HDPI) to residents deciding whether to admit a patient to the cardiac care unit or the regular hospital floor. While this expert system, which relied on residents entering probabilities and variables into a calculator, did lead to better allocation decisions than before the tool was introduced, the physicians found it cumbersome.

Drs. Green and Mehr went on to develop a simple tree based on only three yes-no questions: Did the patient have ST changes? Did he have chest pain? Were there five associated EKG changes?
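To show how little machinery such a tree needs, here is a minimal sketch in Python. The post gives only the three questions, so the order of the branches, the point at which each answer ends the decision, and the reduction of the third question to a single yes/no flag are all assumptions, as are the function and parameter names. It illustrates the shape of a three-question tree, not the published instrument.

```python
def admit_to_coronary_care(st_changes: bool,
                           chest_pain: bool,
                           other_findings: bool) -> bool:
    """Hypothetical three-question tree: True means coronary care unit,
    False means a regular hospital bed.

    Assumed branching (not spelled out in the post):
      1. ST changes present            -> coronary care, stop.
      2. Otherwise, no chest pain      -> regular bed, stop.
      3. Otherwise, the third question decides.
    """
    if st_changes:
        return True
    if not chest_pain:
        return False
    return other_findings


# Each question can end the decision, so most patients never reach question 3.
print(admit_to_coronary_care(st_changes=True, chest_pain=False, other_findings=False))
print(admit_to_coronary_care(st_changes=False, chest_pain=False, other_findings=True))
print(admit_to_coronary_care(st_changes=False, chest_pain=True, other_findings=True))
```

The whole decision fits in three `if` statements and ignores everything else in the chart, which is the sense in which the heuristic uses only a fraction of the available information.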

The simple heuristic led to far better medical decisions: more patients were appropriately assigned to the coronary unit even though the heuristic used a fraction of the available information. Here is a chart showing the outcomes from before the issue was examined, from when the HDPI was used, and when the simple heuristic was introduced. The heuristic performed better in sensitivity and false positive (1-specificity) ratios than the probability algorithm or blind decisions.

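[Figure: chart comparing sensitivity and false-positive rate before the issue was examined, with the HDPI, and with the simple heuristic]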
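For readers who want to see how the two measures in that comparison are calculated, here is a short sketch. The counts are invented for the example and are not the figures from the Green and Mehr study; the function names are likewise hypothetical.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Share of patients who truly had the condition and were flagged."""
    return true_pos / (true_pos + false_neg)

def false_positive_rate(false_pos: int, true_neg: int) -> float:
    """Share of patients without the condition who were flagged anyway.
    This equals 1 - specificity."""
    return false_pos / (false_pos + true_neg)

# Invented counts for 100 patients (NOT data from the study):
# 20 truly had the condition, 80 did not.
true_pos, false_neg = 18, 2    # sick patients sent to the coronary unit / missed
false_pos, true_neg = 24, 56   # healthy patients sent to the unit / kept out

sens = sensitivity(true_pos, false_neg)
fpr = false_positive_rate(false_pos, true_neg)
spec = 1 - fpr

print(f"sensitivity: {sens:.2f}")
print(f"false-positive rate: {fpr:.2f}")
print(f"specificity: {spec:.2f}")
```

A decision rule is better on both axes when it raises sensitivity while lowering the false-positive rate, which is the comparison the chart is making.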

I don’t know how improved big data tools would fare today.  It may be that the HDPI wasn’t as advanced as a predictive algorithm used today. But it may also be that simple tools, intuition and experience led to better and more timely decisions than any computer.  These physician heuristics represent the “adaptive unconscious” that, as Malcolm Gladwell writes in his excellent book Blink, often leads to surprisingly good and rapid decisions.

The challenge, going forward, will be to benefit from big data while not becoming a slave to it.  The implicit promise of better clinical pictures through more and more pixels may simply be false.

Marc-David Munk, MD is CMO of a risk bearing health system in Massachusetts.

11 replies »

  1. From a doctor’s point of view (I’m one myself), I agree that knowing the exact risk of, say, cardiac death to such fine granularity is not particularly useful (e.g., 6.54% vs 4%).

    For big data to be really useful today, it needs to move up the value chain.

    Instead of just predicting risk, we need to predict outcomes. For example, calculating the expected morbidity/mortality in a particular patient if I decide to start him on treatment A, vs not starting him on treatment A. Granted, this depends on many factors, but having something (even if just 70 – 80% accurate) like this can aid in clinical decision making.

    And we should also consider this from the patient’s perspective – data and visualizations can increase patient engagement. For example, as an individual, it will be very interesting for me to know that my risk of a heart attack is 8% over 10 years. And being able to see that it goes down to 5.85% after a 6 month exercise program will be a key motivating factor.

  2. The problem with multicollinearity that I face is this.

    Researcher does a model to predict severity of mitral regurgitation and includes size of the left atrium in the model. Then does some stats and says the left atrial size of “x” cm is an independent risk factor for badness.

    Now I am left measuring left atrial size in all patients some of whom, many actually, will not develop badness.

  3. Well, yes, you are correct in that nuance — “conditionally independent.” The overfitting problem goes directly to multicollinearity. I could fit some nth-degree polynomial to EXACTLY describe a data scatter. But it would describe ONLY that data set. It would have zero generalizability.

  4. “Exactly, “errors” (variation) are additive, they don’t “average out” or cancel.”

    But are they though? I mean they would be for parameters that are conditionally independent. But when there is multicollinearity, a huge problem I face when people use all sorts of measurements in imaging to make bold predictions, would there not be a similar convergence of error?

  5. Anytime I hear the word “Gauss” associated with some non-physical phenomenon, my hand slides inexorably over my wallet. Think Chebychev, baby! 😉

  6. Good point. And more than the issues of correlation of variables, a second issue is the fact that each variable (even if independent) introduces some measure of standard deviation. Add enough variables, with enough standard deviation, and your aggregate error can be huge.

  7. Great comment BobbyGvegas. Thanks. Nassim Nicholas Taleb’s Black Swan Theory has powerfully affected my thinking regarding the importance of small probabilities and the issues having to do with making assumptions based on bell-curve, or normal, distributions.

  8. Decision making in medicine is dichotomous – you operate or you don’t. Data are continuous. You have to draw a line somewhere.

    The trouble is when people think of prediction as binary.

    What does a 3.2 % increased risk of sudden cardiac death mean? How about 6.66 %? Is this information (i.e. to the nearest decimal point) really better than a heuristic? How do we define better?

    At some point granularity is not just unhelpful but BS.


  9. Overfitting also goes to what we call “multicollinearity.” Nominally “different” variables often redundantly measure the same phenomenon to a correlative degree. An astute modeler controls for this, minimizes it — reducing the “noise.” Google “multicollinearity.” I started that Wikipedia entry, years ago. Others have since fleshed it out in broad detail.

    You can’t simply let SAS or some similar stats app do your thinking for you.

  10. “Bigger is better and data possesses unreasonable effectiveness. The more data you have, the more unexpected insights will rise from it, and the more previously unseen patterns will emerge.”

    Nassim Nicholas Taleb has jibed that “Information is overrated.”

    What he means, of course, is information that is just random noise.

    I used to work in credit risk modeling and portfolio management, and was routinely a user of fairly “big data.” We could be statistically “wrong” 99% of the time as long as the 1% where we’d guessed (“modeled”) right paid for everything and returned a profit. Which was always the case during my tenure. Record profits every year.

    This kind of thing is what the social media companies are doing, e.g., Facebook is mining and modeling everything under the sun. The data are SO big that it matters nil that they are shot through with errors and omissions. The findings are profitable to THEM and the companies to whom they sell the findings. If they mischaracterize YOU adversely, well boo-hoo to you.

    So, the utility of big data goes to the purpose for which it is used.