Why the Fragility of Health Outcomes Research May Be a Good Outcome for Health

Durably improving health is really, really hard.

I’ve discussed this in the context of drug discovery, which must contend with the ever-more-apparent reality that biology is incredibly complex, and science remarkably fragile.  I’ve discussed this in the context of patient behavior, focusing on the need to address what Sarah Cairns-Smith and I have termed the “behavior gap.”

Here, I’d like to focus on a third challenge: measuring and improving the quality of patient care.

I’ve previously highlighted the challenges faced by Peter Pronovost of Johns Hopkins in getting physicians to adhere to basic checklists, or to regularly do something as simple and as useful as washing hands, topics that have been discussed extensively and in a compelling fashion by Atul Gawande and others.

Several recent reports further highlight just how difficult it can be not only to improve quality but also to measure it.

Consider the recent JAMA article (abstract only) by Lindenauer et al. analyzing why the mortality rate of pneumonia seems to have dropped so dramatically from 2003 to 2009.  Originally, this had been attributed to a combination of quality initiatives (including a focus on processes of care) and clinical advances.  The new research, however, suggests a much more prosaic explanation: a change in the way hospitals assign diagnostic codes to patients.  While rates of hospitalization with a primary diagnosis of pneumonia decreased by 27%, the rates of hospitalization for sepsis with a secondary diagnosis of pneumonia increased by 178%, as Sarrazin and Rosenthal highlight in an accompanying editorial (public access not available).  In other words, if sicker patients are increasingly coded under sepsis, the patients left under the pneumonia code will tend to fare better on paper, whether or not care actually improved.
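To make the coding effect concrete, here is a minimal sketch in Python. Every number in it is invented for illustration and none comes from the Lindenauer study; the two severity groups, their sizes, and the function name are all assumptions. It simply shows how recoding the sickest patients can lower the apparent pneumonia mortality rate while the overall death count stays exactly the same.

```python
# Hypothetical cohort: all figures are invented for illustration only
# and are NOT taken from Lindenauer et al.

def mortality_rate(deaths: int, patients: int) -> float:
    """Deaths divided by patients, as a fraction."""
    return deaths / patients

# The same 1,000 patients with the same outcomes in both periods.
MILD = {"patients": 900, "deaths": 27}    # assumed 3% mortality
SEVERE = {"patients": 100, "deaths": 30}  # assumed 30% mortality

# Period 1: everyone is coded with a primary diagnosis of pneumonia.
before_pneumonia = mortality_rate(
    MILD["deaths"] + SEVERE["deaths"],
    MILD["patients"] + SEVERE["patients"],
)

# Period 2: the severe patients are now coded as sepsis with a
# secondary diagnosis of pneumonia, so only the mild group remains
# under the primary pneumonia code.
after_pneumonia = mortality_rate(MILD["deaths"], MILD["patients"])

# Counting both codes together, nothing has changed.
after_combined = mortality_rate(
    MILD["deaths"] + SEVERE["deaths"],
    MILD["patients"] + SEVERE["patients"],
)

if __name__ == "__main__":
    print(f"Pneumonia-coded mortality, period 1: {before_pneumonia:.1%}")
    print(f"Pneumonia-coded mortality, period 2: {after_pneumonia:.1%}")
    print(f"Combined mortality, period 2:        {after_combined:.1%}")
```

In this toy cohort the pneumonia-coded rate falls from 5.7% to 3.0% even though not a single additional patient survived, which is why looking at the combined pneumonia and sepsis population matters.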

Why did the coding pattern change? The authors proposed multiple explanations; possibilities range from the benign (changes in diagnostic guidelines, greater awareness of sepsis, etc.) to the cynical, and quite likely (using different coding to maximize reimbursement).

One key take-home is that reliable measurement of health variables is so much more of a challenge than is typically appreciated, and ensuring that we’re robustly measuring what we think we’re measuring, rather than a paraphenomenon, is going to be very important.  We’re learning this lesson the hard way in so many areas of science, and health outcomes research is unlikely to be the solitary exception.

A second and equally important lesson is that in health outcomes research, the people doing the measurements and assessments often have a significant stake in the results, introducing the very real possibility of data distortion.

More explicitly: a tremendous priority of every hospital I know is protecting its bottom line, and a key element of this is maximizing billing.  This is business as usual in medicine.  Just this week, for example, I received an email from a professional medical society, inviting clinicians to attend a webinar entitled “Maximizing Reimbursement for the Treatment of Diabetes.”

While “maximizing reimbursement” is probably not what most recipients of this email were thinking about when they applied to medical school (and please treat yourself to this wonderful TechCrunch essay by Avado CEO Dave Chase entitled “Patients Are More Than A Vessel For Billing Codes”), it’s also a fact of life, and essential to the viability of most medical practices (an increasingly difficult struggle for many — see here).

When you impose quality metrics, you introduce a powerful incentive to behave in a way that optimizes apparent performance – to game the system.  This could manifest itself not only through the selective adjustment of diagnostic codes, but also through a range of other questionable activities; for example, I’ve heard stories of hospitals contriving reasons to transfer failing patients (especially transplant patients) to other facilities to preserve their own survival statistics.  (Fortunately, I never saw or heard of this happening at any hospital where I trained or practiced.)

Given the evident utility of financial incentives, you might think that tying these incentives to quality measures would at least improve measures of health outcomes.  However, even this turns out to be tricky: a striking study by Jha et al. in the most recent NEJM found that the pay-for-performance approach did not reduce 30-day mortality rates, obviously the outcome of most importance to patients.

The most likely explanation: hospitals are graded mostly by their performance on process measures – how they do a variety of specific tasks thought to be associated with improved care – even though the evidence for a relationship between most of these process measures and improved outcomes is weak at best.  The system works in that policy makers seem to be getting the behaviors they incentivize; unfortunately, these specific behaviors don’t seem to result in improved patient outcomes.  (I’ve discussed this exact issue – tracking what you can, not what you should – here; I’d also note that similarly suspect process metrics are routinely imposed by management consultancies upon their unwitting, or perhaps complicit, clients.)

In an interesting accompanying commentary, Joynt and Jha challenge the utility of what at first blush seems to be a useful (and increasingly popular) outcome measure: the 30-day readmission rate, or how often discharged patients are readmitted to the hospital within a month.  This measure is a key component of the Affordable Care Act (poorly performing hospitals are to be penalized), yet Joynt and Jha argue that, in fact, only a small fraction of readmissions are likely to be truly preventable.

This matters, because as Joynt and Jha point out, “The metrics that policymakers choose to use in rewarding and penalizing hospitals have a profound effect not just on what hospitals do but on what they choose not to do.”  They conclude, “the most important consequence of this policy [penalizing hospitals with high 30-day readmission rates] is the improvements in quality and safety that hospitals will forgo, and those will be far more difficult to measure.”  In other words, by insisting that hospitals focus on an outcome measure that may be largely beyond their control, policy makers may actually make it more difficult for hospitals to identify parameters they truly can control and improve.

It is painfully obvious by this point that no matter how you approach it, durably improving healthcare represents a daunting challenge.  The underlying science (from biology to outcomes) is inordinately complex, the actors (scientists, patients, physicians, administrators) are distinctly human, and even collecting a basic and adequate set of robust measurements and underlying facts is surprisingly challenging.

It’s also, of course, a great opportunity – these are truly important problems that have now captured the attention of some of our best minds.  I’m heartened by the exceptional entrepreneurial interest that’s been attracted to health care, I’m thrilled by the diversity of talent that’s now started to think hard about these problems and approach them from so many different perspectives, and I’m gratified by the way so many stakeholders – including big pharma and leading payors, but also forward-thinking universities and health organizations –  have recognized the urgent need to re-envision their traditional models, to move outside their usual comfort zone, and to collaborate in a more open, ambitious, and daring way than most might have contemplated even a decade ago.

It’s the shared recognition of the enormity of our task that ultimately will compel us to challenge our most fundamental assumptions, embrace novel possibilities, stimulate innovative collaborations, drive organizational change, inspire creative entrepreneurs, and in the end catalyze the evolution – perhaps even revolution – of our healthcare ecosystem.

What a great time to be part of — or to join — this audacious effort.

David Shaywitz is co-founder of the Harvard PASTEUR program, a research initiative at Harvard Medical School. He is a strategist at a biopharmaceutical company in South San Francisco. You can follow him at his personal website. This post originally appeared on Forbes.

1 reply »

  1. Great post. I agree completely that performance measurement is in its infancy. Unfortunately, it’s being implemented everywhere, with real consequences, before it is close to being ready. When I bring this up to various administrative types, I’m told some variation of “don’t let the perfect be the enemy of the good.” As if a cliche can hide the fact that we really don’t know what we’re doing (with serious unintended consequences).

    A thought: Would an ICU physician who is excellent at end-of-life and palliative care be graded poorly on mortality rates compared to one who marched forward at all costs to “save” every critically ill patient (just to have them die on someone else’s watch)? Even something as seemingly simple as “mortality rate” isn’t that simple.