Dr. Sandeep Jauhar wrote an essay this week in the New York Times about the perils of pay-for-performance (P4P). Specifically, Dr. Jauhar discusses how P4P may have unintended consequences and create perverse incentives due to poorly designed performance measures. The point is well taken, but it’s important not to confuse the merits of P4P with the measurement issues that exist.
With respect to the latter, back in my days as Director of Measure Development for the National Committee for Quality Assurance (NCQA), I co-authored a paper with Partners’ cardiologist Tom Lee, Jim Cleeman from NHLBI, and others working with us at NCQA on the development of new HEDIS cholesterol management performance measures. In the JAMA article, “Clinical Goals and Performance Measures for Cholesterol Management in Secondary Prevention of Coronary Heart Disease,” we tried (among other things) to communicate the difference between quality improvement measures and comparative performance measures.
Although the multi-stakeholder Cardiovascular Measurement Advisory Panel and NCQA’s measurement policy-making body, the Committee on Performance Measurement, supported the goals of NHLBI’s practice guidelines, we believed that there are significant differences “between a clinical goal for the management of individual patients (LDL<100 mg/dL) and a performance measure used to evaluate the care of a population of patients (LDL<130 mg/dL).” We described several reasons, including gaps in research, drug efficacy, realistic performance measures, simplicity, and the implications of physician failure.
Measurement systems designed for internal quality improvement may very well be different from those used to compare provider quality for a diverse population of patients, and we should make sure to consider the differences in establishing the criteria by which clinicians are compared and reimbursed.
However, we can make those distinctions, and we need to make them in order to drive different kinds of quality improvement forward. If we don’t create fair measures that differentiate provider performance, we will continue to lack ways of adequately compensating those who deliver care for anything but the quantity of what they produce (i.e., the number of services provided).
Joshua Seidman is the president of the Center for Information Therapy, which aims to provide the timely prescription and availability of evidence-based health information to meet individuals’ specific needs and support sound decision making.
Not only is there a big difference between measuring performance based on population-based goals using generic guidelines and individual-patient-based goals using personalized guidelines, but our failure to collect and analyze troves of detailed diagnostic and outcomes information on each patient means such personalized treatment is but a fantasy at this time.
As I wrote two years ago, instead of our sledgehammer approach to care, we’d be wise to focus on establishing a precise, scalpel-like approach to diagnosis and treatment planning. For example, we might find that a hemoglobin A1c of 7.5 is perfectly OK (and maybe even beneficial) for some Type 2 diabetics, while for others 6.5 might be too high (even though they have the same blood pressure and cholesterol readings) because other factors are having an effect. Since evidence-based guidelines change as new evidence is discovered, there needs to be much more research focusing on the differences between people with the same diagnosis.
I keep thinking back to the cholesterol question. What qualifies as action? Does telling a 35-year-old nonsmoking male with an LDL of 140 to lose some weight and come back in 6 months count as much as throwing him on drugs? And what happens if (as is likely) it doesn’t work? P4P may very well lead to perverse incentives in a healthy population (i.e., it looks great for your numbers to statinize all these men). In sicker populations, the number of exceptions might be high. Again, we, as physicians, do respond to economic stimuli (duh), and poorly done P4P will poison the process for years to come.
Agreed with Josh. While Dr. Jauhar raises plenty of valid issues with P4P (and other points, including public reporting, which is really a separate issue), I get the sense that he echoes the sentiments of a lot of physicians who acknowledge there are issues with the quality of care in general but are very reluctant to give any real credence to externally imposed administrative/procedural responses to address these quality deficiencies.
Physicians do need to be involved with the development, deployment, and feedback/improvement process (particularly with the measures being created), but the days of not measuring, not reporting, and not potentially paying on differences in outcomes have passed, along with the halcyon days of American medicine.
Doesn’t all this boil down to the difficulty of evaluating the quality of medical care? There are usually so many individual factors in a given situation that one size does not fit all.
By that I do not mean that the quality of care cannot be assessed – it just seems to be much more work, and more complex, than just looking at a few indicators.
What really could work is anonymous peer review. If you want to review a physician’s work, send a couple (maybe 5) anonymized notes to, say, two physicians of the same specialty (who are deemed qualified for peer review, e.g., by the specialty board). (Of course, as a first step, one would have to check whether there is interrater reliability.)
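To make the interrater-reliability step concrete, here is a minimal sketch of one common way to quantify it: Cohen’s kappa, which measures agreement between two raters after correcting for chance. The rating categories and sample data below are hypothetical illustrations, not anything from the comment itself.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Agreement between two raters on the same items, corrected for chance."""
    assert len(ratings_a) == len(ratings_b)
    n = len(ratings_a)
    # Observed agreement: fraction of items where the raters match.
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    # Expected chance agreement: product of each rater's marginal frequencies.
    freq_a = Counter(ratings_a)
    freq_b = Counter(ratings_b)
    expected = sum(freq_a[c] * freq_b.get(c, 0) for c in freq_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Two hypothetical reviewers rating five anonymized notes:
a = ["ok", "ok", "deficient", "ok", "deficient"]
b = ["ok", "ok", "deficient", "deficient", "deficient"]
print(round(cohens_kappa(a, b), 2))  # → 0.62
```

A kappa near 1 would suggest the reviewers apply similar standards; a value near 0 would mean their agreement is no better than chance, in which case the review criteria would need tightening before comparing physicians.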
Of course, reviewing notes means working with an incomplete and distorted mirror image of care (for instance, you cannot evaluate communication with the patient very well, or accuracy of history and physical findings, and for procedures, you can judge the indication, but not the technique) … but the reviewers could find and point out:
- medical/academic errors (obvious misdiagnosis, obvious poor medication choice)
- overuse (or underuse/misuse) of diagnostic tests
Obviously, this would have to be done in a cooperative rather than confrontational manner (it would be a hard sell anyway) … but if you truly want to measure performance, do it with detailed information (short of videotaping doctors) and not with a few rigid parameters. One cannot really conclude that someone is a good driver just by checking that he/she does not speed and does use directional signals 2 sec. prior to a turn.