So the quality movement has been making slow strides: the first signs of a pay-for-performance system have appeared in California, and one has been running for a couple of years in Massachusetts. But not so fast! You may (as I did) have missed, while you were recovering from your Labor Day exertions, the September 3 JAMA article from several leading Boston doctors which explained that pay for quality and performance won’t work. (You can see the abstract here.) I quote a chunk of their press release below so you get the idea:
"Measuring a physician’s quality of care by numerical standards — such as adherence to a disease management protocol or a treatment outcome — is often invalid for a variety of reasons, say the authors of a study in the Sept. 3 Journal of the American Medical Association.
While not a general nationwide practice, several payers around the country are using quantitative quality measures as a basis for reimbursement bonuses (for Blue Cross and Blue Shield of Massachusetts’ program, see MD Practice Alert, July 30, 2003). Some medical groups also reward higher quality with higher pay. "Quality" for such incentive payments usually means adherence to well-recognized disease management or preventive care protocols or procedures. Quality measured in this way is beginning to be available on some Web-based "physician report cards" that increasingly may be the way some patients, such as those on consumer-directed health care plans, choose doctors.
Bruce Landon, M.D., researcher at the Harvard Medical School Department of Health Care Policy, and lead author of the JAMA study, says that although it looked at the use of such quantitative measurement (also called "physician clinical performance assessment" or PCPA) for credentialing doctors, many of the cautions raised in the study "are relevant for ‘paying for quality.’"
While PCPA can be valuable and is improving, Landon and his co-authors say, it has several common problems, some of which are:
–Insufficient sample size in an individual doctor’s practice. The authors suggest that 100 patients may be an appropriate sample (patients with the same disease treated by the same physician), but note that the National Committee for Quality Assurance says a 35-patient sample is adequate. "The proportion of all physicians for whom sample sizes are large enough to permit valid PCPA is unknown at this time," Landon writes.
–Systematic differences in populations of patients, who may differ in adequacy of insurance, general health status and other ways. "Health plans typically don’t adjust for health status or sociodemographic characteristics," Landon notes, although their reimbursement bonuses deal with patients who have the same insurance. To solve the problem of differing health statuses, some PCPA measures may include only "ideal candidates," he adds, but that approach could create sample-size problems.
–Poor reflection of entire practice. Obviously, adherence to one or two protocols is only a small part of what any given doctor does. Studies have shown that adherence to one protocol is a poor predictor of adherence to another not used to evaluate physicians.
–Cost. "Collection [of PCPA data] in the outpatient setting would be substantially more expensive [than collecting valid hospital quality data] because of the multiple different locations and lack of funding mechanism to pay for this type of performance assessment activity," Landon says.
–Potential conflicts with quality improvement. PCPA activities may differ depending on whether they’re conducted to assess physicians’ competence or to foster quality improvement. Conflicts with patient communication and other unmeasured aspects of care also could arise, he adds. Groups focusing on a given kind of quality improvement "might pay less attention to other important features of quality that are not being measured."
–Lack of evidence-based measures for many specialties.
–Challenges in defining minimum thresholds for acceptable care.
"Many health plans," Landon says, "use arbitrary thresholds (e.g., the top 25%), when in fact there might not be much difference [in performance] between those that receive the bonus and those that don’t." Lack of uniformity among payer bonuses also is a problem, he says. "There are often so many measures from different plans that the signal to increase quality can get lost in all the noise."
The closing sentences of the abstract indicate that the authors are not happy with the ways physicians are being assessed. "We conclude that important technical barriers stand in the way of using physician clinical performance assessment for evaluating the competency of individual physicians. Overcoming these barriers will require considerable additional research and development." And their last sentence is a thing of beauty. "Even then, for some uses, physician clinical performance assessment at the individual physician level may be technically impossible to accomplish in a valid and fair way."
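For what it's worth, the sample-size objection in the press release is easy to put rough numbers on. The sketch below is mine, not the authors'. It assumes a hypothetical physician whose true protocol adherence rate is 80% (an invented figure, chosen only for illustration) and uses the textbook normal-approximation margin of error for a proportion to compare the 35-patient and 100-patient samples mentioned above.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate 95% margin of error for an observed proportion.

    Uses the normal approximation z * sqrt(p * (1 - p) / n).
    p is the assumed true rate, n the number of patients sampled.
    """
    return z * math.sqrt(p * (1 - p) / n)


# Hypothetical adherence rate; not a figure from the JAMA study.
ASSUMED_ADHERENCE = 0.80

# 35 is the NCQA sample size cited in the press release, 100 the authors' suggestion.
for n in (35, 100):
    moe = margin_of_error(ASSUMED_ADHERENCE, n)
    print(f"n={n}: measured adherence could be off by about {moe * 100:.1f} points")
```

Under those assumptions the uncertainty is roughly 13 percentage points with 35 patients and about 8 with 100, which gives a sense of why the two numbers lead to different views on whether individual physicians can be ranked.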
Matt Quinn, who’s been working in health quality data assessment for some years now, and whose vigilance saved me from missing this work of art, commented: "I guess that means that efforts to measure performance and inform consumers just aren’t worth it and that everyone involved should just continue to assume that all docs provide consistently excellent quality care that adheres to evidence-based guidelines." I’m sure Matt would agree that the correct performance assessment of no other human process has ever had to overcome this magnitude of challenge!
I’m reminded of Gene Wilder as the sheep-loving struck-off MD in Woody Allen’s film Everything You Always Wanted to Know About Sex* (*But Were Afraid to Ask). He’s working as a waiter, and when too many customers start complaining and it all gets too much, he shouts "Don’t treat me like that–I’m a Doctor! I’m a Doctor!"