Tag: Comparative Effectiveness Research

Consumer Groups Shut Out of Comparative Effectiveness Board

Sep 27, 2010• 4

By MERRILL GOOZNER

The Government Accountability Office last week appointed two “faster cures” patient advocates and a former insurance company executive now on the AARP board to the three slots reserved for patient and consumer representatives on the Patient-Centered Outcomes Research Institute board, which will oversee comparative effectiveness research under health care reform.

The reform legislation passed last March gave GAO the job of appointing the 17 public members, which also includes five representatives of private payers, five physicians, and three industry representatives (one each for drugs, devices and diagnostic manufacturers). A full list can be found here.

The three “patient and consumer” representatives are:

Ellen Sigal, chair, Friends of Cancer Research.

Sigal is an outspoken advocate for more money for cancer research. Her board is comprised largely of fellow executives in the research community, including staff from the American Cancer Society, Research America!, and the American Society of Clinical Oncology, which represents cancer docs. She serves on numerous non-profit boards, including the Reagan-Udall Foundation set up by the Food and Drug Administration to expedite drug development; and has served on numerous Institute of Medicine panels investigating new ways of conducting cancer research that can lead to faster access to new medicines.

Will Comparative Effectiveness Research Really Make a Difference If the Public Doesn’t Want It?

Aug 13, 2010• 8

By KENT BOTTLES, MD

Not long ago I was lucky to be invited to a New England Healthcare Institute discussion entitled “From Evidence to Practice: Making Comparative Effectiveness Research Findings Work for Providers and Patients “ in Washington, DC.

How to disseminate and implement Comparative Effectiveness Research (CER) so that patient care is really improved was the first topic tackled by the expert panel and the moderator, Clifford Goodman of The Lewin Group.

The target audiences for CER findings include: patients, disabled patients, providers, policy makers, health plans, medical device companies, pharmaceutical companies, hospital administrators, academic researchers, community physicians, professional societies, and regulators.

Michael McGinnis, MD, of the Institute of Medicine, offered clusters as a way to organize these different targets: Cluster 1 (patients, providers, policy makers), Cluster 2 (control levers like payers, purchasers, system managers, professional societies, regulators) and Cluster 3 (researchers and those concerned with methodology).

Seth Frazier, Vice President of Transformation at Geisinger, was the first of many to point out the gap between the academic literature of CER and what patients and providers need at the point of care. He noted that providers need actionable recommendations that can be integrated into the flow of the clinic and hospital and that much of the evidence-based medicine product is not usable in this practical way. This observation reminded me of the gap between the public and the health care experts that Drew Altman of the Kaiser Family Foundation documented so effectively and the Kristen Carmen Health Affairs survey that said patients regard evidence-based medicine as a barrier to what they want.

A Tale of Two Diseases: Repairing Comparative Effectiveness Research

Jun 28, 2010• 4

By DAVID E. WILLIAMS

Writing in the New England Journal of Medicine (Identifying and Eliminating the Roadblocks to Comparative-Effectiveness Research) three authors share their experience in running a head-to-head trial of Avastin (bevacizumab) versus Lucentis (ranibizumab) for wet age-related macular degeneration (AMD). They describe the barriers they faced and suggest that they will need to be removed for comparative effectiveness research –as envisioned under ARRA– to succeed. They make good points and may well be correct in their policy recommendations.

However the case of Avastin and Lucentis is unusual. The products are made by the same manufacturer and are essentially identical. Avastin and Lucentis are marketed separately by Genentech mainly to allow the company to capture a return on investment from its R&D. The issue is that a regular dose of Avastin (e.g., for lung cancer) can be divided up into many doses for the eye. Since the products are sold by volume it turns out that Avastin is cheap when used for wet AMD, even though it’s pricey when used for cancer. As I’ve suggested previously, Genentech should be able to charge Lucentis prices for Avastin when it’s used in the eye. So there are quite a lot of people –starting with the manufacturer itself– who didn’t really want this study to go forward. That’s less likely to be the case with other studies.Continue reading…

The Evidentiary Basis for a Clinically Meaningful Benefit

Jun 1, 2010• 10

By NORTIN M. HADLER, MD and ROBERT McNUTT, MD

We entered the 21^st century awash in “evidence” and determined to anchor the practice of medicine on the evidentiary basis for benefit. There is the sense of triumph; in one generation we had displaced the dominance of theory, conviction and hubris at the bedside. The task now is to make certain that evidence serves no agenda other than the best interests of the patient.

Evidence-based medicine is the conscientious and judicious use of current best evidence from clinical care research in the management of individual patients^”.^[1,2]

But, what does “judicious” mean? What does “current best” mean? If the evidence is tenuous, should it hold sway because it is currently the best we have? Or should we consider the result “negative” pending a more robust demonstration of benefit? Ambiguity is intolerable when defining evidence because of the propensity of people to decide to do something rather than nothing. [3] Can we and our patients make “informed” medical decisions on thin evidentiary ice? How thin? Does tenuous evidence mean that no one is benefited or that the occasional individual may be benefited or that many may be benefited but only a very little bit?Continue reading…

“Comparative Effectiveness Research” and Kindred Delusions

Jan 11, 2010• 30

By NORTIN HADLER, MD

Last summer President Obama signed the American Recovery and Reinvestment Act into law. Tucked into the legislation was $1.1 billion to support comparative effectiveness research (CER). The legislation charged the Institute of Medicine with defining CER. Its Committee on Comparative Effectiveness Research Prioritization rapidly came up with,

…the generation and synthesis of evidence that compares the benefits and harms of alternative methods to prevent, diagnose, treat and monitor a clinical condition, or to improve the delivery of care. The purpose of CER is to assist consumers, clinicians, purchasers, and policy makers to make informed decisions that will improve health care at both the individual and population levels.

The Committee then elicited over 2500 opinions from 1500 stakeholders and produced a list of the 100 highest-ranked topics for CER (www.iom.edu/cerpriorities). Proposals to undertake CER are pouring forth from investigators across the land. There is no doubt that an enormous amount of data will be generated by 2015. But there is every reason to doubt whether many inferences can be teased out of these data that will actually advantage patients, consumers, or the health of the nation.

I am no Luddite. For me “evidence based medicine” is not a shibboleth; it’s an axiom. Furthermore, having trained as a physical biochemist, I am comfortable with the most rigorous of the quantitative sciences let alone biostatistics. However, you can’t compare treatments for effectiveness unless you are quite certain that one of the comparators is truly efficacious. There must be a group of patients for whom one treatment has unequivocal and important efficacy. Otherwise, the comparison might discern differences in relative ineffectiveness.

The academic epidemiologists who spearheaded the CER agenda are aware of the analytic challenges but are convinced these can be overcome. I would argue that CER can never succeed as the primary mechanism to assure the provision of rational health care. It has a role as a secondary mechanism, a surveillance method to fine tune the provision of rational health care, once such is established.

The difference between efficacy and effectiveness

My assertion may seem counter-intuitive. After all, we hear every day about pharmaceuticals that are licensed by the FDA because of a science that supports the assertion of benefit. In epidemiology-speak, the science that the FDA reviews does not speak to the effectiveness of the drug, but to its efficacy. The science of efficacy tests the hypothesis that a particular drug or other intervention works in a particular group of similar patients. CER asks whether an intervention works better than other interventions in practice where the patients and the doctors are heterogeneous. The rational for the CER movement is the perceived limitations of efficacy research. I argue that the limitations of efficacy research are much more readily overcome than the limitations on CER.

Efficacy research

The gold standard of efficacy research is the randomized controlled trial (RCT). In a RCT, patients with a particular disease are randomly assigned to receive either a study intervention or a comparator (often a placebo). After a pre-determined interval, the previously defined clinical outcome is compared in the active and control limbs of the trial. If there is no difference, one can argue that the intervention offers no demonstrable clinical benefit to patients such as those in the study. If there is a difference, the contrary argument is tenable.

This elegant approach to establishing clinical utility has its roots in antiquity, at least as far back as Avicenna. The modern era commences after World War II and escalates dramatically after 1962 when the Kefauver-Harris Amendment to the laws regulating the US Food and Drug Administration mandated demonstration of efficacy before pharmaceuticals could be licensed. Modern biostatistics has probed every nuance of the RCT paradigm. The result is a highly sophisticated understanding of the limitations of the RCT, an understanding that has fueled the call for CER:

The more homogeneous the study population, the more likely any efficacy will be demonstrated and the more compelling any assertion as to its lacking. However, the homogeneity compromises the ability to assume the result generalizes to different kinds of patients.
Many important clinical outcomes are either infrequent or occur late in the course of disease. It is difficult to maintain and fund RCTs that require years or decades before one can hope to see a difference between the active and control limbs. The compromise is to study “surrogate” outcomes, measures that in theory reflect the disease process, but are not themselves clinically important outcomes. Thus we have thousands of studies of blood pressure, cholesterol, blood sugar, PSA and the like but comparatively few studies that use heart attacks, death from prostate cancer, or other untoward clinical outcomes as the end-point.
How big a difference between the active and control limbs is important? Biostatistics has dictated that we should pay attention to any difference that is unlikely to happen by chance too often. “Too often” traditionally is considered no more than 5% of the time, but that’s a matter risk-taking philosophy. What are we to make of a difference that is clinically very small, even if it is unlikely to happen by chance more than 5% of the time? Is it possible that the small effect will be important, perhaps less small, when the constraints of homogeneity are removed in practice? In practice, drugs licensed for one disease are even tried for other “off label” indications where effectiveness may emerge.
The corollary limitation relates to the negative trial. If there is no demonstrable difference, does that mean that there is no effect? Or could the effect have been too small to detect because of the duration of the trial or the size or homogeneity of the population studied? Even a very small effect, advantaging only the occasional patient, can translate into many benefited people when tens of thousands are treated.
Devices and surgical procedures are used practice; rigorous testing as to efficacy is not a statutory requirement. Maybe in the “real world” a treatment that was never studied or studied in a limited fashion turns out to really advantage patients in practice, or advantage some patients – or not.

CER to the rescue?

The methodology employed for CER is not the RCT. CER is an exercise in “observational research”. CER examines real world data sets to deduce benefit or lack thereof. This entails the development of large-scale, clinical and administrative networks to provide the observational data. Then biostatistics must come to grips with issues that make defining the heterogeneity of populations recruited into RCTs seem trivial. In the RCT, the volunteers can be examined and questioned individually and in detail and the criteria for admission into the trial defined a priori. Nothing about the validity of diagnosis, clinical course, interventions, coincident diseases, personal characteristics or outcomes can be assumed in observational data sets. There must be efforts at validating all such crucial variables. No matter how compulsively this is done, CER demands judgments about the importance of each of these variables. It is argued that some of these limitations are overcome because CER is not attempting to ask whether a particular intervention works in practice, but whether it works better than another option also in practice. It is even suggested that encouraging or introducing particular interventions or practice styles into some practice communities and not others would facilitate CER. Perhaps.

The object lesson of interventional cardiology

Interventional cardiology for coronary artery disease is the engine of the American “health care” enterprise. Angioplasties, stents of various kinds, and coronary artery bypass grafting (CABG) have attained “entitlement” status. There are thousands of RCTs comparing one with another, generally leading to much ado about very small differences, usually in surrogate measures such as costliness or patency of the stent. But there are very few RCTs comparing the invasive intervention with non-invasive best medical care of the day: 3 for CABG and 4 for angioplasty with or without stenting. In these large and largely elegant RCTs, the likelihood of death or a heart attack if treated invasively is no different from the likelihood if treated medically. Whether anyone might be spared some degree of chest pain by submitting to an invasive treatment is arguable since the results are neither compelling nor consistent. Yet, interventional cardiology remains the engine of the American “health care” enterprise. It carries on despite the RCTs because its advocates launch such arguments as “We do it differently” or “The RCTs were keenly focused on particular populations of patients and we reserve these interventions for others we deem appropriate.” These arguments walk a fine line between hubris and quackery.

So many invasive procedures are done to the coronary arteries of the young and the elderly that interventional cardiology has long lent itself to CER. We know from observational studies that that it does not seem to matter much if the heart attack patient has an invasive intervention quickly or it is delayed or not at all. We know from observational studies, and even trials rewarding some but not all hospitals for getting doctors to adhere to the “guidelines” for managing heart disease, that adherence does not make much of a difference. Do the results of this CER mean that we need to further improve the efficiency and quality of the performance of invasive treatments as many would argue? Or can we hope that more exacting CER can parse out some meaningful indication from large data sets, some compelling inference that only particular people with particular conditions are advantaged and therefore are the only candidates for interventional cardiology?

Or are we using the promise of CER to postpone calling a halt to the ineffective and inefficacious engine of American “health care”. The available science is consistent with the argument that interventional cardiology is not contributing to the health of the patient. I would argue that interventional cardiology should be halted until someone can demonstrate substantial efficacy and a meaningful benefit-to-risk ratio in some subset. Then CER can ask whether the benefit demonstrated in the efficacy trial translates to benefit in common practice.

Efficacy research is the horse; CER is the cart

Interventional cardiology for coronary artery disease is but one of many object lessons. There is much in common practice that has never been shown to be efficacious in any subset of patients. Some practices take up residence in the common sense despite having never been studied. Some practices, like interventional cardiology, persist because intellectual and fiscal interests are vested in the entrenchment despite the results of efficacy trials. CER can not inform efficacy, and CER can not inform effectiveness unless there is an example of efficacious therapy against which practices are compared. Otherwise, CER can be comparing degrees of ineffectiveness.

The way forward is to design efficacy trials that are more efficient in providing gold standards for comparison and as efficient in defining false starts that are not allowed into common practice until the approach is superseded by one of demonstrated efficacy. This is not all that difficult to do. Let’s return to the limitations of efficacy trials listed above:

Homogeneity of study populations is not a limitation for the quest for a meaningful standard of efficacy. At least we will know the intervention is good for someone.
Surrogate measures are useful to bolster the hypothesis that something might work. They have a dismal track record for testing the hypothesis that something does work. Clinically important outcomes must be invoked for such a test. If it is not feasible because the clinical outcome is too slow to develop or too infrequent, compromise is not an option. The intervention can not be studied at all, or it can not be studied until an appropriate subpopulation can be identified, or one must bite the bullet and undertake a lengthy RCT.
Surrogate outcomes are not the only way that RCT results can lead to spurious clinical assumptions. “Composite outcomes” are even worse. RCTs in cardiology are notorious for an outcome such as “death from heart disease or heart attack or the need for another procedure.” When these studies are closely read, one learns that any difference detected is almost exclusively in “the need for another procedure” which is a highly subjective and interactive outcome that can speak to preconceptions on the part of the doctor or the patient rather than the efficacy of the intervention.
Modern epidemiology is so wedded to the notion of statistical significance that concern about the statistical significance of “What?” is overwhelmed. “What?” is the clinical significance? Just because the difference observed between the active and control limbs of the RCT wouldn’t have happened by chance too often does not mean that the difference is clinically important even in the occasional patient. I’ll illustrate this by touching the Third Rail that the debate over the clinical utility of mammography has become. Malmö is a city in Sweden where women were invited to volunteer for a RCT; half would be offered routine screening mammography for a decade and the other half encouraged see their physicians whenever they had concern about the health of their breasts. That’s the difference between screening and diagnostic protocols; in screening one is agreeing to a test simply as a matter of course, in diagnostics one agrees to the testing in response to a clinical complaint. Back to the Malmö RCT. Over 40,000 women between age 40 and 60 volunteered for the RCT. Invasive cancer was detected in statistically significantly more women who were in the screened group than in the diagnostic group. Impressed? How about if I told you that 7 of 2000 women screened for a year were found to have invasive breast cancer and 5 of 2000 women in the diagnostic group for a year were found to have invasive breast cancer. Was all the screening worth this difference in absolute number of additional cancers detected? I could have told you that screening detected 40% more cancers but you won’t be swayed by the relative increase now that you know the absolute increase was 0.1%, will you? Would you consider the screening valuable if I told you that for every woman whose invasive breast cancer was treated so that they lived long enough to die from something else at a ripe old age, another two were treated unnecessarily since they died from something else before their breast cancer could be their reaper? How about all the false positive mammograms and false positive biopsies? There is a debate about mammography because it is a very marginal test that clearly is not doing as well as the common sense assumes.
How small an effect can we detect in a RCT? Theoretically we can detect a very small effect. Theoretically we can detect an effect even smaller than the Malmö result. In order to do so, you need to randomize a large, homogeneous population whose size is determined by the level of statistical significance you choose and the nature of the health effect you seek. Death is the least equivocal outcome, for example. The quest for the small effect is the mantra of modern epidemiology. However, I consider such “small effectology” a sophism. No human population is homogeneous; we differ one from another in obvious, often measurable ways but also in less obvious, immeasurable ways. When we randomize individuals in any homogeneous population into a treatment group and a control group we assume that all the immeasurable differences randomize 50:50 or if not the randomization errors counterbalance. The smaller effect we are seeking, the more likely we are to be fooled by randomization errors that account for the difference rather than the treatment. That’s why so many small effects that emerge from RCTs do not reproduce.

Evidence Based Medicine can be more than a Shibboleth

The philosophical challenge in the design of efficacy trials relates to the notion of “clinically significant.” How high should we set the bar for the absolute difference in outcome between the treated and control groups in the RCT to be considered compelling? One way to get one’s mind around this question is to convert the absolute difference into a more intuitively appealing measure, the Number Needed to Treat (NNT). If the outcome is readily measured and unequivocal, such as death or stroke or heart attack, I would find the intervention valuable if I had to treat 20 patients to spare 1. Few students of efficacy would be persuaded if we had to treat more than 50 to spare 1. Between 20 and 50 delineates the communitarian ethic; smaller effects are ephemeral. For an outcome that is more difficult to measure than death or the like, an outcome that relates to symptoms or quality of life, I would argue for a more stringent bar.

If we applied this logic to RCTs, the trials would be far more efficient (in investigator/volunteer time, materiel, and cost) and the results far more reliable. If we applied this logic to RCTs, we would eliminate trials designed only to license agents no better than those already licensed (“me too” trials) and trials designed only for marketing purposes (“seed” trials). If we only licensed clinically efficacious interventions going forward, we could turn to CER to understand their effectiveness in practice. If we applied this logic retrospectively, to the trials that have already accumulated, we would soon realize how much of what is common practice is on the thinnest of evidentiary ice, how much has fallen through and how much supports an enterprise that is known to be inefficacious. It would take great transparency and political will to apply this razor retrospectively. We, the people, deserve no less.

Nortin M. Hadler, MD, MACP, FACR, FACOEM (AB Yale University, MD Harvard Medical School) trained at the Massachusetts General Hospital, the National Institutes of Health in Bethesda, and the Clinical Research Centre in London. He joined the faculty of the University of North Carolina in 1973 and was promoted to Professor of Medicine and Microbiology/Immunology in 1985. He serves as Attending Rheumatologist at the University of North Carolina Hospitals.

For 30 years he has been a student of “the illness of work incapacity”; over 200 papers and 12 books bear witness to this interest. He has lectured widely, garnered multiple awards, and served lengthy Visiting Professorships in England, France, Israel and Japan. He has been elected to membership in the American Society for Clinical Investigation and the National Academy of Social Insurance. He is a student of the approach taken by many nations to the challenges of applying disability and compensation insurance schemes to such predicaments as back pain and arm pain in the workplace. He has dissected the fashion in which medicine turns disputative and thereby iatrogenic in the process of disability determination, whether for back or arm pain or a more global illness narrative such as is labeled fibromyalgia. He is widely regarded for his critical assessment of the limitations of certainty regarding medical and surgical management of the regional musculoskeletal disorders. Furthermore, he has applied his critical razor to much that is considered contemporary medicine at its finest.

So Much For Comparative Effectiveness

Nov 20, 2009• 82

By MERRILL GOOZNER

The Obama administration’s commitment to cost control in health care can now be summed up in four words: Not on our watch.

Health and Human Services Secretary Kathleen Sebelius told American women this week that they have nothing to learn from the science that led to the U.S. Preventive Services Task Force guidelines on mammography.Insurance companies won’t change their payment policies, and the independent doctors and scientists who made up the USPSTF task force “do not set federal policy” or determine what services are covered by the federal government.”

What a golden opportunity has been missed to educate Americans about the implications of their health care choices. Otis W. Brawley, the chief medical officer of the American Cancer Society, in an op-ed in today’s Washington Post condemning the USPSTF guidelines, confirms that mass screening would only save at a maximum 600 out of the 4,000 women under 50 who die of breast cancer annually. What he failed to point out is that 1.14 million American women would have to be screened annually for ten years to achieve that goal. To cover the entire cohort (all women between 40 and 49) to replicate that benefit every year would require screening 11.4 million women annually. The cost, at $200 per mammogram (my initial estimate was accurate, according to this New York Times business section article), would come to $2.24 billion annually for the health care system.

Controlling Health Care Costs: How to “Bend the Curve”

Nov 10, 2009• 5

Stephen Shortell

By STEPHEN SHORTELL

As Congress nears passage of the first substantial health care reform in decades, there is an ominous challenge: No reform will be sustainable unless we slow the rapid growth of health care spending.

Health care costs are rising at a staggering pace. Expenditures have been increasing at 2.7% per year faster than the rest of the economy over the past 30 years. In 1980 the US spent about 8% of GDP on health care. We now spend over 17%. We need to rein in growth of health care spending to levels no higher than overall economic growth — or ideally “bend down” the growth curve to an even lower figure.

How do we “bend the curve”? What are the best ways to slow the growth of health care costs, thus making other reforms sustainable?There are three major areas in which reforms will help bring health care spending under control.Prevention: US health care is burdened by diseases that are preventable. If we can improve lifestyle issues – nutrition, exercise, obesity, tobacco use – we will lower the future incidence of diabetes, heart disease, cancer, and other costly maladies. Current health reform proposals that allocate $10 billion for a Prevention and Wellness Fund represent a major step in the right direction. Disease prevention likely provides the greatest return on investment regarding health care costs of anything we do.

Hospital and Physician Behavior: Hospitals have no incentives to prevent unnecessary hospitalization. Physicians, paid mostly by fee-for-service, have every incentive to order more tests and procedures. Neither is rewarded directly for making – or keeping – patients healthy. Key to controlling health care costs in the future will be to realign these incentives.

This will require performance measurement and public reporting for both cost and quality. Provided that predetermined quality criteria are met, hospitals and physicians who can provide better care for less money would share in the savings.

Senate Healthcare Bill Amendment Allocates Your Tax Dollars To Quacks

Aug 4, 2009• 37

By DR. VAL

With healthcare costs spiraling out of control, and major rationing efforts under consideration – can we really afford to allow purveyors of pseudoscience to use up scarce Medicare/Medicaid resources? It’s hard to imagine that Obama’s administration would approve of extending “health professional” status to people with an online degree and a belief in magic – but a new amendment would allow just that. What happened to our “restoring science to its rightful place” and why are we emphasizing comparative effectiveness research if we will use tax dollars to pay for things that are known to be ineffective?I hope someone reads and removes this amendment pronto (h/t to David Gorski at Science Based Medicine):Here’s the language that Sen. Harkin has slipped into the 615 page Senate version of the health care reform bill:

Explaining Runaway Costs: The Lobster or the Salad?

Jul 28, 2009• 52

By BOB WACHTER, MD

Have you found yourself ‘splaining to friends and family why the healthcare system is so damn expensive? I’ve been teaching health policy for a couple of decades, and I’m surprised that my two favorite stories haven’t yet surfaced in all the discourse. Here they are, in the hopes that they help you, or someone you love, understand why medical care is bankrupting our country.

Let’s start with the Expensive Lunch Club, a story I first heard from Alain Enthoven, the legendary Stanford health economist. It goes like this:

You’ve just moved to a new town and stroll into a restaurant on the main drag for lunch. None of the large tables are empty, so you sit down at a table nearly filled with other customers. The menu is nice and varied. The waiter approaches you and asks for your order. You’re not that hungry, so you ask for a Caesar salad. You catch the waiter looking at you sideways, but you don’t think too much of it. He moves on to take the order of the person sitting to your right.

“And what can I get for you today, sir?”

“Oh, the lobster sounds great. I’ll have that.”

You’re taken aback, since the restaurant doesn’t seem very fancy, and your tablemate is dressed rather shabbily. The waiter proceeds to the next customer.

“And you, ma’am?”

“The lobster sounds good,” she says. “And I’ll take a small filet mignon on the side.”

Now you’re completely befuddled. You tap your neighbor on the shoulder and ask him what’s going on.

“Oh, I guess nobody told you,” he whispers. “This is a lunch club. We add up the bill at the end of the meal, and divide it by the number of people at the table. That’s how your portion is determined.”

You frantically call back the waiter and change your order to the lobster.

“If the waiter makes a 15% tip on the total bill and you ask him to recommend a dish,” Enthoven asked our health econ class, a glint in his eye, “do you think he’ll recommend the salad or the lobster?”

“And if most of the lunch business in town is in the form of these lunch clubs, do you think you’ll find more restaurants specializing in lobster or in salad?”

I have always found this story to be the best way of explaining how the fee-for-service incentive system drives health inflation – and how it isn’t just the hospitals, or the providers, or the patients who are the problem. It’s everyone.

The second story involves one of the great innovations in the annals of surgery: laparoscopic cholecystectomy, or “lap choley” for short. As you may recall, the old procedure for removing a gall bladder involved an “open cholecystectomy,” a traditional “up to the elbows” surgical procedure. It was a nasty operation: patients stayed in the hospital for a week, recuperated for a month, and ended up with a scar that began in their mid-abdomen and didn’t end till it reached Fresno. The surgery was exquisitely painful, and had a high complication rate and a non-trivial mortality rate. And it was hecka expensive.

In the late ‘80s, along came lap choley, in which the surgeon makes a few inch-long slits in the abdomen, then inserts narrow mechanical arms that can cut and sew while allowing him to monitor the patient’s innards through a tiny camera. With this revolutionary “keyhole” procedure, patients had shorter hospital stays (1-2 days instead of a week), a much shorter convalescence, and a far lower complication rate (and negligible mortality). And costs were reduced by about 25 percent.

This was innovation – the new procedure was safer, less painful, and far less expensive. So what do you think happened to national expenditures for surgical management of gallstone disease after the advent of lap choley?

You know the answer. During my training in the 1980s, we were taught that you only removed a gall bladder containing gallstones when it was infected (“cholecystitis”), unless the patient was diabetic (the much higher complication rate of cholecystitis in diabetics justified prophylactic cholecystectomy). We told all the other patients with known gallstones to avoid fatty foods and to come to the ER promptly if they had severe belly pain, developed a fever, or were mistaken for a pumpkin. Most of these patients ultimately died with their gallbladders still in their abdomens, not the pathology lab.

But lap choley led to “indication creep” – the surgery now seemed benign enough that we began to recommend cholecystectomy for anybody with “symptomatic gallstone disease.” Since everybody ends up with an ultrasound or CT at some point in their life, we find lots of gallstones. Symptomatic? How many people do you know who never have belly pain? Do you? (Perhaps you need your gall bladder out.)

So, whereas technological innovation usually lowers costs in other industries (Exhibit A: Moore’s Law), in healthcare it often raises them as the indications for expensive procedures change faster than the unit price.

Is there a way out of the lap choley conundrum? Perhaps comparative effectiveness research will help – it might tell us precisely which patients will, and won’t, benefit from lap choley. All the usual issues must be navigated.

The expensive lunch club and the story of lap choley are two reasons why our healthcare system consumes 16% of our GDP. Sure, there is waste, greed, and fraud in healthcare, but I find the stories helpful because they illustrate how the actions of perfectly reasonable doctors, patients, and administrators will lead to inexorable inflation if the system isn’t changed in fundamental ways.

That increasingly seems like an awfully big “if”.

Robert Wachter is widely regarded as a leading figure in the modern patient safety movement. Together with Dr. Lee Goldman, he coined the term “hospitalist” in an influential 1996 essay in The New England Journal of Medicine. His most recent book, Understanding Patient Safety, (McGraw-Hill, 2008) examines the factors that have contributed to what is often described as “an epidemic” facing American hospitals. His posts appear semi-regularly on THCB and on his own blog “Wachter’s World,” where this post first appeared.

The Case for Comparative Effectiveness Research

Jul 13, 2009• 7

By RAHUL PARIKH, MD

When I was a kid growing up in Los Angeles, there was this local TV show my dad used to enjoy watching called Fight Back with David Horowitz. Basically, Horowitz, a TV reporter and consumer advocate, used to put the claims a manufacturer made about their products to the test—whether it was if Samsonite luggage could withstand abuse from a Gorilla or Bounty really was the “quicker picker upper,” it was on its show and ended up either endorsed or debunked by it. It was Consumer Reports come to life, if you will—pitting products against one another to see which one was worth putting down some hard earned dollars for.

Now, over 30 years later, we in medicine are just getting around to doing the exact same thing that Horowitz was with retail way back in the 1970s—comparing the claims made by drug and device makers about their products.