In God We Trust. All Others Must Bring Data.

I knew it would happen sooner or later, and earlier this week it finally did.

In 2003 US News & World Report pronounced my hospital, UCSF Medical Center, the 7th best in the nation. That same year, Medicare launched its Hospital Compare website. For the first time, quality measures for patients with pneumonia, heart failure, and heart attack were now instantly available on the Internet. While we performed well on many of the Medicare measures, we were mediocre on some. And on one of them – the percent of hospitalized pneumonia patients who received pneumococcal vaccination prior to discharge – we were abysmal, getting it right only 10% of the time.

Here we were, a billion-dollar university hospital, one of healthcare’s true Meccas, and we couldn’t figure out how to give patients a simple vaccine. Trying to inspire my colleagues to tackle this and other QI projects with the passion they require, I appealed to both physicians’ duty to patients and our innate competitiveness. US News & World Report might now consider us one of the top ten hospitals in the country, I said, but that was largely a reputational contest. How long do you think it’ll be before these publicly reported quality measures factor heavily into the US News rankings? Or that our reputation will actually be determined by real performance data?

The late Israeli prime minister Golda Meir once quipped, “Don’t be humble, you’re not that great.” To me, the launch of the Hospital Compare website was our Golda Meir moment.

And yesterday, when USN&WR unveiled its annual “Honor Roll” of “America’s Best Hospitals” for 2012, our Golda Meir moment came home to roost. UCSF had fallen out of the top 10 for the first time in over a decade – dropping from number 7 to number 13. The headline, though, was that Johns Hopkins Hospital had been pushed off its Number One perch, the coveted (and widely advertised) spot it had held for 21 years.

What happened? When US News launched its Best Hospitals list with its April 30, 1990 issue, the entire ranking (which, then and now, considers only large teaching hospitals with advanced technologies) was based on reputation – a survey of 400 experts in each specialty rated the best hospitals in their field. Was this a measure of quality and safety? Maybe a little. But I’d bet that the rankings had more to do with the prominence of each hospital’s senior physicians, its publications and NIH portfolio, the quality of its training programs, and its successes in philanthropy than with the quality of the care it delivered. While the magazine changed the methodology to include some non-reputational outcome and process data in 1993, the reputational survey remained the most important factor. A 2010 Annals of Internal Medicine study found that the reputational score explained the final overall ranking in more than 90% of cases.

For this year’s ranking, USN&WR changed its methodology again, dialing down the credit for reputation (to just under one-third) and increasing the weight given to other statistical measures of quality, such as risk-adjusted mortality (32.5%), patient safety (things like major post-operative bleeding, 5%), and “other care-related indicators” (such as nurse-staffing, 30%). The result is that the Top Ten list – which had been about as predictable as the one chiseled on Mount Sinai – now offered some drama. In addition to UCSF and Hopkins, others that dropped included Penn (10th to 15th) and Michigan (14th to 17th); Vanderbilt, Stanford, and University of Washington fell off the list (which ends at 17) entirely. Moving up were Mass General (now number one), Pitt (from 12th to 10th), and two hospitals that had previously been off the list (NYU, now 11th, and Northwestern, now 12th).
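To see why a reweighting like this can reshuffle the list, here is a toy calculation. It is only a sketch: the two hospitals and their component scores are made up, the 32.5% reputation weight is my reading of “just under one-third” (the other three weights sum to 67.5%), and the “reputation-heavy” weights are a hypothetical contrast, not the magazine’s earlier formula. US News’s actual scoring is more involved than a simple weighted sum.

```python
# Weights as described for the 2012 methodology. The reputation weight is
# an assumption inferred from "just under one-third".
WEIGHTS_2012 = {
    "reputation": 0.325,
    "mortality": 0.325,
    "patient_safety": 0.05,
    "other_care": 0.30,
}

# Hypothetical reputation-dominated weighting, for contrast only.
WEIGHTS_REPUTATION_HEAVY = {
    "reputation": 0.70,
    "mortality": 0.15,
    "patient_safety": 0.05,
    "other_care": 0.10,
}


def composite(scores, weights):
    """Weighted sum of component scores (each on a 0-100 scale)."""
    assert abs(sum(weights.values()) - 1.0) < 1e-9, "weights must sum to 1"
    return sum(weights[name] * scores[name] for name in weights)


# Made-up hospitals: A has the bigger name, B has the better measured care.
hospital_a = {"reputation": 95, "mortality": 60, "patient_safety": 60, "other_care": 60}
hospital_b = {"reputation": 70, "mortality": 85, "patient_safety": 85, "other_care": 85}

for label, weights in [("reputation-heavy", WEIGHTS_REPUTATION_HEAVY),
                       ("2012 weights", WEIGHTS_2012)]:
    a = composite(hospital_a, weights)
    b = composite(hospital_b, weights)
    leader = "A" if a > b else "B"
    print(f"{label}: A={a:.1f}, B={b:.1f}, leader={leader}")
```

Under the reputation-heavy weights the famous hospital leads; under the 2012 weights the one with better measured performance does, with no change in either hospital’s underlying scores.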

Americans love rankings, and the hospital ranking game has become big business. Ranking is a cash cow for the rankers: when US News publishes a rankings issue (not just of hospitals but professional schools, colleges, and more), it’s a guaranteed hot seller. Rankings are also a big deal for those who get ranked. Hospitals are now ranked by at least half a dozen organizations, including Solucient (recently renamed Truven) and Healthgrades. Even the Joint Commission, which used to limit its work to hospital accreditation, has joined the ranking business. Just in the past two months, two new rankings were released, one by the business coalition The Leapfrog Group and another by Consumer Reports. Add up all the top hospitals lists and you’ll see some 500 American hospitals that can (and do, in billboards large and small) claim to be “One of the Top 100 in the Nation.”

Importantly, the evidence that patients use these rankings to make choices about where to receive their care is limited (Bill Clinton famously chose a “poorly ranked,” on outcomes at least, heart surgeon for his heart bypass operation, based on the surgeon’s reputation). Yet there is no question that good results are touted widely and disappointing results lead to significant soul searching, changes in resource allocation, and even some real improvements. After our Golda Meir moment in 2003, my hospital utterly transformed its approach to quality, safety, and patient experience, and we have made amazing strides (including, you’ll be pleased to know, in our pneumovax rate). Without question, UCSF is a far better hospital today than it was then, and I don’t think that would have happened without public reporting and rankings. The fact that a few of our peer hospitals moved the needle even farther than we did, as reflected in this year’s USN&WR list, will motivate us to do still better.

While some of the energy that rankings create is healthy, there is also a dark side, mostly because today’s quality measures are far from perfect. As the skin in the quality game increases, so too will the unintended consequences. Extra energy and money will go into the problems that feed the rankings, much of it drawn from areas that are just as important but not measured. Just consider all of the attention being lavished on preventing hospital falls and central line infections, safety problems that are not nearly as consequential or common as diagnostic errors (which have received considerably less attention because they’re so hard to measure). Great performance on some measures – like ultra-tight glucose control or the four-hour door-to-antibiotics measure for pneumonia – was ultimately proven to be harmful to patients.

And, as long as many of the outcome measures (such as mortality and readmission rates) are judged based on “observed-to-expected” ratios, hospitals will find it a lot easier to improve their ranking by changing the “expected” number (through changing their documentation and coding) than by actually improving the quality of care. You can bet that every hospital vying for a Top Ten spot is working this angle vigorously (with the aid, of course, of pricey consultants), resulting in something of a coding arms race. Appropriate coding is important and it is worthwhile to truly document our patients’ severity of illness, say by writing “severe sepsis” rather than “sepsis” when it is clinically apt. But this effort to document every co-morbidity and to use words that will trigger higher expected mortality rates can border on Kafkaesque. One consultant recommends that clinicians chart “functional quadriplegia” (yes, it’s got its own ICD-9 code, 780.72) when describing a bedbound patient. I’m sorry, but that’s just silly.
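The arithmetic behind the coding arms race is simple enough to sketch. In the example below, every number is invented and the risk model is a stand-in: I assume “expected” deaths are just the sum of each patient’s predicted risk of death, which is a simplification of how real risk-adjustment models work. The point is only that the ratio falls when documentation pushes predicted risk up, even though the same patients die.

```python
def oe_ratio(observed_deaths, predicted_risks):
    """Observed-to-expected mortality ratio.

    Expected deaths are taken to be the sum of per-patient predicted
    risks (an assumption made for this illustration).
    """
    expected = sum(predicted_risks)
    return observed_deaths / expected


observed = 8  # deaths among 100 hypothetical patients; unchanged by coding

# Before: every patient is coded as moderately ill (5% predicted risk).
risks_before = [0.05] * 100

# After: 40 of the same patients are documented with more comorbidities,
# which raises their predicted risk to 12.5%. Their care is identical.
risks_after = [0.05] * 60 + [0.125] * 40

print(f"O/E before recoding: {oe_ratio(observed, risks_before):.2f}")
print(f"O/E after recoding:  {oe_ratio(observed, risks_after):.2f}")
```

With these made-up figures the ratio drops from roughly 1.6 to roughly 1.0, turning an apparent mortality outlier into an average performer without saving a single life.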

It’s easy to point to the gaming and the potential for unfairness, and to dismiss rankings as a childish and wasteful enterprise, more Reality Show than science. To me, though, the upside far outweighs the downside. Ranking and public reporting do serve to motivate hospitals to take quality and safety seriously, and to invest in systems and people to improve them. The unintended consequences should become less prominent as we develop more robust measures and as we are all forced to report measures the same way – the latter should be a key goal of regulators and an important deliverable for IT vendors. At my hospital, this year’s dip will drive us to redouble our efforts to improve the care we deliver. For patients, that seems like a win.

And as for being #13, well, that still makes us the top ranked hospital within 300 miles of San Francisco. As I said, we healthcare folks are competitive souls.

Robert Wachter, MD, professor of medicine at UCSF, is widely regarded as a leading figure in the patient safety and quality movements. He edits the federal government’s two leading safety websites, and the second edition of his book, “Understanding Patient Safety,” was recently published by McGraw-Hill. In addition, he coined the term “hospitalist” in an influential 1996 essay in The New England Journal of Medicine and is chair-elect of the American Board of Internal Medicine. His posts appear semi-regularly on THCB and on his own blog, Wachter’s World.