Not all Ratings Are Equal: Part II

By John R. Morrow. Read Part I here.

Why are all ratings not equal? Because they are designed for different purposes!
Herein lies the underlying truth behind the many objections posed by organizations being rated. Rightfully so: the Three R’s (ratings, rankings and reviews) of providers must be kept in the context of overall purpose. This is one of the challenges to getting the Three R’s accepted and to making report cards right.

Health care is a big industry to rate and it is going to take more than one blog entry to develop a clearer picture of how best to move forward and embrace ratings systems, but let’s put down some context and history, as it is important to our current day objections and it is instructive to our future direction.

In the Beginning… in a fee-for-service market, before we had enough data to understand the enormous variability of clinical care, and before HCFA first contemplated releasing mortality data, performance measurement was all about financial performance measures. The ratings and rankings were quite simply all about financial and operating ratios, and hospitals were the institutional providers being rated, with the CFO taking the bullet. Thanks to the public debt markets of the municipal bond industry, the hospital industry’s bricks, mortar and technology were mostly financed by long-term tax-exempt municipal bonds. Like most other financial instruments, these bonds are bought and sold in the secondary markets long after the initial raising of capital, in some cases decades later. Because this is a predominantly not-for-profit industry, there exists no statutory reporting of a hospital’s financial results, and thus the Bloomberg terminals used by traders were void of hospital performance data. Secondary bond trading and portfolio surveillance by large bond funds and bond insurers were a real challenge! No current data, no timely ratios, no real-time analysis…plenty of risk for those trading bonds. Sound familiar?

Follow the Data…as anyone trained in health care administration or health policy knows, the requisite information was remarkably parked in an annual filing called the Medicare Cost Report. This reconciliation document was a required submission to the Fiscal Intermediary in each of the Medicare regions around the country. It was a complex document (sort of like a tax return for a hospital) that helped justify the cost of services upon which a hospital based its DRG and other fee-for-service reimbursement. The paper documents included many schedules of detail (not just Medicare measures but hospital-wide performance) as well as the income statement and balance sheet of the submitting hospital. Collecting all 6,000 Medicare Cost Reports would greatly inform one about the US hospital industry… and doing so created today’s billion dollar rating, ranking and review industry.

It Takes Young Turks…like serial entrepreneur George Pillari, the Johns Hopkins University mathematics whiz working with Stephen Renn and Gerry Anderson at the Johns Hopkins School of Public Health, to challenge conventional wisdom and see the need to consolidate the disparate data to meet the needs of the financial markets. The earliest ratings, rankings and reviews of hospitals were produced for the financial markets, and the findings were published as The Distressed Hospital Quarterly, a compendium of the hospitals in the U.S. carrying the highest technical risk of financial insolvency. The company was HCIA, Inc. and the year was 1988.

The Freedom of Information Act… allowed us (I joined HCIA in 1990) to collect mountains of data from CMS, build repositories, then measures, then benchmarks, and then advanced methodologies (case-mix and severity adjustment, risk models and predictive models) to level the playing field and account for appropriate levels of clinical variation in performance. It was the availability of the data, the national denominator of having all Medicare certified providers, the understanding and the models that released the insights about performance that this industry still chases today. And we learned it first: not all providers are equal! It would take large scale analysis, large repositories of clinical, financial and operational data, and a desire to make sense of the cohorts and peer groups among the nation’s hospitals to create the first comprehensive national hospital ratings initiative. It was born as 100 Top Hospitals: Benchmarks for Success, and it grew to include Top 40 Hospitals in the U.K. and Top 20 Hospitals in Spain.

Takeaway #1 – You can’t change what you don’t measure, and you can’t measure without the data and scale. Anyone objecting to the current ratings systems needs to point to better data sources. Building a nationally representative data warehouse with new data sets is not for the faint of heart, yet build it (or release it, in the case of CMS) and it will get used. History has shown this over 25 years.

Different Audience, Different Purpose… made for the most important yet simple discovery of the ratings field, and it is instructive. If you are going to serve multiple (sometimes competing) constituencies with information, you had better be prepared to sit on top of the fence where the barbs are the sharpest. In other words, even then, transparency, objectivity, methodology, perspective and insights had to be balanced with a fair dose of humility and honor. We had “Distressed Hospitals” for the financial sector and “Top Hospitals” for the hospital industry, each with different metrics, methodology and purpose. Our methods were transparent and our purpose and goals clear.

Takeaway #2 – Become better informed if you are being rated; learn about the data and metrics that are being scored and understand the audience for whom the results are intended. For example, the USNews rankings are intended only to identify where the “most serious or complicated care” is recommended by a relatively small survey of physicians…not for routine care or routine complex care. Yes, it is subtle but significant. So St. Vincent Hospital in Worcester and Berkshire Health (both outside Boston) were basically never in the running to begin with. But if they were Academic Medical Centers (as determined by membership in the Council of Teaching Hospitals, or COTH), they would be in the running and would need to understand the methodology. This leads me to Takeaway #3:

Takeaway #3 – The data and methodology should clearly describe the purpose and meet the needs of the target audience in order to be credible. For example, in my opinion USNews has no business selling magazines to consumers based upon a methodology that is overly sensitive to the opinions of a couple hundred doctors per year and so narrowly focused on “the most serious or complicated care” when consumers have no idea what that is, or how to differentiate it from routine complicated care, or routine care, or complex care, or acute care, or critical care…you see my point. Request: If anyone knows how to determine, in advance of needing care, whether you need services from hospitals that have been voted as having the most complicated care, please post. I am also interested in finding any trials that show that providing care to complicated cases results in better overall health outcomes, or improves the handling of complex or routine care, or just ordinary acute care or emergency care.

Both USNews and their “research partner” RTI, Inc. know of these objections, and if you read the public comments on their Web site (here) so do many professionals. USNews in my opinion is not credible, yet they have marketing power to a consumer constituency. They are part of the problem and they are perhaps the best example of the most typical and reasonable objections that many providers have with rating systems. They deliver the wrong message to the wrong audience!

Location, location, location… the three L’s describe the value of real estate, yet the same idea applies to the five W’s of information, ratings and report cards. Who is the audience? What was the purpose? When was the analysis done? Where were they influenced? Why is it relevant? I would like to think that there are a limited number of audiences for this information, but let’s discuss the conflicting interests of the major users and reveal some more pertinent objections and critical needs that are not being met. We’ll look at Purchasers, Providers and Consumers:

Purchasers – This group really goes down two paths: a) health plans and insurers, who represent employers (including the self-insured), and b) Medicare, which represents the interests of all taxpayers to the benefit of its enrollees. Everyone wants the best outcomes at the fairest price…OK, most everyone. The greatest challenge and biggest objection here is that “the wrong things are being measured”. I couldn’t agree more that we haven’t come far enough, yet we are measuring what we have data to measure. In my opinion, for both Medicare and employer-sponsored care we should be most focused on the end outcome of a medical episode…such as the functional status of the patient, not mortality, complications or cost. Why? Because we should want to know that the patient was able to return to society within a reasonable window of time, and that their function (walking around the block, climbing a set of stairs, getting back to work, caring for oneself) was restored. Remember that in the case of the employer, the costs of lost productivity, lost wages, poor efficiency, disability and compensation, and maintaining market competitiveness far exceed the direct medical cost of an episode of care. HCMS Group does a nice job of this for employers and governments. Yet we can’t measure what we don’t have data to measure for everyone. Thus we end up with an array of ratings metrics that monitor compliance with known measures of process rather than known measures of desired outcomes. We would be better off measuring the functional status of patients (using the internationally recognized SF-36 survey instrument) than scoring compliance with certain process measures such as “Heart attack patients given aspirin at arrival”. Shouldn’t measures this simple be subject to a common sense index? Yet the answer when we measure these simple things is “no”: not everyone complies, and as a result this measure will be around until it is “topped out” at 100% compliance.

So, whether we measure performance with process of care measures, such as how frequently we are counseling patients on smoking cessation, or with outcomes, such as how efficient we are at getting a patient back into the work force, providers will be rated, ranked and reviewed upon measures that may not seem relevant to everyone who provides care. The best thing for providers to do to advance their own cause is to get 100% compliant on the current process of care measures while getting active in building the data warehouses for the measures that are most relevant in the longer term. There is no doubt that until providers get out in front of the story, the purchasers’ economic incentives will stand in the way and purchasers will continue measuring the wrong things, or at least they will continue measuring those things for which they have data and resources to analyze.

But it’s administrative claims data! Another common objection to ratings by purchasers is the use of administrative claims data. Most analysts consider this a fair objection, as they are always thrilled to have more granular patient record-level data. Yet it must be said that until administrative data stops producing accurate findings, it will continue to be used as a high-level screening tool. Additionally, since providers can now link to detailed medical record data, false positives are quick to be verified, allowing chart reviewers to move speedily on to the next case or record.

But my patients are different and sicker! Administrative data has also been successfully modeled to adjust for clinical severity and risk, proving what we already know about large data sets… that they speak loudly when properly adjusted for variances in cost, frequency, price, cost of living, population psychographics and demographics. Daniel Gilden lays this out nicely in his posting, where Atul Gawande’s New Yorker essay about McAllen, Texas is best explained empirically by these known variables.

And my favorite: but they only studied patients who had died! That’s right, folks: to produce a credible result, ratings do need to be representative of the population served by the providers and include the successes as well as the failures. And so if you provide primary care in the coal towns of West Virginia or in the high altitudes of Grand Junction, CO, your performance should be appropriately adjusted for population risk.
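
One widely used way to adjust for population risk is an observed-to-expected (O/E) ratio: count the deaths that actually occurred and divide by the deaths a risk model predicted for those same patients. The sketch below is only an illustration under stated assumptions. The function name, the two hospitals and every per-patient risk value are hypothetical, and it does not represent the methodology of HCIA, USNews or any other ratings program mentioned here.

```python
def observed_to_expected(outcomes, expected_risks):
    """Return the O/E ratio: observed deaths divided by expected deaths.

    outcomes       -- list of 0/1 flags, 1 meaning the patient died
    expected_risks -- list of modeled death probabilities, one per patient
                      (hypothetical values; a real model would supply these)
    """
    if len(outcomes) != len(expected_risks):
        raise ValueError("need one expected risk per patient")
    expected = sum(expected_risks)
    if expected == 0:
        raise ValueError("expected deaths must be greater than zero")
    return sum(outcomes) / expected


# Two hypothetical hospitals with the same raw mortality (2 deaths in 10).
outcomes = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

# Hospital with sicker patients: the model expects about 2 deaths.
sicker_risks = [0.4, 0.3, 0.3, 0.2, 0.2, 0.2, 0.1, 0.1, 0.1, 0.1]

# Hospital with healthier patients: the model expects about 1 death.
healthier_risks = [0.1] * 10

sicker_oe = observed_to_expected(outcomes, sicker_risks)
healthier_oe = observed_to_expected(outcomes, healthier_risks)

print(f"Sicker population O/E:    {sicker_oe:.2f}")
print(f"Healthier population O/E: {healthier_oe:.2f}")
```

With these made-up numbers, both hospitals lose 20% of their patients, yet the one serving the sicker population performs as expected (ratio near 1) while the other has about twice the expected deaths. The ratio is only as good as the risk model behind it, which is why the data and the denominator matter so much.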

Takeaway #4 –“God grant me the serenity to accept the things I cannot change; the courage to change the things I can; and the wisdom to know the difference.” The Serenity Prayer. No one will stop the analysis of provider data, but we can always improve upon the standards by which we measure performance.

In my next blog post Part III, I will address the objections and challenges that the Consumer and Provider industry must face to put some order and understanding into the ratings that serve those constituents.

In the interim, I welcome all comments and research findings.

John R. Morrow has founded, created and contributed to a variety of national and international ratings programs, including 100 Top Hospitals: Benchmarks for Success, a Thomson-Reuters product; The Patient Satisfaction Index™, a National Research Corporation product; and The Hospital Value Index™, a Press Ganey & Associates property. He is currently in Beta with Distinguished Doctor™, a new doctor profiling initiative. Morrow was a Principal at HCIA/Solucient, CEO of CHKS Ltd U.K. and SVP at HealthGrades, and is Principal at The Ratings Guy LLC. John welcomes all comments.

2 replies »

  1. Mr. John R. Morrow has started a very good and interesting topic, and it is true that health care is a very big industry to rate. But sometimes the way people rank a doctor or clinic is ludicrous, because they decide the ranking by the doctor’s fee structure. People think that if a doctor charges more for treatment, then that is the best doctor. This is not the way to rank a doctor. There should be a few different criteria, like his attention to your health, the movement of his hands during treatment and how much time he gives you. These are a few of the criteria on which people can base the ranking of a doctor.

  2. “And my favorite, but they only studied patients who had died! ”
    I’m glad you brought this up, since I am having a bit of a problem understanding these studies. Let’s assume the following two scenarios:
    Hospital A – Admitted 200 patients with similar conditions. 50 died at the hospital and 50 more died a couple of months later at home. 100 survived and are doing well. Hospital spent $2 on each patient, dead or alive.
    Hospital B – Admitted 200 patients with same conditions as Hospital A. 50 died at the hospital, 100 more died a couple of months later at home. 50 survived and are doing fine. Hospital spent $1 per patient, again, dead or alive.
    Retrospective studies on dead people will show Hospital A as spending twice as much as Hospital B for end of life care, with the same percentage of hospital deaths (not sure they even measure this). Therefore Hospital A is wasteful and Hospital B is efficient.
    Which hospital would you rather go to?
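
The arithmetic in this two-hospital scenario can be worked through in a few lines. The sketch below uses only the figures given in the comment; the class and property names are hypothetical, chosen for illustration. It contrasts what a decedent-only study sees (spending per patient who died, in-hospital death rate) with what the full admitted population shows (survival, cost per survivor).

```python
from dataclasses import dataclass


@dataclass
class Hospital:
    name: str
    admitted: int
    died_in_hospital: int
    died_later: int
    cost_per_patient: float  # spent on every patient, dead or alive

    @property
    def decedents(self) -> int:
        return self.died_in_hospital + self.died_later

    @property
    def survivors(self) -> int:
        return self.admitted - self.decedents

    @property
    def in_hospital_death_rate(self) -> float:
        return self.died_in_hospital / self.admitted

    @property
    def spend_per_decedent(self) -> float:
        # What a study restricted to patients who died would report.
        return self.cost_per_patient * self.decedents / self.decedents

    @property
    def survival_rate(self) -> float:
        return self.survivors / self.admitted

    @property
    def cost_per_survivor(self) -> float:
        # Total spending on all admissions divided by patients still alive.
        return self.cost_per_patient * self.admitted / self.survivors


hospital_a = Hospital("Hospital A", 200, 50, 50, 2.0)
hospital_b = Hospital("Hospital B", 200, 50, 100, 1.0)

for h in (hospital_a, hospital_b):
    print(
        f"{h.name}: in-hospital deaths {h.in_hospital_death_rate:.0%}, "
        f"spend per decedent ${h.spend_per_decedent:.2f}, "
        f"survival {h.survival_rate:.0%}, "
        f"cost per survivor ${h.cost_per_survivor:.2f}"
    )
```

Looking only at the patients who died, Hospital A spends twice as much with the same in-hospital death rate. Looking at everyone admitted, Hospital A has twice the survival rate, and the cost per surviving patient works out the same at both hospitals.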