Big Data? Read the Fine Print

As you think about claims data, remember what the information captures: the services a healthcare provider delivers to a patient for preventive care or for the diagnosis or treatment of a condition.

This information can be grouped into different cohorts: those getting preventive exams, those receiving particular categories of care, or those seeing specific physicians and/or hospitals for conditions. For example, claims involving a hospital stay can be grouped by diagnosis into what is called a diagnosis-related group (DRG).

However, all claims data is just a collection of medical bills. Medical bills do not give a complete picture of the patient; they omit important information such as the patient’s prognosis. That’s a gap. Thus, it is important to set appropriate expectations on the use of the data.

Number 1 (one of the most important): Avoid the averages 

Most claims data sets are not normally distributed, so averages do not provide relevant information. In most discussions today, employers evaluate the average cost of employees with specific conditions, e.g., diabetes or high blood pressure. This is a flawed approach because spending by employees with various chronic conditions is skewed, thus not really “averageable”. For example, if 90% of an employee population with diabetes is spending $10,000/year and 10% is spending $250,000/year, the average will be a meaningless $34,000/year, a figure that describes no actual employee. All too often, a wild goose chase ensues, when in fact the focus should be on the $250,000 cohort to understand why they were so much more expensive.
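To make the arithmetic concrete, here is a minimal sketch in Python. The population of 100 employees is hypothetical, built only to match the 90%/10% split in the example above; the variable names are illustrative, not taken from any real claims system.

```python
from statistics import mean, median

# Hypothetical population of 100 employees with diabetes, matching the
# example above: 90% spend $10,000/year and 10% spend $250,000/year.
spend = [10_000] * 90 + [250_000] * 10

avg = mean(spend)    # pulled upward by the ten expensive employees
mid = median(spend)  # what the typical employee actually spends

# Share of all dollars that comes from the $250,000 cohort.
high_cost = [s for s in spend if s >= 250_000]
high_cost_share = sum(high_cost) / sum(spend)

print(f"mean:   ${avg:,.0f}")   # mean:   $34,000
print(f"median: ${mid:,.0f}")   # median: $10,000
print(f"share of dollars from the $250,000 cohort: {high_cost_share:.1%}")
```

Nobody in this population spends anything close to $34,000, and about three quarters of the dollars sit with ten people. That is why the distribution, not the average, is the thing to look at.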

Number 2: Follow the money 

A superior use of claims data is to look at distributions of spending. In most plans today, roughly 8% of enrollees are consuming 80% of plan dollars, and these 8% typically change every twelve to eighteen months. (We still run into benefit managers who were unaware of that.) The future belongs to micro-managing these “outliers”, rather than the 92% who spend only 20% of the dollars. If you study those outliers carefully, you will find that only about 7% of their spending possibly would have been preventable, and then only if they faithfully did what their doctors told them to do decades earlier. A cardiologist recently told me that of the patients he has seen with a significant acute blockage, about 25% had no known health risks of any kind…no high blood pressure, cholesterol, diabetes, obesity, no smoking, no genetic predisposition, etc. As such, there is a component of randomness in who gets blocked arteries. The same holds true for cancer. For the other 75%, their physicians have usually counseled them on the importance of exercise and nutrition and the dangers of tobacco use, but to no avail.
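A spending-concentration check of this kind can be sketched in a few lines of Python. The function name `top_share` and the 100-enrollee plan are hypothetical; the dollar amounts were chosen only so that the illustration reproduces the 8%/80% split described above, and they are not real plan data.

```python
def top_share(spend, top_fraction):
    """Return the share of total dollars spent by the top `top_fraction` of enrollees."""
    ranked = sorted(spend, reverse=True)              # most expensive enrollees first
    n_top = max(1, round(len(ranked) * top_fraction)) # size of the outlier group
    return sum(ranked[:n_top]) / sum(ranked)

# Hypothetical plan: 8 outliers at $92,000/year and 92 enrollees at $2,000/year.
plan_spend = [92_000] * 8 + [2_000] * 92

outlier_share = top_share(plan_spend, 0.08)
print(f"top 8% of enrollees account for {outlier_share:.0%} of plan dollars")
```

On real data the input would be one total per enrollee per plan year. Because the outlier group turns over every twelve to eighteen months, the ranking would need to be rerun each period rather than treated as a fixed list of people.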

Number 3: Realize the limitations for quality designations 

Yet another big error is trying to use claims data to determine the best quality doctors. You better be really, really talented to try that one. Why? We are in an era in which many doctors are making their “quality” and “outcomes” look better by referring their most complex and risky patients to someone else. (Much has been written about this.) On the other hand, there are highly effective doctors, who take responsibility for their riskiest patients, but as a consequence score poorly on so-called “quality measures”. The real travesty is that the low scoring doctors ironically may be the most cost-effective and provide the best care.

Number 4: Misdiagnoses are a real cost driver 

Another huge shortcoming of claims data is one that readers of Cracking Health Costs know about. Namely, a large number of patients with complex health problems are simply misdiagnosed – today, that’s about 20% of the outliers in benefit plans, accounting for 18% of claim dollars. Thus, you cannot rely on diagnoses in claims data, and you cannot tell who is getting diagnoses right or wrong – this takes detective work beyond claims data. The Mayo Clinic has published a good article on rates of misdiagnoses. We have sent hundreds of people to the Mayo Clinic for second opinions and can verify by personal experience the truth in that article…same for other clinics we have used for employers. Our first rule in selecting a Center of Excellence is its success in correctly diagnosing patients with complex health problems. Huge amounts of claim dollars are spent on treatments or surgeries that are either completely erroneous or clearly suboptimal. An executive at a Fortune 100 company once said to me that the biggest quality failure in healthcare is to misdiagnose a patient…everything that follows harms the patient.

Number 5: Coding can impact the data analysis

During a data analysis for a very large employer with over 250,000 covered lives, the benefit team told me they had not paid for a solid organ transplant in a number of years. Based on their size, they should have been paying for about 25 a year. After further detective work, we discovered their consultant was using a DRG grouper that coded all transplants as ventilator cases…who knows why…but a huge error. The benefit team had no idea they were really paying for about 25 a year at an average cost over five years of about $1,500,000 each.
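A simple expected-versus-observed check would flag this kind of coding error. The sketch below uses the figures from the story (250,000 covered lives, about 25 transplants expected per year, about $1,500,000 each). Treating the annual count as Poisson-distributed is an assumption made for illustration, not something the analysis above relied on, and the variable names are hypothetical.

```python
import math

covered_lives = 250_000
expected_per_year = 25        # figure from the story above
observed_per_year = 0         # what the miscoded report showed
cost_per_transplant = 1_500_000  # average cost over five years, per the story

# Implied rate: about one transplant per 10,000 covered lives per year.
rate_per_10k = expected_per_year / covered_lives * 10_000

# Assumption: yearly transplant counts are roughly Poisson with mean 25.
# The chance of a true zero in a single year is then exp(-25).
p_zero_in_a_year = math.exp(-expected_per_year)

# Dollars tied to one year's transplants that the report failed to label.
mislabeled_dollars = expected_per_year * cost_per_transplant

print(f"rate per 10,000 lives: {rate_per_10k:.1f}")
print(f"probability of zero transplants in a year: {p_zero_in_a_year:.1e}")
print(f"dollars behind one year of transplants: ${mislabeled_dollars:,}")
```

Under that assumption a genuine zero is effectively impossible for a plan this size, so a reported zero points to the coding, not to the population.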

Number 6: Reversion to the mean

One thing we’ve learned from years of claims analysis of big companies’ benefit programs is that if you have enough life years of data, it all looks about the same, i.e., it reverts to the mean. If the workforce is comparatively older, they will have somewhat more high cost claims.

Tom Emerick and David Toomey are founders of Thera Advisors. Their focus is to help employers maximize their role as the purchaser of healthcare services in working with suppliers to impact their population’s health and to lower costs.

3 Comments
William Palmer MD

I’ve always thought that the comparison between a health plan’s actual caseload of a (well-defined) disease and the theoretic Bayesian-inferred caseload would be a fine index of quality. In other words, we know a population should have roughly 6% diabetics, maybe 9% hypertensives, 1% hemochromatosis, 1% schizophrenics, between .3% and 2% celiac disease, et al…Where are these folks in the plan? Did the plan identify these people? How are they being treated? How are they doing? Naturally, there will be many caveats, but I think this would be a reasonable approach to an estimate of global diagnostic accuracy in a…

Tom Emerick

Nortin, thanks for the comment and questions. My definition of a Center of Excellence (COE) is this. First, the doctors have to be integrated and accountable. That is, someone is observing their work and trying to ensure good medicine is being practiced, not just the most profitable, an advantage the UK’s system has over ours. Further, they have to have established a track record of both getting complex diagnoses right and offering the safest and least invasive way to solve a patient’s health problem. I’ve been sending patients for second opinions to what I call a COE for decades and…

Nortin Hadler

Another highly informative post. Thank you. However, you seem wedded to the notion of a “Center of Excellence” because you believe that these Centers make valid diagnoses and eschew the unnecessary. On what do you base your confidence? Obviously you are choosing “Centers of Excellence” because you consider them to be excellent in performing treatments that are sufficiently costly to justify the exercise of shipping patients away from their local providers. If these “Centers” happen to have a business model that demands a full schedule of lucrative procedures, what makes you certain that they are serving your referred patients more…