By KIP SULLIVAN JD
The Hospital Readmissions Reduction Program (HRRP), one of numerous pay-for-performance (P4P) schemes authorized by the Affordable Care Act, was sprung on the Medicare fee-for-service population on October 1, 2012 without being pre-tested and without any other evidence indicating what it is hospitals are supposed to do to reduce readmissions. Research on the impact of the HRRP conducted since 2012 is limited even at this late date, but the research suggests the HRRP has harmed patients, especially those with congestive heart failure (CHF). (CHF, heart attack, and pneumonia were the first three conditions covered by the HRRP.) The Medicare Payment Advisory Commission (MedPAC) disagrees. MedPAC would have us believe the HRRP has done what MedPAC hoped it would do when they recommended it in their June 2007 report to Congress (see the discussion of that report in Part I of this two-part series). In Chapter 1 of their June 2018 report to Congress, MedPAC claimed the HRRP has reduced 30-day readmissions of targeted patients without raising the mortality rate.
MedPAC is almost certainly wrong about that. What is indisputable is that MedPAC’s defense of the HRRP in that report was inexcusably sloppy and, therefore, not credible. To illustrate what is wrong with the MedPAC study, I will compare it with an excellent study published by Ankur Gupta et al. in JAMA Cardiology in November 2017. Like MedPAC, Gupta et al. reported that 30-day CHF readmission rates dropped after the HRRP went into effect. Unlike MedPAC, Gupta et al. reported an increase in mortality rates among CHF patients. 
We will see that the study by Gupta et al. is more credible than MedPAC’s for several reasons, the most important of which are: (1) Gupta et al. separated in-patient from post-discharge mortality, while MedPAC collapsed those two measures into one, thus disguising any increase in mortality during the 30 days after discharge; (2) Gupta et al.’s method of controlling for differences in patient health was superior to MedPAC’s because they used medical records data plus claims data, while MedPAC used only claims data.
I will also discuss research demonstrating that readmission rates have not fallen once the increase in observation stays, and readmissions following observation stays, are taken into account, and that some hospitals are more willing than others to substitute observation stays for admissions and thereby escape the HRRP penalties.
All this research taken together indicates the HRRP has given CHF patients the worst of all worlds: No reduction in readmissions but an increase in mortality, and possibly higher out-of-pocket costs for those who should have been admitted but were assigned to observation status instead.
My analysis of the MedPAC and Gupta studies will also illustrate the need for the new rule I proposed in Part I of this series: MedPAC should not propose, and Congress should not authorize, payment “reforms” that have not been subjected to a controlled trial. Gupta et al. reached the same conclusion. They closed their article with this admonition: “Our study is … a reminder that, like drugs and devices, public health policies should be tested in a rigorous fashion – most preferably in randomized trials – before their widespread adoption” (p. 52).
We saw in Part I that CMS implemented a version of the HRRP that was even worse than the vague, evidence-free version MedPAC proposed in Chapter 5 of their June 2007 report to Congress. The MedPAC version was bad enough:
- It would use crude software developed by 3M, or another algorithm like it, to determine which post-discharge admissions were “clinically related” to earlier targeted admissions (CHF etc.).
- It would rely on crude risk adjustment of readmission rates using only claims data.
- It would not make resources available to hospitals to finance interventions that might reduce readmissions.
The version of the HRRP approved by CMS was even worse than MedPAC’s because it included no mechanism, not even the crude 3M software, to determine whether an admission within 30 days of a discharge was “related” (“clinically” or otherwise) to the original admission. The version CMS authorized simply treated every “unplanned” admission within 30 days of a discharge as a “readmission.” According to MedPAC’s June 2018 report to Congress, CMS’s definition of “unplanned” excludes only 5 percent of all readmissions (p. 14).
As if that weren’t bad enough, CMS authorized a risk-adjustment method that relied solely on claims data created in the preceding year, and which CMS decided, for no good reason, could be applied to pools of patients as tiny as 25. A risk-adjustment algorithm that crude has such a poor signal-to-noise ratio that hospitals cannot make heads or tails of the feedback they get from CMS – punishment (if their rate is above the national average) or the absence of punishment (hospitals with readmission rates below the national average are not penalized). As McIlvennan et al. put it in an article in Circulation, “[T]he retrospective nature of diagnosis ascertainment, the relatively poor performance of readmission risk models, and the wide range of causes of readmission have limited the ability of hospitals to target patients at highest risk with tailored interventions” (p. 1797). Or, as I like to put it, even a rat in a Skinner box needs accurate feedback to know which lever to push to get a food pellet.
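A back-of-the-envelope simulation shows just how noisy a rate measured on a 25-patient pool is. All numbers here are hypothetical – I assume a 20 percent "true" readmission rate shared by every hospital – but the arithmetic illustrates why penalties assessed on pools this small are largely luck:

```python
import math
import random

# Hypothetical true readmission rate, identical at every hospital.
p_true = 0.20
pool = 25  # smallest patient pool CMS allows for penalty calculations

# Standard error of a rate observed on a pool this small:
# sqrt(p(1-p)/n) = sqrt(0.2 * 0.8 / 25) = 0.08, i.e. +/- 8 percentage points.
se = math.sqrt(p_true * (1 - p_true) / pool)
print(f"standard error of observed rate: {se:.3f}")

# Simulate 1,000 hospitals with identical quality. Purely by chance,
# a large share land above the "national average" (the true rate)
# and would therefore be penalized.
random.seed(0)
rates = [sum(random.random() < p_true for _ in range(pool)) / pool
         for _ in range(1000)]
penalized = sum(r > p_true for r in rates)
print(f"identical hospitals penalized by chance alone: {penalized} of 1000")
```

With an 8-percentage-point standard error around a 20 percent rate, the difference between "above average" and "below average" is mostly sampling noise, which is exactly the feedback problem McIlvennan et al. describe.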
Research on the HRRP done over the last three or four years consistently reaches two conclusions: The 30-day readmissions of conditions targeted by the HRRP dropped after October 1, 2012 (the start date of the HRRP), and hospitals that serve a disproportionate share of the poor are far more likely to be penalized. These findings created a growing concern that some hospitals might be refusing to admit some patients within the 30-days-after-discharge period who should be admitted. The concern had become so widespread by 2016 that Congress required MedPAC to conduct a study to determine whether the drop in readmission rates had been offset by increased emergency room visits or observation stays (this mandate was written into the 21st Century Cures Act of 2016). Chapter 1 of MedPAC’s June 2018 report to Congress was MedPAC’s response to that mandate.
In the remainder of this article, I will describe the conclusions reached by Gupta et al. and MedPAC, and the main reasons why those conclusions differed.
Conclusions reached by Gupta et al. and MedPAC
In November of 2017 Ankur Gupta and 10 other experts in cardiovascular medicine published an article in JAMA Cardiology entitled, “Association of the Hospital Readmissions Reduction Program implementation with readmission and mortality outcomes in heart failure.” The authors were affiliated with well-known universities, and three of them were also editors of JAMA Cardiology. Their research was financed by grants from the NIH and Get With the Guidelines-Heart Failure (GWTG-HF), a “voluntary quality improvement program” sponsored by the American Heart Association.
Gupta et al. examined data on 30-day CHF readmission and mortality rates for the years 2006 to 2014. They divided this period into three segments – a pre-HRRP implementation phase (January 1, 2006 to March 31, 2010), the implementation phase (April 1, 2010 to September 30, 2012), and the HRRP penalty phase (October 1, 2012 to December 31, 2014). They found that 30-day readmissions fell during the penalty phase, while mortality rose slightly during the implementation phase and substantially during the penalty phase. The 30-day risk-adjusted mortality rate rose from 7.2 percent before HRRP implementation to 8.6 percent after. In an interview, co-author Gregg Fonarow stated, “If we were to extrapolate this to all Medicare beneficiaries hospitalized with heart failure, we are talking about maybe 10,000 patients a year with heart failure losing their lives as a consequence of this program.”
MedPAC, on the other hand, examined the period 2008 to 2016, and they lumped inpatient mortality together with 30-day post-discharge mortality to derive one rate they called the “in-hospital through 30 days post-discharge” mortality rate. They reported that both the risk-adjusted readmission rate and the lumped-together mortality rate fell throughout the entire 2008-2016 period, and that the rate of decline in readmissions accelerated slightly after 2012. They also reported finding no correlation at the individual hospital level between their crudely adjusted readmission rate and their crudely adjusted, lumped-together mortality rate.
Reasons why MedPAC’s analysis is less credible than Gupta et al.’s
Reason number one: As I just noted, MedPAC lumped post-discharge mortality rates with in-patient mortality rates. Their justification for this was, “[W]e believe looking at the combination of inpatient and post-discharge mortality will reduce problems that can be caused by a shift in the site of mortality (for example, from the inpatient setting to hospice, which may have the effect of increasing post-discharge mortality)” (p. 13). This argument is not persuasive. What “problems” (plural) are “reduced”? The argument seems to go like this: If patients are so sick at discharge they need to be in a hospice, then they should be removed from the study because, well, their deaths within 30 days of discharge would raise the 30-day mortality rate and make the HRRP look bad. Gupta et al. reported an increase in discharges to hospice and home after the HRRP was implemented, and that even when they excluded discharges to hospices the increase in one-year mortality (but not 30-day mortality) persisted.
Regardless of the rationale for reporting a lumped-together mortality rate, MedPAC should have reported the two rates separately as well and let readers decide which rates are more useful. But they didn’t, and so it is impossible to divine from their report whether readmission rates are correlated with mortality rates in any way.
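A bit of arithmetic shows how a lumped-together rate can hide a post-discharge increase. In this sketch the post-discharge rates are Gupta et al.’s reported 7.2 and 8.6 percent; the inpatient rates are hypothetical, chosen only to illustrate the mechanism (improving inpatient care masking worsening post-discharge outcomes):

```python
# 30-day post-discharge mortality: Gupta et al.'s reported figures.
post_before, post_after = 0.072, 0.086   # rose under the HRRP

# Inpatient mortality: hypothetical values, assumed to have improved.
inpt_before, inpt_after = 0.050, 0.030

# A patient counts in the combined rate if they die in the hospital, or
# survive the stay and then die within 30 days of discharge.
combined_before = inpt_before + (1 - inpt_before) * post_before
combined_after  = inpt_after  + (1 - inpt_after)  * post_after

print(f"post-discharge mortality: {post_before:.1%} -> {post_after:.1%} (up)")
print(f"lumped-together mortality: {combined_before:.1%} -> {combined_after:.1%} (down)")
```

Under these assumptions the post-discharge rate rises from 7.2 to 8.6 percent, yet the lumped-together rate falls (from about 11.8 to 11.3 percent) – which is why reporting only the combined rate tells us nothing about what happened after discharge.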
Reason number two: MedPAC’s method of risk adjusting readmission and mortality rates relied on claims data over the year prior to admission, which means they relied at most on a few diagnoses (plus age and sex, which predict almost nothing), whereas Gupta et al. used both claims data and data typically found only in medical records. Gupta et al.’s additional data, collected routinely in accord with standards developed by the GWTG-HF program, included the presence of ten co-morbidities (previous stroke, for example) acquired at any time (not just in the past year) as well as data typically collected on CHF patients such as ejection fraction (a measure of the heart’s ability to pump blood) and blood pressure. These patient-level data, which were collected at admission (“prospectively,” as the authors put it), were available to the authors in a registry maintained by the GWTG-HF program. This additional data substantially improved the accuracy of Gupta et al.’s risk adjuster compared with MedPAC’s, not only because medical records plus claims data is more predictive of outcomes than claims data alone, but because the additional medical records information is much less susceptible to the equivalent of upcoding, that is, making patients look sicker than they are. Moreover, the people who collected the medical data did not have an incentive to make patients look sicker, which is not true of providers subjected to the HRRP and other P4P schemes.
Finally, I’ll mention a third, less important reason to suspect MedPAC’s conclusions. The graph MedPAC displayed showing falling (lumped-together) mortality rates (Figure 1.11, p. 25) itself suggests the HRRP did aggravate mortality rates. It shows that mortality for CHF patients fell faster during the four years prior to the start of the HRRP (2008-2012) than during the first four years of the program (2012-2016): the CHF mortality rate fell by 17.6 percent during 2008-2012 but by only 16.1 percent during the next four years. If the mortality rates had been more accurately risk adjusted, the figure might have shown a more substantial slowdown in the decline in MedPAC’s lumped-together mortality rates after 2012.
It gets worse.
But even Gupta et al.’s study may have underestimated the damage done by the HRRP. Gupta et al., like nearly all others who have investigated the impact of the HRRP on readmissions and mortality, did not take into account admissions that occur within 30 days after observation stays. Observation stays have soared over the last decade. They were supposed to be for patients who aren’t well enough to be discharged but whose symptoms don’t clearly warrant inpatient care. Studies suggest, however, they are now being used in some cases to avoid admitting patients who should be admitted.
Sabbatini and Wright wondered what the 30-day readmission rate would look like if observation stays were added to admissions, and if unplanned admissions within 30 days after observation stays were added to readmissions. In an article published in the New England Journal of Medicine in May 2018, they reported that defining admissions and readmissions this way virtually eliminated the downward trend in readmissions (as readmissions are conventionally defined). Here more specifically is what they found. Using data from 350 insurance companies on emergency room visits and 30-day readmissions following all discharges (not just discharges for HRRP-targeted conditions) for adults over the period 2007 to 2015, they reported that the readmission rate (conventionally defined) declined by 13 percent (from 17.8 percent to 15.5 percent) while the rate of readmission after an observation stay shot up by 36 percent (10.9 percent to 14.8 percent). When they added observation stays to admissions (the denominator) and readmissions after observation stays to readmissions (the numerator), they found “virtually no change in all-cause readmissions” (p. 2063).
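A stylized sketch, using entirely hypothetical counts, shows the substitution mechanism at work. The HRRP numerator counts only inpatient readmissions; a broader “revisit” numerator also counts unplanned returns handled as observation stays. (This sketch is simpler than the full redefinition described above, which also adds observation stays to the denominator as index events, but it shows why the conventional measure can fall with no real change in how many patients come back to the hospital.)

```python
# Hypothetical hospital: 1,000 index CHF discharges, 180 patients
# returning to the hospital within 30 days.
index_discharges = 1000
returns_within_30d = 180

# Scenario A: every returning patient is readmitted as an inpatient.
readmits_a, obs_returns_a = 180, 0
# Scenario B: 100 of those returns are placed on observation status instead.
readmits_b, obs_returns_b = 80, 100

conventional_a = readmits_a / index_discharges               # counts inpatient readmissions only
conventional_b = readmits_b / index_discharges
revisit_b = (readmits_b + obs_returns_b) / index_discharges  # counts observation-stay returns too

print(f"conventional readmission rate: {conventional_a:.0%} -> {conventional_b:.0%}")
print(f"revisit rate, scenario B: {revisit_b:.0%} (no real improvement)")
```

Under these assumed counts the conventional rate plunges from 18 percent to 8 percent even though exactly as many patients return to the hospital; only the label on the return visit changed.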
If CHF admissions and observation stays followed the same pattern, that would suggest that one reason Gupta et al. found lower readmission rates accompanied by higher CHF mortality rates is that patients who should have received inpatient care were placed on observation status instead and sent home too early, or in some other way given substandard care.
A study published in the October 2018 Health Affairs helps us refine this hypothesis. The study suggests that only some hospitals responded to the HRRP by misusing observation stays. These hospitals were (surprise, surprise) more likely to be hospitals that managed to keep their (crudely risk-adjusted) readmission rates below the national average and thereby escape the HRRP’s penalties.
We have now reviewed the entire history of MedPAC’s happy-go-lucky promotion of readmission rates as a quality measure. We saw in Part I that MedPAC recommended the HRRP to Congress without a shred of evidence indicating the HRRP was safe and effective, and with total disregard for the costs of interventions hospitals would deploy in an effort to reduce readmissions. We have seen in this second part that when MedPAC was asked by Congress in 2016 (four years after the HRRP penalties kicked in) to review their handiwork, they defended it with sloppy research. We have seen that more credible research indicates the HRRP has failed to reduce readmissions of CHF patients substantially or at all, and has probably increased mortality rates.
We have also seen in this two-part series how difficult it is to evaluate the HRRP after the fact – after it has been inflicted on the majority of acute care hospitals. We have seen how hard it is to adjust readmission and mortality rates for factors outside hospitals’ control, the most important being patient health and other factors, like patient income, that affect health. Upcoding and misuse of observation stays, often referred to as “gaming,” are other confounders I have discussed here. And despite the length of this two-part series, I have said nothing at all about other significant confounders, such as rising rates of emergency room visits, “teaching to the test” (moving resources away from patients not targeted by the HRRP to patients who are), and improvements in inpatient care that might lead to discharging a sicker pool of patients that could in turn drive up readmission rates.
Controlling for all factors that might influence the success of a drug, medical device, or health policy is never easy, but it is always easier to do before the drug, device or program is unleashed on an entire population or a large portion of a population. Once this happens, finding a natural control group to compare with an experimental group is impossible or at best very difficult.
Let me conclude with an outline of the steps MedPAC should have taken prior to recommending the HRRP to Congress. When MedPAC recommended the HRRP a decade ago, no studies supported MedPAC’s claim that more “coordination” – setting up appointments before discharge, reconciling medications, giving “action plans” to patients – would lower readmission rates. Step one, then, should have been for MedPAC to operationalize the vague notion of “coordination,” that is, define the interventions that hospitals could carry out that might reduce readmissions for defined groups of patients (such as CHF patients), and then recommend to Congress that CMS conduct a test of these interventions with controlled trials (hospitals in the experimental group should be given the resources to pay for the interventions being tested). Once a half-dozen such interventions had been tested, step two could be undertaken. Step two would be to test the HRRP (using a risk-adjustment method like the one Gupta et al. used, not CMS’s crude adjuster) on a group of hospitals, using another group of hospitals not subjected to the HRRP as a control group. Only if the HRRP passed this test with flying colors could MedPAC go to step three – recommending the HRRP to Congress.
In the meantime, CMS should terminate the HRRP program and not resurrect it until the program has been shown by controlled trials to be safe and effective.
 Here are examples of statements from the peer-reviewed literature indicating how little we know about the impact of the HRRP years after it was implemented. “Despite the importance of readmissions, there has been little study of the effect of the [HRRP] program” (Zuckerman et al., in a 2016 article in the New England Journal of Medicine). “[T]he association between the HRRP implementation and mortality is not known” (Gupta et al., in a 2017 article in JAMA Cardiology).
 In its June 2018 report, MedPAC described the article by Gupta et al. as its main competitor. “[T]he primary article contending that the HRRP may have resulted in an increase in risk-adjusted mortality continues to be the article by Gupta and colleagues,” they wrote (p. 12).
 CMS’s risk adjustment algorithm for the HRRP program looks for any of 31 diagnoses on claim forms (see link to Zuckerman et al. in footnote 1 above and to Barnett et al. below), but because the search is limited to the 12 months prior to the admission, diagnoses that are more than a year old don’t affect the adjustment. The small amount of data that CMS uses in its risk adjuster has been criticized. Barnett et al., for example, offered this criticism in a 2015 article in JAMA Internal Medicine: “In setting an expected readmission rate for each hospital, the Centers for Medicare and Medicaid Services … adjusts only for patients’ age, sex, discharge diagnosis, and diagnoses present in claims during the 12 months prior to admission. This limited adjustment has raised concerns that hospitals may be penalized because they disproportionately serve patients with clinical and social characteristics that predispose them to hospitalization or rehospitalization.”
 The lead author, Gupta, is on the Harvard Medical School faculty. The other authors are from Duke, Northwestern, Stanford, UCLA, and the University of Colorado. The three authors who were also on the JAMA Cardiology staff were not involved in the decision to accept the manuscript.
 Gupta et al. also examined one-year readmission and mortality rates and reported results similar to the 30-day results: Readmissions fell and mortality rose.
 Gupta et al. did not attempt to determine how much more accurate their risk adjuster is than CMS’s. But another study by Michael Barnett et al. that used medical records data plus socio-economic data gives us some idea of how crude CMS’s risk adjuster is. Barnett et al. collected that additional data on all Medicare enrollees admitted to a hospital within 30 days after a discharge for any reason that wasn’t planned. These data included information on 29 indicators of patient health and socio-economic status that CMS does not use. The authors found that 22 of the 29 indicators raised the accuracy of CMS’s risk adjuster and, more importantly, that 17 of these 22 were distributed differently among hospitals. They divided hospitals into quintiles according to their readmission rates as measured by CMS’s crude risk-adjustment method, and determined that “participants admitted to hospitals in the highest quintile” suffered much worse health and socio-economic status. These patients had “more chronic conditions, less education, fewer assets, worse self-reported health status, more depressive symptoms, worse cognition, worse physical functioning, and more difficulties with ADLs and IADLs than participants admitted to hospitals in the lowest quintile.” In addition, they had higher scores on the risk adjuster CMS uses for adjusting payments to Medicare Advantage plans. Adding these 22 indicators cut the difference between the readmission rates of the top and bottom quintiles in half.
Whether Gupta et al.’s risk adjuster is twice as accurate as CMS’s is impossible to say. It is unquestionably far more accurate. Gupta et al.’s sample size was much smaller than MedPAC’s – 115,000 versus 3 million. But given the myriad factors that could confound studies of post-discharge mortality rates, MedPAC’s larger sample size can’t trump Gupta et al.’s much richer data set on those factors. We see here a common defect of many big databases: the data are unaccompanied by information on confounders.
 Two other types of readmissions targeted by the HRRP during its first few years, heart attack and COPD, revealed the same pattern – a faster drop in mortality prior to the start of the HRRP than after. The drop in the four years prior to 2012 for heart attack was 14.6 percent versus just 11.9 percent in the four years after January 1, 2012. The analogous numbers for COPD were 18.1 and 15.6. Only the pneumonia rates behaved according to MedPAC’s expectations: They fell slightly faster in the 2012-2016 period (23.1 percent) than during the 2008-2012 period (20.3 percent).
 Sabbatini and Wright discussed the possibility that their finding of no decline in readmissions (redefined to take observation stays into account) might be explained by a sudden increase in the severity of illness of patients arriving in American ERs over the short time period they studied. If that were the case, it might suggest that sicker patients explain the greater use of observation stays. That seems unlikely. In any event, Gupta et al. reported no change in the illness severity of CHF patients over the pre- and post-HRRP implementation periods.