The finding reported in this week’s New England Journal of Medicine paper, that azithromycin might cause cardiovascular death, is not news to electrophysiologists tasked with choosing antibiotics for patients with long QT syndrome or for those who take other antiarrhythmic drugs. Heck, even the useful Arizona CERT QTDrugs.org website could have told us that.
What was far scarier to me, though, was how the authors of this week’s paper reached their estimates of the magnitude of azithromycin’s cardiovascular risk.
Welcome to the underworld of Big Data Medicine.
Careful review of the Methods section of this paper reveals that the subjects were “persons enrolled in the Tennessee Medicaid program” and that the data collected were “Computerized Medicaid data, which were linked to death certificates and to a state-wide hospital discharge database” along with “Medicaid pharmacy files.” Subjects were anyone prescribed azithromycin from 1992 to 2006 who had “not had a diagnosis of drug abuse or resided in a nursing home in the preceding year and had not been hospitalized in the prior 30 days.” They also had to be “Medicaid enrollees for at least 365 days and have regular use of medical care.”
Hey, no selection bias introduced with those criteria, right? But the authors didn’t stop there.
This study used “matched” control periods in which no antibiotics were prescribed, “frequency-matched according to a propensity score that was calculated from 153 covariates.” (Editor’s note: No doubt there are no more covariates in medicine than the 153 they studied.)
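For the uninitiated: a propensity score is just a model’s predicted probability that a patient got the drug, given the measured covariates, and “frequency matching” means sampling control periods so each score bin has the same treated-to-control mix. Here’s a toy sketch of the idea in Python; the data and variable names are invented by me and have nothing to do with the paper’s actual code:

```python
# Toy sketch of propensity-score estimation and frequency matching.
# Everything here is invented for illustration; this is not the paper's code.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n, k = 10_000, 153
X = rng.normal(size=(n, k))             # 153 measured covariates, all fake
treated = rng.integers(0, 2, size=n)    # 1 = azithromycin course, 0 = control period

# Propensity score: the model's estimate of P(treated = 1 | covariates).
ps = LogisticRegression(max_iter=1000).fit(X, treated).predict_proba(X)[:, 1]

# "Frequency matching": rather than pairing individual patients, bin the
# scores and sample control periods so each bin holds as many controls
# as treated courses.
edges = np.quantile(ps, np.linspace(0, 1, 11))   # decile bin edges
bin_id = np.digitize(ps, edges[1:-1])
matched_controls = []
for b in range(10):
    in_bin = bin_id == b
    n_treated = int(treated[in_bin].sum())
    pool = np.flatnonzero(in_bin & (treated == 0))
    pick = rng.choice(pool, size=min(n_treated, pool.size), replace=False)
    matched_controls.append(pick)
matched_controls = np.concatenate(matched_controls)
```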
Then, as if finally to admit a smidge of bias in their study design: “to attempt to control for confounding by indication, we also included as additional control groups” patients who took three other antibiotics. As if THAT will fix the data fields erroneously entered or neglected in the interlinked retrospective databases.
But why focus on the details? The authors had something to prove!
So they processed and pureed the data and checked “for misspecification of the propensity-score regression models” by evaluating “whether the covariate distributions were balanced across study groups.” In other words, they made sure the data behaved the way they thought it should.
Hey, why not?
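In practice, that balance check usually means computing a standardized mean difference for each covariate between the exposure groups and declaring anything under about 0.1 “balanced.” A toy version, again mine and not theirs:

```python
# Toy covariate-balance check using standardized mean differences (SMDs).
# Data and names are invented; values under ~0.1 are conventionally "balanced."
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(size=(10_000, 153))          # fake covariates
treated = rng.integers(0, 2, size=10_000)   # fake exposure indicator

def smd(a, b):
    """Standardized mean difference: gap in means over the pooled SD."""
    pooled_sd = np.sqrt((a.var(ddof=1) + b.var(ddof=1)) / 2)
    return (a.mean() - b.mean()) / pooled_sd

smds = [smd(X[treated == 1, j], X[treated == 0, j]) for j in range(X.shape[1])]
print(f"max |SMD| across 153 covariates: {max(abs(s) for s in smds):.3f}")
```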
Finally, at the end, they “estimated” (their word, not mine) the difference between the cumulative incidence of cardiovascular death during a 5-day course of azithromycin and the incidence during a similar period of amoxicillin use.
Never mind that they admitted in their discussion that “as many as 25% of patients would be misclassified as having died from cardiovascular causes” and that they “cannot establish a specific causal mechanism.”
To think that, despite all of these confounding factors, the authors had the balls to state that as compared with amoxicillin “there were 47 additional deaths per 1 million courses of azithromycin therapy; for patients with the highest decile of baseline risk of cardiovascular disease, there were 245 additional cardiovascular deaths per 1 million courses” is ridiculous. Seriously, after all that manipulation of the data, are they really capable of defining a magnitude to three significant digits out of a million of anything?
Please.
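For scale, here’s what those headline figures work out to in absolute terms. The numbers are the paper’s; the arithmetic is mine:

```python
# Absolute-risk arithmetic behind the headline figures (numbers from the paper).
excess_overall = 47 / 1_000_000        # excess CV deaths per course, overall
excess_top_decile = 245 / 1_000_000    # excess per course, highest-risk decile

print(f"overall:    {excess_overall:.4%} per 5-day course "
      f"(about 1 in {round(1_000_000 / 47):,})")
print(f"top decile: {excess_top_decile:.4%} per 5-day course "
      f"(about 1 in {round(1_000_000 / 245):,})")
# overall:    0.0047% per 5-day course (about 1 in 21,277)
# top decile: 0.0245% per 5-day course (about 1 in 4,082)
```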
But we should not dwell on these details, should we? After all, this work was published in the journal with the largest impact factor out there: the infamous New England Journal of Medicine. No doubt we can look for more high quality retrospective database reviews in the years ahead as Big Data Medicine takes hold.
Westby G. Fisher, MD, (aka Dr. Wes) is a board certified internist, cardiologist and cardiac electrophysiologist practicing at NorthShore University HealthSystem in Evanston, IL. He is also a Clinical Associate Professor of Medicine at the University of Chicago’s Pritzker School of Medicine. He blogs at Dr.Wes, where this post originally appeared.
Very interesting. I would have to agree with the comment above: further research is needed. Perhaps the problem is the research itself? Or Dr. Wes’s agenda? I don’t know. Certainly these statistics shouldn’t be taken to heart without further analysis.
I completely agree with Dr. Wes’s words, but azithromycin is not the only medicine that is harmful to health; the list of harmful medicines is endless. Just as azithromycin harms the cardiovascular system, many medicines harm your oral health, and most of those cause staining of the teeth. People cannot easily predict which medicine is useful and which is harmful because the harm does not appear quickly, so action is needed against these kinds of harmful medicines.
Oh, and I forgot to mention that there is a primer that Anonymous above is looking for — http://betweenthelines-book.com
There are always problems with observational data. At the same time, there are always problems with interventional data as well. My own reading of the study makes me think that the authors did a really great job trying to disentangle the effect of azithro from other potential reasons for death, such as severity of illness, etc. The methods they used are well accepted, and they applied them in a valid way. I agree that there does not appear to be a selection bias, but there is a potential issue of generalizability, if you think that the Medicaid population is not representative physiologically of the rest of the population with an opportunity to be exposed to azithromycin. As for “estimate,” that is what all calculations are, usually point estimates with some kind of a confidence interval around them.
I blogged about the study here (http://evimedgroup.blogspot.com/2012/05/why-i-have-propensity-to-believe.html), and my view of it was quite different.
I have to agree with the above commenters. The tone of this post makes me wonder if in fact Dr. Wes has some unspoken agenda of his own. So come out and state exactly what you are trying to say, please!
For those non-researcher/physicians, it would be very helpful to have a brief primer on statistical research and what makes a study robust or not. Also, distinguishing between RCT (randomized clinical trials) and Observational studies, and the differences in the level of rigor for each, would be helpful.
I am also a little unclear as to the role of ‘Big Data Medicine’. Isn’t this the same type of observational or retrospective review that has always been done when a prospective trial could not reasonably be?
I think it’s reasonable to treat all antibiotics as if they potentially cause a fatal heart arrhythmia. I agree that much of this we already knew, and that the statistics are probably being used more to create pageviews (or get grant money in the future) than to save lives. For an EP cardiologist, the fact that macrolides can make people die from heart problems is not news at all.
Dr. Wes,
Since you disagree with the methods, it’s incumbent on you to explain how they would compromise the conclusions. What problems do you think the selected sample will cause? What other variables would you have included in the match, and how would you expect them to affect the results? Exactly how will misrecorded fields in the respective databases affect the results, aside from attenuating the estimates towards zero (the classical measurement error result)?
Or are you arguing that all retrospective analysis is suspect, and only RCTs are appropriate (realizing, of course, that rare effects are unlikely to be found in a RCT of realistic size)?
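To put the measurement-error point in concrete terms, here’s a toy simulation (mine, purely illustrative) of the classical result: random noise in a recorded variable pulls the estimated association toward zero rather than inflating it:

```python
# Toy demo: classical measurement error attenuates estimates toward zero.
import numpy as np

rng = np.random.default_rng(42)
n = 100_000
x = rng.normal(size=n)               # true exposure
y = 2.0 * x + rng.normal(size=n)     # true slope is 2.0
x_noisy = x + rng.normal(size=n)     # exposure as (mis)recorded in a database

print(f"slope with clean x: {np.polyfit(x, y, 1)[0]:.2f}")        # ~2.0
print(f"slope with noisy x: {np.polyfit(x_noisy, y, 1)[0]:.2f}")  # ~1.0, attenuated
```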
Excellent use of “scare quotes,” which make any “finding” or “decision” seem deeply concerning. I’m not sure why your condescending, dismissive tone, lacking in actual rigor, should make us ignore this study.
This is not an RCT. For an observational study, however, what you’re describing sounds pretty strong. The Medicaid population is non-ideal and non-representative, but I don’t see how that constitutes selection bias. (Selection bias would mean the azithromycin group was more likely to die, because of how it was selected, than both the patients on other antibiotics and those on no antibiotics – how did that happen here?) The use of positive (three antibiotic groups) and negative (no antibiotic) controls is a clever method seen often in the lab sciences – I’m unsure why it makes this study worse.
Similarly, 153 confounders isn’t every possible confounder, but it is a lot. Are propensity scores so unreliable? I’ve never seen that to be true, and the work of Donald Rubin makes clear they’re conceptually a major advance for removing measured confounding, especially in large datasets. (Did they use them in a particularly problematic way? How?) Are observational studies never useful? I didn’t know that, and if we believe it, we have no reason to think cigarettes are unhealthy. (There’s never been an RCT of smoking causing cancer, and the lab data is poor.) Is any study without a mechanism unreliable? Well, RCTs never demonstrate a mechanism, so we can’t trust any of those! (Of course, you hinted at a mechanism in your first paragraph – prolonged QT.)
I’m sure you’re driving at an interesting point. In all honesty, beneath the dismissive attitude I’m unsure what it is.