Many allege that the FIRST trial, which randomized surgical residencies to strict versus flexible adherence to duty hour restrictions, was unethical because patients weren't consented for the trial and, as this was an experiment in the true sense of the word, consent was mandatory. The objection is best summarized by an epizeuxis in a Tweet from Alice Dreger, a writer, medical historian, and a courageous and tireless defender of intellectual freedom.
— Alice Dreger (@AliceDreger) November 20, 2015
It's important to understand what the FIRST (Flexibility In duty hour Requirements for Surgical Trainees) trial did and didn't show. It showed neither that working 120 hours a week has better outcomes than working 80 hours a week, nor the opposite. Nor did the trial, despite being a non-inferiority trial, show that working 100 hours a week was as safe as working 60 hours a week. The trial showed that violating duty hour restrictions didn't worsen outcomes. The trial was neither designed nor powered to specify the degree to which the violation of duty hours was safe. This key point can be missed. To be fair, neither the trialists nor the editorials about the trial claimed as much.
The trial was a triumph of flexibility of hours, not of long hours, a triumph for decentralization over rigid adherence to centrally-determined rules. The trial confirmed the intuitions of the Nobel prize-winning economist, Friedrich Hayek, that local decisions trump central decisions – which he articulated in his landmark essay, The Use of Knowledge in Society.
The trial affirmed Herbert Simon's "bounded rationality," which holds that what defines reasonable isn't a point estimate but a range, a bandwidth. Bounded rationality replaces the optimal (a point estimate) with the satisficing – good enough (a range). What's good enough for resident hours is best determined locally, not centrally. Central authorities are good at defining point estimates, not so good at establishing the range of permissibility.
But is Dreger correct that patients should have been consented for the trial? Dreger draws our attention to the Helsinki Declaration. Briefly, if one studies an intervention and collects data to see if the intervention works, it’s an experiment, and patients must be consented for an experiment. By this definition, the FIRST trial, a randomized controlled trial (RCT), is an experiment.
One objection to the claim that patients needed consent is that there was equipoise, meaning it was genuinely unknown whether strict adherence to duty hours has better outcomes than flexible working hours. The reason for the equipoise is instructive. It was never demonstrated, by a rigorous experiment, that the degree to which duty hours were restricted improved outcomes. Instead, the 80-hour work week was a reflexive policy based on real anecdotes and what sounded plausible – residents get more tired by working longer hours, which impairs their performance, harming their patients.
There are few better reminders that healthcare is a complex system than resident duty hours. In complex systems, reductionism and simple solutions are enticing, but wrong. My friends in surgery in Britain, who experienced both the brutal 48-hour call and a week of night call when hours were restricted but spread out, preferred the former because they could “get the pain over and done with.”
From my own experience in surgery, I looked forward to the 24-hour call because I left the hospital the next day at noon and, invariably, ended up in the local pub by 2 pm. The more brutal the call, the less tired I felt, because the less time I had to feel tired.
Curtailment of duty hours is by no means an unalloyed good, as there's a price for rest – if, indeed, residents feel more rested in a shift system which, arguably, interferes with circadian rhythms just as much. Not to mention that the shift system disrupts continuity of patient care, as any radiologist can attest who, when speaking with housestaff about the appropriateness or results of an imaging study, has been told "Mr. Patel is not my patient – I'm just covering."
The question arises: why should work be restricted to 80 hours, not 70 or 92? To that, I can do no better than to quote Cyril Radcliffe, who drew the border between India and Pakistan. According to urban myth, when Radcliffe was asked why he drew the border where he did, he replied, "I had to draw the line somewhere." All lines are arbitrary, and arbitrariness isn't an argument against drawing a line, merely a confirmation that a line has been drawn.
Equipoise justifies an RCT, but equipoise doesn't necessarily justify the waiving of consent – i.e. equipoise tells us that it's ok to experiment, not that it's ok to skip getting permission to experiment. When there's no equipoise, such as reducing a fracture-dislocation of an ankle versus leaving it alone, experimentation is unethical, even with consent. Extending this logic, if an intervention, such as a quality initiative (QI), is widely implemented without consent, as it often is, it implies there's no uncertainty about its efficacy – that its efficacy is self-evident, like a parachute's.
On a technicality, Dreger may be correct that, by the Helsinki Declaration, the FIRST trial was an experiment which needed consent. However, a technicality is no good if it survives for its own sake; it must serve a bigger goal. This technicality exposes a bigger problem than it solves. Untested ideas such as Meaningful Use (MU) for Electronic Health Records (EHRs), checklists, paying physicians for performance (P4P), and compliance with quality metrics, such as under MACRA – ideas which have gains and losses for patients – can be implemented without patient consent because they seem like a good idea at the time. But when it comes to testing them, to seeing if they really work – if, for example, randomizing healthcare systems to P4P versus no P4P needs patient consent – one may never be able to study the true risks and benefits of these great policies.
In other words, it seems ok to implement an untested quality initiative on every patient without their consent, but when testing that untested quality initiative, it's unethical to implement it on only half the patients without their consent. I shan't labor over the parody this numerical discrepancy inspires, except that I can't help imagining telling Genghis Khan that it's less unethical to slaughter an entire village than to randomly spare half the villagers.
I'm a fan of ethics and medical ethicists, largely because I'm wary of the Benthamite "greatest good for the greatest number" trap. Ethics saves the individual from being sacrificed at the altar of the greater good, for the sake of the population. Ethics seems to do less well protecting the population (lots of individuals, if I may remind you) from being sacrificed at the altar of the anecdote, or some elite's bright idea.
A distinction has been made between an experiment, in which an intervention is implemented with the purpose of collecting outcome data, and health policy, which has no ambition of explicitly measuring outcomes. Therefore, the logic follows, we need consent for experimentation and not for policy because, as the latter is not implemented with a view to measuring its efficacy, it's not an experiment. Admittedly, this is clever logic, but it makes me despair even more. Drugs, QIs, and policy have the same intent – improving outcomes. Any distinction between implementing policy and experimentation is a false one. That implementers of policy and quality initiatives are neither required to, nor seek to, deliberately confirm that they work – that is, measure outcomes during their implementation with a view to rescinding them if they don't work – makes our failure to challenge their implementation even odder.
All policy and quality initiatives are experiments, provisional assumptions, unless we believe that scholars in this realm have such unusual access to wisdom that it renders scientific curiosity redundant. It's an epistemological irony that we submit statins to greater scientific rigor than, arguably, the more consequential MU for EHRs.
Carl Sagan said, "Extraordinary claims require extraordinary evidence." Christopher Hitchens went further: "What can be asserted without evidence can be dismissed without evidence." By Hitchens' reasoning, policy or QIs implemented without evidence can be dismissed without evidence. The trialists in the FIRST trial compromised on Hitchens' strict dictum. They cluster randomized. If the original policy waived consent in all patients, the FIRST trial waived consent in half the patients. I'm aware of the logical fallacy – two wrongs don't make a right. But surely a wrong and half a wrong, 1.5 wrongs, are less wrong than two wrongs.
Quality initiatives get two easy passes. They're implemented on plausibility and good intentions alone. And, when it comes to studying their efficacy, the burden of proof inverts – that is, the burden is to show that they don't work, rather than that they work. The non-inferiority design of the FIRST trial implicitly recognized the inverted burden of proof – the burden fell on flexible hours to prove that they were safe enough, rather than on restricted hours to prove that they were safer. Some might argue that the burden of proof should have been even more stringent – that is, flexible hours should have shown they're safer than restricted hours. Regardless, my point that the burden of proof inverts with quality initiatives is supported by the design of the FIRST trial.
The first error with ethics lies in formalizing ethics, like regulations, which induces people to fret over the letter, rather than the spirit, of the law. The danger of such fretting was best exposed by Portia in The Merchant of Venice when she held Shylock, who insisted on Antonio's pound of flesh, to the letter of the law, saying he could take no more, but also no less, than a pound of Antonio's flesh.
The bigger error with ethics is in failing to recognize that ethics is constrained by practicality – there's a cost to obtaining consent, and sometimes this cost is prohibitive. As much as I despise MU and MACRA, their implementation couldn't have been conditional on consenting patients. Aside from the difficulty of consenting patients about unknown risks (who would have thought EHRs would sully the doctor-patient encounter?), you can't divide public goods between those who want them and those who don't. If public policy could be implemented only when voted for by a majority of the proletariat, all policy decisions would need elections – elections less attended than mid-terms, elections every hour. It was inevitable that EHRs were rolled out without patient consent. The proof of the health policy pudding is in eating it – there's no way round this.
The FIRST trial would have collapsed had consent been required, because of the impracticality of excluding sick surgical patients – once deemed to require a laparotomy, for instance – from hospitals with either flexible or restricted work hours. There's no reason to assume well-informed patients would consistently have chosen one over the other. Unless we grant the same slack to investigators who study health policy and QIs systematically as we grant the implementers of policy and QIs, bad ideas will accumulate faster than they are expelled and, eventually, as in Gresham's Law, bad health policies will drive out good health policies.
The biggest error with ethics is in failing to acknowledge that ethics is contextual. During the Ebola epidemic, some said that testing an unproven Ebola vaccine on Africans was "unethical." If we can't distinguish morally between studying the natural history of syphilis by exploiting unsuspecting and vulnerable African Americans and testing a vaccine on at-risk Africans during an epidemic – a vaccine which could save many other at-risk Africans from Ebola – we're in big ethical trouble.
“Unethical” is now thrown around so frivolously that the word has lost all meaning just as “fascism,” “stat” and, in my case, “uncaring,” have lost all meaning. My wife used to call me “uncaring” once a month – now she says it every day.
Recently, I heard two doctors argue. One said that giving cancer patients false hope is unethical (hope, derived from aggregates, is, by definition, false). The other said that denying cancer patients hope is unethical. It reminded me of prematurely closing RCTs, like the National Lung Screening Trial, on ethical grounds. When a treatment effect becomes apparent sooner than anticipated in a trial, it can be argued both that it's ethical and that it's unethical to continue the trial. It's unethical to keep randomizing patients when there's evidence of a treatment effect. But is it not also unethical to close a trial prematurely when we don't know the true treatment effect – which can better inform the risk-benefit ratio, and which can take longer to reveal itself? You could make a cogent argument for both, because ethics, like Schrödinger's cat, is suspended in a dual state of being and not being.
Rather, what's ethical or unethical depends more on an interlocutor's will and skill in making an argument than on our moral intuitions. This is both distressing and reassuring, because medical ethics is more important than ever. In the era of gene editing, what'll save mankind from bokanovskification is an articulate ethicist's rapier logic and persuasive prose. I desperately wish to take ethicists more seriously, but I'm afraid of being tuned out by the boy who cried "unethical."
(About the author: Saurabh Jha is a radiologist and contributing editor of THCB. He can be reached on Twitter @RogueRad. He’s ethical but his Tweets are unethical)