I’ve been talking in recent posts about how our typical methods of testing AI systems are inadequate and potentially unsafe. In particular, I’ve complained that all of the headline-grabbing papers so far only do controlled experiments, so we don’t know how the AI systems will perform on real patients.
Today I am going to highlight a piece of work that has not received much attention, but actually went “all the way” and tested an AI system in clinical practice, assessing clinical outcomes. They did an actual clinical trial!
Big news … so why haven’t you heard about it?
The Great Wall of the West
Tragically, this paper has been mostly ignored. 89 tweets*, which when you compare it to many other papers with hundreds or thousands of tweets and news articles is pretty sad. There is an obvious reason why though; the article I will be talking about today comes from China (there are a few US co-authors too, not sure what the relative contributions were, but the study was performed in China).
China is interesting. They appear to be rapidly becoming the world leader in applied AI, including in medicine, but we rarely hear anything in the media about what is happening there. When I go to conferences and talk to people working in China, they always tell me about numerous companies applying mature AI products to patients, but in the media we mostly see headline-grabbing news stories about Western research projects that are still years away from clinical practice.
This shouldn’t be unexpected. Western journalists have very little access to China**, and Chinese medical AI companies have no need to solicit Western media coverage. They already have access to a large market, expertise, data, funding, and strong support both from medical governance and from the government more broadly. They don’t need us. But for us in the West, this means that our view of medical AI is narrow, like a frog looking at the sky from the bottom of a well^.
With the application deadline for Bayer’s G4A Partnerships program coming up on Friday, I thought I’d throw out a little inspiration to would-be applicants by featuring an interview I did with one of last year’s program participants at the grand-finale Launch Event.
Not only was this a great party, it was also a microcosm of the G4A program experience itself: a way to meet Bayer execs en masse, an opportunity to sell directly to key decision-makers across Bayer’s various global business units, and a chance to feed off the energy of like-minded innovators eager to see ‘big health care’ change for the better.
While the G4A program itself has changed a bit this year to be more streamlined and to allow for bespoke deal-making that may or may not involve giving up equity (my favorite new feature), startups questioning whether or not they have what it takes should take a look at some alums.
There’s a playlist with nearly two dozen interviews waiting for you here if you’re REALLY up for some procrastinating, or you can click through and just check out my chat with Joe Curcio, CEO of KinAptic. A healthtech startup taking wearables to the bleeding edge, Joe shows us a mock-up of the KinAptic ‘smart shirt’ which features their real innovation: printed ink electronics that look and feel like screenprinting ink, but work bi-directionally to both collect data from the body AND apply signals back to it. Is it AI-enabled? Did you have to ask? Listen in for a mind-blowing chat about how this tech can change diagnostic analysis and treatment and completely redefine our current limitations when it comes to healthcare wearables. Once you’re inspired, don’t forget to head over to www.g4a.health and fill out your own application for this year’s partnership program.
Jessica DaMassa is the host of the WTF Health show & stars in Health in 2 Point 00 with Matthew Holt
Two years ago we wouldn’t have believed it — the U.S. Congress is considering broad privacy and data protection legislation in 2019. There is some bipartisan support and a strong possibility that legislation will be passed. Two recent articles in The Washington Post and AP News will help you get up to speed.
Federal privacy legislation would have a huge impact on all healthcare stakeholders, including patients. Here’s an overview of the ground we’ll cover in this post:
Six Key Issues for Healthcare
We are aware of at least 5 proposed Congressional bills and 16 Privacy Frameworks/Principles. These are listed in the Appendix below; please feel free to update these lists in your comments. In this post we’ll focus on providing background and describing issues. In a future post we will compare and contrast specific legislative proposals.
Today, we are featuring Dr. Jesse Ehrenfeld from the American Medical Association (AMA) on THCB Spotlight. Matthew Holt interviews Dr. Ehrenfeld, Chair-elect of the AMA Board of Trustees and an anesthesiologist with the Vanderbilt University School of Medicine. The AMA has recently released their Digital Health Implementation Playbook, which is a guide to adopting digital health solutions. They also launched a new online platform called the Physician Innovation Network to help connect physicians with entrepreneurs and developers. Watch the interview to find out more about how the AMA is supporting health innovation, as well as why the AMA thinks the CVS-Aetna merger is not a good idea and how the AMA views the role of AI in the future of health care.
Zoya Khan is the Editor-in-Chief of THCB as well as an Associate at SMACK.health, a health-tech advisory service for early-stage startups.
I have seen the light. I now, finally, see a clear role for artificial intelligence in health care. And, no, I don’t want it to replace me. I want it to complement me.
I want AI to take over the mandated, mundane tasks of what I call Metamedicine, so I can concentrate on the healing.
In primary care visits in the U.S., doctors and clinics are buried in government mandates. We have to screen for depression and alcohol use, document weight counseling for every overweight patient (the vast majority of Americans), make sure we probe about gender at birth and current gender identification, offer screening and/or immunizations for a host of diseases, and on and on and on. All this in 15 minutes most of the time.
Never mind reconciling medications (or at least double checking the work of medical assistants without pharmacology training), connecting with the patient, taking a history, doing an examination, arriving at a diagnosis, and formulating and explaining a patient-focused treatment plan.
At long last, we seem to be on the threshold of departing the earliest phases of AI, defined by the always tedious “will AI replace doctors/drug developers/occupation X?” discussion, and are poised to enter the more considered conversation of “Where will AI be useful?” and “What are the key barriers to implementation?”
As I’ve watched this evolution in both drug discovery and medicine, I’ve come to appreciate that in addition to the many technical barriers often considered, there’s a critical conceptual barrier as well – the threat some AI-based approaches can pose to our “explanatory models” (a construct developed by physician-anthropologist Arthur Kleinman, and nicely explained by Dr. Namratha Kandula here): our need to ground so much of our thinking in models that mechanistically connect tangible observation and outcome. In contrast, AI relates often imperceptible observations to outcome in a fashion that’s unapologetically oblivious to mechanism, which challenges physicians and drug developers by explicitly severing utility from foundational scientific understanding.
Today we are featuring another #TechCrunchDisrupt2018 THCB Spotlight. Matthew Holt interviews LivioAI, which is an AI hearing aid created by Starkey Technologies. Worldwide, there are 700 million people with hearing loss but only 10% wear a device to help them. That number is appalling, especially because there are a number of co-morbid illnesses linked with hearing loss, like cognitive and physical decline! That is where LivioAI comes into play. LivioAI is completely controlled by your iPhone, tracks all types of movements (it is always counting your steps, so the steps you miss when you put your phone down are also accounted for), classifies acoustic environments to measure your social engagement (it can register the difference between a noisy restaurant and a library to figure out how much you are participating in a situation), and even translates foreign languages directly into your ear with its voice-activated platform. It is connected with Apple Health and Google Fit and can measure data to observe patterns of co-morbid illnesses. It is the new Fitbit, but for the ear! As LivioAI’s motto goes, “Hear better, Live better.”
Zoya Khan is the Editor-in-Chief of THCB as well as an Associate at SMACK.health, a health-tech advisory service for early-stage startups.
Several email lists I am on were abuzz last week about the publication of a paper that was described in a press release from Indiana University as demonstrating that “machine learning — the same computer science discipline that helped create voice recognition systems, self-driving cars and credit card fraud detection systems — can drastically improve both the cost and quality of health care in the United States.” The press release referred to a study published by an Indiana faculty member in the journal Artificial Intelligence in Medicine.
While I am a proponent of computer applications that aim to improve the quality and cost of healthcare, I also believe we must be careful about the claims being made for them, especially those derived from results from scientific research.
After reading and analyzing the paper, I am skeptical of the claims made not only by the press release but also by the authors themselves. My concern is less with their research methods, although I have some serious qualms about them that I will describe below, and more with the press release that was issued by their university public relations office. Furthermore, as always seems to happen when technology is hyped, the press release was picked up and echoed across the Internet, followed by the inevitable conflation of its findings. Sure enough, one high-profile blogger wrote, “physicians who used an AI framework to make patient care decisions had patient outcomes that were 50 percent better than physicians who did not use AI.” It is clear from the paper that physicians did not actually use such a framework, which was only applied retrospectively to clinical data.
What exactly did the study show? Basically, the researchers obtained a small data set for one clinical condition in one institution’s electronic health record and applied some complex data mining techniques to show that lower cost and better outcomes could be achieved by following the options suggested by the machine learning algorithm instead of what the clinicians actually did. The claim, therefore, is that if the data mining were followed by the clinicians instead of their own decision-making, then better and cheaper care would ensue.
How many nurses does it take to care for a hospitalized patient? No, that’s not a bad version of a light bulb joke; it’s a serious question, with thousands of lives and billions of dollars resting on the answer. Several studies (such as here and here) published over the last decade have shown that having more nurses per patient is associated with fewer complications and lower mortality. It makes sense.
Yet these studies have been criticized on several grounds. First, they examined staffing levels for hospitals as a whole, not at the level of individual units. Secondly, they compared well-staffed hospitals against poorly staffed ones, raising the possibility that staffing levels were a mere marker for other aspects of quality such as leadership commitment or funding. Finally, they based their findings on average patient load, failing to take into account patient turnover.
Last week’s NEJM contains the best study to date on this crucial issue. It examined nearly 200,000 admissions to 43 units in a “high quality hospital.” While the authors don’t name the hospital, they do tell us that the institution is a US News top rated medical center, has achieved nursing “Magnet” status, and, during the study period, had a mortality rate nearly 40 percent below that predicted for its case-mix. In other words, it was no laggard.
As one could guess from its pedigree and outcomes, the hospital’s approach to nurse staffing was not stingy. Of 176,000 nursing shifts during the study period, only 16 percent were significantly below the established target (the targets are presumably based on patient volume and acuity, but are not well described in the paper). The authors found that patients who experienced a single understaffed shift had a 2 percent higher mortality rate than ones who didn’t. Each additional understaffed shift carried a similar, and additive, risk. This means that the one-in-three patients who experienced three such shifts during their hospital stay had a 6 percent higher mortality than the few patients who didn’t experience any. If the FDA discovered that a new medication was associated with a 2 percent excess mortality rate, you can bet that the agency would withdraw it from the market faster than you could say “Sidney Wolfe.”
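The additive arithmetic in that paragraph can be spelled out in a few lines. This is only an illustrative sketch: it assumes, as the paragraph reports, that each understaffed shift adds a flat 2 percent to relative mortality, and the function name and constant are made up for this example rather than taken from the study.

```python
# Illustrative only: assumes each understaffed shift adds a flat
# 2 percent relative excess mortality, as reported in the text above.
# The names below are hypothetical and do not come from the study.
PER_SHIFT_EXCESS_PCT = 2


def excess_mortality_pct(understaffed_shifts, per_shift_pct=PER_SHIFT_EXCESS_PCT):
    """Relative excess mortality (in percent) under a purely additive model."""
    if understaffed_shifts < 0:
        raise ValueError("number of shifts cannot be negative")
    return understaffed_shifts * per_shift_pct


if __name__ == "__main__":
    for shifts in range(4):
        print(f"{shifts} understaffed shift(s): +{excess_mortality_pct(shifts)}% mortality")
```

Under this simple model, three understaffed shifts give the 6 percent figure quoted for the one-in-three patients who experienced them.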
The effects of high patient turnover were even more striking. Exposure to a shift with unusually high turnover (7 percent of all shifts met this definition) was associated with a 4 percent increased odds of death. Apparently, patient turnover – admissions, discharges, and transfers – is to hospital units and nurses as takeoffs and landings are to airplanes and flight crews: a single 5-hour flight (one takeoff/landing) is far less stressful, and much safer, than five hour-long flights (5 takeoffs/landings).
A terrific article in The New York Times Magazine this summer described the decade-long effort on the part of IBM artificial intelligence researchers to build a computer that can beat humans in the game of “Jeopardy!” Since I’m not a computer scientist, their pursuit struck me at first as, well, trivial. But as I read the story, I came to understand that the advance may herald the birth of truly usable artificial intelligence for clinical decision-making.
And that is a big deal.
I’ve lamented, including in an article in this month’s Health Affairs, the curious omission of diagnostic errors from the patient safety radar screen. Part of the problem is that diagnostic errors are awfully hard to fix. The best we’ve been able to do is improve information flow to try to prevent handoff errors, and teach ourselves to perform meta-cognition: that is, we can think about our own thinking, so that we are aware of common pitfalls and catch them before we pull our diagnostic trigger.
These solutions are fine, but they go only so far. In the age of Google, you’d think we’d be on the cusp of developing a computer that is a better diagnostician than the average doctor. Unfortunately, computer scientists have thought we were close to this same breakthrough for the past 40 years, and both they and practicing clinicians have always come away disappointed. Before getting to the Jeopardy-playing computer, I’ll start by recounting the generally sad history of artificial intelligence (AI) in medicine, some of it drawn from our chapter on diagnostic errors in Internal Bleeding: