Artificial Patients need Artificial Intelligence; The Sick and Worried Amongst Us Deserve Better

Every conversation with a patient is an exercise in the analysis of “big data.” The patient’s appearance, changes in mood and expression, and eye contact are data points. The illness narrative is rich in semiotics: pacing, timing, nuances of speech, dialect are influenced by context, background, and insight which in turn reflect religion, education, literacy, numeracy, life experiences and peer input. All this is tempered by personal philosophy and personality traits such as recalcitrance, resilience, and tolerance. Taking a history, by itself, generates a wealth of data but that’s just the start.

Add into the mix physical findings of variable reliability, laboratory markers of variable specificity, and imaging bits and bytes, and you have “big data.” Then you mine this data for the probabilistic variance of the potential causes of a complaint, based on which you begin to consider values for numerous options for care. So armed, the physician next needs to factor in the benefits and harms of multiple treatments, derived from populations that never perfectly reflect the situation of the individual in the chair next to us, our patient. This is the information necessary to empower our patient to make rational choices from the menu of options. That is clinical medicine. That is what we do many times a day to the best of our ability and to the limits of our stamina.

Take that, Watson. You need a lot more than 90 servers and megawatts of electricity to manage our bedside rounds. You need to contend with the gloriously complicated and idiosyncratic fabric of human existence. Poets might be a match, but Watson is not.

Watson is doomed not just by its limited technical sufficiency compared to our cognitive birthright. Even if Watson could grow its server brain to match ours, it won’t be able to find measurable quantities for the independent variables captured during a patient encounter, or for the role of personal values that temper that patient’s choice. Life does not have independent and dependent variables; the things that matter to us are on both sides of a regression model. Watson needs rules to violate this statistic and there are none that generalize. Somehow, our brains have a measuring instrument that no data query can find or measure and that we innately understand but can’t fully communicate. Also, our brains seem to intuitively understand statistics; our brains know that the variations around the regression lines (residuals) mean more to us than the models themselves. Sure, if there is something discrete to know, a simple, measurable deterministic item, or an answer to a game show question, Watson will kick most, and maybe all, of our butts. But what if what is important to us is neither deterministic nor discrete? What if life is more importantly measured in “when” than “if”? And what if the “when, and how we feel about the when” are intertwined? What if medical life is not even measured in outcomes, but, instead, in relationships that foster peaceful moments? In this reality, Watson will be lost.

Watson is doomed on yet another level beyond a dearth of “code friendly” meaningful measures of humanity. It is doomed in that it is capable of reading the “World’s Literature”. Our desires and motives to improve the care of individuals are being buried in reams of codependent, biased, unrestricted, marketed, false positive or false negative associated, and poorly studied information that sees the light of Watson’s day because it can read every report published in nearly 20,000 biomedical journals. A “60 Minutes” report on AI reveled in Watson’s prowess at searching the literature. We can’t substantiate one particular quote in the report, and bet the person quoted can’t either: that there are 8,000 research reports published daily. But that is Watson’s problem. Watson fails to recognize that it is more important to know what we should not read than to be able to read it all. There is just too much precarious information being perpetrated on unsuspecting readers, whether the readers have eyes or algorithms.

Science is the glue that holds medical care together, but it is far from a perfect adhesive. We have both served long tenures on the editorial boards of leading general and specialty clinical journals. We have many an anecdote about the rocky relationship between medical care and the science that informs it. An anecdote from Dr. McNutt serves as a particularly disconcerting object lesson. He commented on a paper being brought for publication, a paper that he argued should be rejected because it was a Phase 2 study. The study was not fatally flawed by design, just premature, as many Phase 2 studies fail to be replicated after better-designed Phase 3 studies are performed. Science is about accuracy and redundancy and timelessness and process, not expediency. Despite his arguments, the paper was published and became highly cited. Sure enough, a better-designed Phase 3 study rejected the hypothesis supported by the Phase 2 study, vindicating Dr. McNutt on this occasion. But that is not the point. The point is that Watson knows of both studies. You only need to know one of them. How did Watson handle the irreproducible nature of the studies and their contrary insights? One might wonder if the negative study was cited as often as the positive, premature study. Watson would know.

Are we being too tough on AI? We are not writing about Watson’s specific program but, instead, using it as a metaphor for big data analytics and messy regression models. It is not clear if Watson has been tested in a range of clinical situations where inherent uncertainty prevails. No pertinent randomized trials are cited when “Watson artificial intelligence” is entered into “PubMed”. There are attempts to match patients to clinical studies, but no outcome studies. This is important since that “60 Minutes” episode told of a patient who was treated after a “recommendation” from Watson. We assume that the treatment met ethical standards for a Phase 1 study and that the patient was fully informed. We are left to assume, also, that the information found by AI was reliable and adequately tested. After all, this compliant-with-Watson yet unfortunate patient succumbed to an “infection” several months after receiving the treatment. We worry about the validity of the information spewed by the algorithm and how on earth the researchers planned to learn anything about the efficacy of the proposed intervention from treating their patient. Science requires universal aims and adequate comparisons. In our view, any AI solution for any patient should be subjected to stringent, publicly available scientific testing. AI, to us, is in dire need of Phase 1 testing.

Science can be better. Watson will not advance science; scientific inquiry will. Better designs for clinical care and insights from scientific data need to be developed and implemented. We do not need massive amounts of data, just small amounts gathered in thoughtfully planned studies. And with better science, we will not need AI. Instead of banking, or breaking the bank, on AI, we should use our remarkable brains to learn by rigorous scientific inquiry and introduce valid scientific insights into the “big data” dialogue we call the patient’s “history” and do so in the service of what we call “patient care.” Watson and other systems may be able to do a wonderful job determining what books we buy, and, from a medical perspective, they might be able to pick a particular antibiotic given a known infection due to the deterministic nature of that task. But treating infection, as an example, is a small data part of what we do; we help sick people, and for that big data task, Watson will, in our view, not be sufficiently insightful.


13 replies »

  1. Jonathan, I understand your point. It is IBM’s money going into Watson, but someday it may be mine, indirectly. However, one idea I tried to leave was that we seem so interested in some issues, but not others. Genomics, AI, and technology ideas seem readily reported and funded, but I don’t see as much interest in, or as many reports on, problems with data and science. We have disparate data sources, inabilities to confirm studies in timely ways, poor study designs, incomplete populations of patients being studied, and other issues. Can we put our heads together with equal vigor to develop better bedside science? I think so. My concern for “produced” data from a program like an AI model is that the model may be biased by those doing the study, and there is no uniform way to think about prediction, especially if the data is bad. I saw a neat report of the same data set given to 29 research teams asking the same question and there were 29 different answers; some yes, some no, some no difference. Neutral, cynical, recalcitrant, “prove it to me” types of judgments about the stringency of science seem lacking, in my view, and that worries me. I realize what models can do; I have built them. But better studies always outperform my models.

  2. Sir William Bragg, many years ago, said: “The most important thing in science is not so much to obtain new facts as to discover new ways of thinking about them.” What is it, then, that empowers a physician to do this every day with each “encounter?” It is clear to me that the diversity involving the dialectic between the realms of knowledge for the humanities and the sciences involving each person’s HEALTH is profound. I suspect that creativity, humor and respect govern at all levels, for both the physician and the person.

  3. This is an interesting article, and much closer to things I remember reading from my days in academic philosophy than the usual fare on THCB. The piece makes important points about how very different forms of evidence need to be used in the exam room and the different forms of communication that need to be used with the patient. It is extremely complex in ways we can’t fully articulate, let alone program today. But I can’t agree with the conclusion.

    It is clear that AI is nowhere near where it would need to be to replace humans today. At the same time, AI is in its infancy. At the current rate of change in learning algorithms, visual recognition software, etc., I think the AI machines 100 years from now will look as different from today’s AI as a passenger jet looks to a WWI biplane. I just don’t see any empirical reason at all to posit a limit on machine intelligence at this point, and one isn’t provided in this article.

    The radical departure for machine intelligence is the creation of self-directing machines that start with rules and inferential mechanisms built by humans, but that have the ability to detect patterns and create new connections and associations at a rate far faster than humans could. They rewire themselves. We don’t have to assume this goes all the way to sentience for it to be more accurate and powerful in medicine than human abilities allow. In some sense we don’t “need” those advances, but that’s the same sense in which we do not “need” advances in medicine in general. People can always stop living longer.

    I also don’t see a reason to reduce investment in AI. It doesn’t strike me as being that large at the moment, given the size of the economy. And given the potential payoff, it seems like exactly the sort of thing a technology company would want to invest in. I do not, by the way, say that as a sanguine fanboy of AI. The prospect of a superintelligent being that far outclasses human abilities fills me with a feeling of dread about an AI dominated future.

  4. I wish I knew everyone’s background (it seems most are MDs), but this is an interesting exchange. I’m not a doc, so with that said, I guess I didn’t appreciate the emotions that this issue raises amongst MDs. I kinda shrugged and said, of course AI cannot replace a well trained perceptive MD in a human face-to-face evaluation. It hasn’t occurred to me that anyone is suggesting that, and if they are, well… phooey.

    I think Kip makes the point: that Watson et al. can augment evaluative activity in all sorts of ways, so long as it doesn’t bog the MD down too much. That’s the key. How do you access what AI and Watson have without getting mired in data overflow?

  5. I like their post and their point (no surprise). They are standing next to me on the soapbox sharing my distaste for technology being in the room with me and my patient.

    Of course tech people are going to dismiss these men as malcontents; techies truly believe what we do in an exam room can be replicated by Watson. It does make sense to push for “evidence” that Watson can or cannot do what we do.

    However, the trials absolutely should not be evaluating Watson using simple cases, like streptococcal throat infections, hypertension, or conditions for which protocols already exist. Where Watson should be tested is in conversion disorder or other situations where we are pushed to our limit to solve the mystery.

    I would like to see how Watson does up against a 16-year-old girl with new-onset stuttering and memory loss after mild head injury following syncope. She receives an expensive workup by neurology, including imaging and EEG (all normal), and referrals to speech and other specialists, yet in reality, all becomes clear once it becomes apparent she is afraid to tell her strict Mormon parents she is illegitimately pregnant and does not know who the father is. This situation is a work in progress as she eliminates one male at a time through paternity testing.

    It might make sense to do more with the fact the physician-patient relationship is therapeutic in its own right; we should not underestimate the value of one person comforting another and that alone being healing.

    • “all becomes clear once it becomes apparent she is afraid to tell her strict Mormon parents she is illegitimately pregnant and does not know who the father is.”

      What, you mean Watson wouldn’t have picked that up? What you said is absolutely correct: patients need physicians who can listen to them and who care.

  6. Via e-mail

    Like Niran, Ross and the authors, I treat the exaggerated claims made for AI and “big data” with great skepticism, not just in medicine but in all other scientific fields. Here’s a blog comment I wrote a year ago on this issue http://pnhp.org/blog/2014/09/03/big-data-the-latest-fad-in-health-policy/

    At the end Robert and Nortin write: “Science can be better. Watson will not advance science, scientific inquiry will…. We do not need massive amounts of data, just small amounts gathered in thoughtfully planned studies. And with better science, we will not need AI.”

    I need more time than I have right now to think about the implications of that statement, but I’ll say this for it: I love it. It’s provocative, and probably true. The one caveat I would suggest is that Watson, and all databases, can advance science in the sense that they can help scientists develop hypotheses worth pursuing. But big data and AI are no substitute for well designed experiments.

  7. Oh, and by the way; AI will not be a star, it is a tool. Tools can’t be stars. People are stars.

  8. Good point, BobbyGvegas. Can’t measure something, can’t study it. Dr Palmer, I hope to meet you someday and have a talk. You are right, we may be wrong. In fact, everything we think we know may be wrong. That is why we have science. One point I wanted to make is that we are so enamored with “things” that we fail to adequately consider how to measure those things and then study them. If we can spend billions to wipe out cancer (nice thought, but naive, and what on earth do you think we have been doing), or spend a billion or so to gather a convenience sample of people, take their genomes, and then follow them past when most of us will be dead, we can also find ways to make science THE practice of medicine. If AI is picking treatments, it needs a study (and malpractice insurance). Paul, right on. So many good ideas that are incorrect become standards, leading us to institutionalized insanity.

  9. Some years ago I was involved in a startup company that data-mined insurance claims and had docs on staff to try to do case management for patients whose claim history suggested fragmented care or other problems. In-house, they fretted that they were having trouble proving that the model worked to either improve care or save money. Nevertheless, they were acquired for a huge sum... and the service continues within a health services division of a big company. But I suspect the proof of the model still eludes them (or the handful of companies that provide the service).

    Not exactly AI, but similar... sounds great and promising, but there are serious questions about whether it will add value or perhaps trigger negative unintended consequences.
