New Zealand faces a critical shortage of healthcare workers. When humans are unavailable, an obvious thought is automation. If there aren’t enough human doctors, might Robo Docs fill the gaps?

This piece speculates about how AI could change medicine. Robo Docs are already helping out in Aotearoa. They are weighing us, assessing our blood pressure and pulses. If AIs follow their current pace of improvement, what might medicine look like in 2040?

I don’t pretend to have a crystal ball, so there are no forecasts here. One thing we have seen is that once digital technologies become tolerably good at a human activity, capacities far beyond any human can arrive quite abruptly. We saw this in chess. You may be justifiably reluctant to entrust a Robo Doc with sole charge of your care now, just as you shouldn’t trust a driverless car to safely take you over the Remutakas today. It’s nevertheless important to contemplate a future in which machines significantly outperform humans in both these domains.

The very idea of a Robo Doc that isn’t just a deferential assistant seems absurd. In a book on the consequences of automation published in 2014, before the recent bewildering advances of generative AI, the economists Erik Brynjolfsson and Andrew McAfee offered what seems a balanced assessment: “If the world’s best diagnostician in most specialties – radiology, pathology, oncology, and so on – is not already digital, it soon will be.” But doctors needn’t worry because “Most patients … don’t want to get their diagnosis from a machine.” No one wants a robot to tell them they have cancer.

A study just out in JAMA Internal Medicine calls into question this area of human supremacy. It compares the responses of physicians and OpenAI’s ChatGPT to patient inquiries posted to a public social media forum. The replies were assessed by a panel of licensed healthcare professionals for their quality and empathy. “Of the 195 questions and responses, evaluators preferred chatbot responses to physician responses in 78.6 percent.”

Perhaps this is not surprising. In a stressed medical system, we aren’t comparing machines with humans at their rested, empathic best. We are comparing hurried physicians with ChatGPT’s capacity to offer caring advice drawn from the 530GB of internet text, collected before September 2021, on which it was trained. That text contains many samples of people expressing concern and compassion for other humans.

Humans are surprisingly willing to treat machines that act human as if they really are. When Siri is unhelpful we don’t insult it, because that seems rude. If ChatGPT posts responses on an online forum that seem to express genuine concern, we may be willing to play along. Its responses may compare well with the hastily typed replies of humans under pressure to answer a quota of patient inquiries in a given period.

What happens when generative AIs get access to medical journals?

You still wouldn’t want ChatGPT to diagnose you. It is inclined to hallucinate, offering a mélange of useful advice and stuff that is made up. Academics complain that when ChatGPT is asked to justify its claims, it often cites non-existent articles. If you have a worrisome symptom, you don’t want Robo Doc to intersperse useful pointers with recommendations that are little more than quackery, included only because they were frequently repeated in the 530GB of text it was trained on.

The guardrails that OpenAI placed around ChatGPT mean that it knows its limitations. It doesn’t want to prescribe pills. “I am not a doctor, but I can provide some general information on possible causes for the symptoms you’ve mentioned. Keep in mind that this is not medical advice, and you should consult a healthcare professional for a proper diagnosis and treatment recommendations.”

It’s likely that much of its medical knowledge comes from Wikipedia, which it downloaded in its entirety. Just as you wouldn’t want me to offer serious medical advice about a heart murmur if my only source was the relevant Wikipedia page, you shouldn’t trust ChatGPT as it is in early 2023.

But if we look to a possibly near future, we can imagine something you might trust. First, we should not demand perfection from Robo Doc. We should demand something that performs at least as well as human doctors. Driverless cars will kill people, but a risk of serious accident lower than with human drivers is achievable. The errors of human drivers kill, but so do the errors of human doctors. Consider the many iatrogenic disorders, harms resulting from diagnostic and therapeutic procedures performed on patients, and ask whether a 2040 Robo Doc could improve on that record.

The information that generative AIs need to significantly improve their diagnostic capacities exists. Much of it is behind the paywalls of medical journals. Consider a generative AI with access to a significant subset of those journals. Doctors struggle to keep up with newly published research in their areas. A medical generative AI may do better. And since it will have access to the bibliographies of these journals, it is likely to get its references right.

There are challenges for this cosy picture of co-existence between AIs and humans. An August 2013 Economist story about accidents caused by pilots overly reliant on cockpit automation repeated a “hoary old story” about a “not too distant future” in which “airliners will have only two crew members on the flight deck – a pilot and a dog. The pilot’s job will be to feed the dog. The dog’s job will be to bite the pilot if he touches the controls.” Will the medical consulting rooms of the future also contain dogs whose job is to bite the doctor should they ever dare to override the advice of Robo Doc?

Is it relevant to ask who owns the journals that might power a 2040 Robo Doc? Among the most aggressive of these academic publishers is Elsevier. Its journals include the prestigious Lancet, Cell, and the Journal of the American College of Cardiology, among many others. Elsevier also runs an extensive online database, ScienceDirect, that could be made available to a proprietary generative AI.

Perhaps we should be asking how much a corporation like Elsevier, ruthlessly engineered for profit, might charge Aotearoa to fix its shortage of doctors by offering the services of its Robo Doc trained on all of the medical information in journals it might own by 2040.

What medicine in 2040 might be like

So what might a medical consult in 2040 be like? Recently we have marvelled at Carrie Fisher and Peter Cushing brought back from the dead to perform in Star Wars movies. What happens when we combine these capacities with generative AI’s powers to write as if it feels empathy, even if it actually feels nothing?

Perhaps your 2040 medical consult could be with an AI able to take the form of your favourite TV doctor – Dr Quinn, Medicine Woman, or Star Trek’s Dr Leonard McCoy. Its advice could come from a large language model trained on all the medical journals it has access to.

Is this humanless future one we should want?

Nicholas Agar is a Distinguished Visiting Professor at Carnegie Mellon University in Australia and Adjunct Professor of philosophy at Victoria University of Wellington.
