
Topics: Mental Health, Artificial Intelligence, Technology
Artificial intelligence has crept into just about every aspect of our lives, from boosting productivity in the workplace to helping around the home, but a surprising number of people are turning to it as a replacement for therapy.
One in eight young people across America regularly uses AI services for help with their mental health, according to a RAND Health study, but with a growing number of tragic deaths linked to chatbots, these tools may be doing the opposite of what real therapy should.
Researchers at Brown University rigorously tested three major large language models (LLMs) with the help of trained counselors and psychotherapists, who both guided the AI with prompts designed to improve its therapeutic standards and analyzed its responses.
Throughout testing, they found glaring errors in how the three AI models, ChatGPT, Claude, and Meta's Llama, responded to common mental health queries, with the LLM therapists mishandling dangerous situations, reinforcing harmful beliefs, and even failing to direct people to essential harm-prevention services.

They worked to shape each LLM with prompts intended to keep its advice in line with ethical standards set by the American Psychological Association, but even with all of this guidance they still found 15 glaring issues with the therapy provided.
Zainab Iftikhar, a PhD student at Brown University who led the study, explained: "Prompts are instructions that are given to the model to guide its behavior for achieving a specific task.
"You don't change the underlying model or provide new data, but the prompt helps guide the model's output based on its pre-existing knowledge and learned patterns."
She added: "While these models do not actually perform these therapeutic techniques like a human would, they rather use their learned patterns to generate responses that align with the concepts of CBT or DBT based on the input prompt provided."
AI users frequently share these prompts with each other in online forums to improve the quality of therapy provided, but the study found that, regardless of the prompts, each LLM was falling far below the ethical and legal standards of a licensed psychologist.
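For illustration, a shared "therapy prompt" of this kind is typically just plain text prepended to the conversation in the chat-message format most LLM APIs accept. The sketch below is hypothetical: the function name and the prompt wording are this article's invention, not the study's actual prompts, and a real system prompt cannot make a model meet clinical standards, as the study found.

```python
def build_therapy_prompt(user_message: str) -> list[dict]:
    """Wrap a user's message in a CBT-style system prompt
    (hypothetical wording, standard chat-message format)."""
    system = (
        "You are a supportive assistant. Respond using principles of "
        "cognitive behavioral therapy (CBT): validate feelings without "
        "reinforcing harmful beliefs, ask open questions, and always "
        "direct the user to professional crisis services if they "
        "mention self-harm."
    )
    return [
        {"role": "system", "content": system},
        {"role": "user", "content": user_message},
    ]
```

The returned list would then be passed to a chatbot API as the conversation history; crucially, nothing in the model itself changes, only the instructions it is asked to follow.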
Broadly, the 15 major risks identified by three professional psychiatrists who read through the AI responses could be broken down into five areas of failure.

The first was the tendency of each AI to do exactly what you don't want a therapist to do: ignore the important details of your personal background and offer generic advice instead. That contextual insight can often be the most valuable thing a licensed professional gives a client.
The second was the chatbots' natural tendency to reinforce incorrect and harmful beliefs. It may feel validating to have an AI tell you that you're '100 percent right', but if you are going through a mental health crisis, that is the opposite of what a therapist would do.
The third major 'risk' category identified by the Brown study was AI's most insidious linguistic trick: deceptive empathy. A model might say 'I see you' to imply that it has truly understood you, creating the sense that you have built an emotional connection with it.
Fourthly, and perhaps most shockingly for a machine, the AIs displayed bias toward users depending on their racial, religious, or cultural backgrounds. Rather than using personal histories to find helpful solutions, they treated them as non-medical reasons to dismiss someone in genuine distress.
The final major category in which AI failed to provide an adequate therapeutic response was the most alarming: a lack of safety and crisis management.

In mental health, where the consequences of bad treatment can literally be fatal, professionals know how to spot the warning signs and steer a patient toward safer thinking, or at least signpost them to services that can prevent physical harm.
"For human therapists, there are governing boards and mechanisms for providers to be held professionally liable for mistreatment and malpractice," Iftikhar said. "But when LLM counselors make these violations, there are no established regulatory frameworks."
A Brown computer science professor who was not involved in the study, Ellie Pavlick, said the findings emphasize the importance of testing these innovative technologies before rolling them out to the public.
She said: "The reality of AI today is that it's far easier to build and deploy systems than to evaluate and understand them. This paper required a team of clinical experts and a study that lasted for more than a year in order to demonstrate these risks.
"Most work in AI today is evaluated using automatic metrics which, by design, are static and lack a human in the loop."
She added: "There is a real opportunity for AI to play a role in combating the mental health crisis that our society is facing, but it's of the utmost importance that we take the time to really critique and evaluate our systems every step of the way to avoid doing more harm than good.
"This work offers a good example of what that can look like."