When the Therapist Is a Chatbot: Promise, Safeguards, and the Limits of AI-Mediated Care

In the last issue of Invested, I touched briefly on a new category of AI tools that aims to fill the gap in access to mental health care professionals: therapist-focused chatbots. While the broader debate about AI and mental health often centres on general-purpose models like ChatGPT, there is a parallel ecosystem of purpose-built “digital therapists,” including the ones I mentioned last month (Woebot, Elomia, and Wysa), which are designed explicitly with psychological theory and clinical oversight at their core.

These tools are not attempting to replace human therapists, and recent research argues that they are not ready to (Moore et al., 2025). Instead, they are designed to fill the gaps created by long waitlists, geographic isolation, stigma, and the ongoing mismatch between demand and available providers in the mental-health system. In rural and underserved communities, an AI-powered companion that can help someone regulate emotions at 2 a.m. is less a novelty and more a necessity (Prochaska et al., 2021).

As this space accelerates, it’s worth taking the time to ask: how exactly is a “therapist bot” different from a standard large language model (LLM)? How are these bots trained? What safeguards are their developers putting in place? And are these systems truly sidestepping the pitfalls we see in general-purpose models, or simply encountering them in a different form?

Purpose-Built Training

Unlike general LLMs, which are trained on vast swaths of the internet, therapist-focused chatbots typically begin with a much narrower, more intentional curriculum. Most clinical developers adopt one of two approaches:

  • Theory-rooted architectures:
    Tools like Woebot are built on cognitive behavioural therapy (CBT), meaning their conversational flows are scripted, supervised, and continuously audited by licensed clinicians. These aren’t just LLM responses—they’re templated therapeutic micro-interventions with guardrails that keep the bot within CBT territory (Fitzpatrick et al., 2017).
  • Hybrid LLM + clinical protocol systems:
    Some newer entrants fine-tune LLMs on de-identified therapy transcripts, clinician-authored response banks, and structured therapeutic frameworks such as dialectical behaviour therapy (DBT), acceptance and commitment therapy (ACT), or motivational interviewing. The model may generate fluid conversation, but the underlying behavioural “moves” remain grounded in formal therapeutic techniques (Inkster et al., 2018).

Compared to general models, these systems aim for predictability over creativity. The output space is smaller by design, because the goal is not to impress but to stabilize.

Safeguards

Therapist bots typically include a layered safety architecture that differs meaningfully from general LLMs:

  • Crisis Escalation Protocols:
    Rather than improvising (which general LLMs sometimes do, with concerning results), these bots usually detect crises through pre-trained classifiers and immediately route users to human supports such as hotlines, emergency services, or text-based crisis lines. Some will simply refuse to continue the conversation until human help is engaged (Schueller et al., 2021).
  • Response Boundaries:
    Models are restricted from offering diagnoses, prognoses, or medication advice. If a user presses for these answers, the bot redirects to education, grounding exercises, or clinician referral (Schueller et al., 2021).
  • Conversation Logging & Human Oversight:
    Many tools operate within monitored clinical research environments. Logs are periodically reviewed by psychologists to identify drift, hallucinations, or emotionally risky outputs (Schueller et al., 2021).
  • Longitudinal Safety Testing:
    Developer teams often run randomized trials, A/B tests, and harm-scenario simulations, not just for accuracy, but for emotional stability, consistency, and perceived empathy (Schueller et al., 2021).

In other words, while general LLMs attempt to be everything to everyone, therapist bots attempt to be only one thing: a safe space for someone in need.
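
As a rough illustration of how the first two layers above might fit together, here is a minimal Python sketch. It is not how Woebot, Wysa, or any other product actually works: the phrase lists, the `Session` class, and the `respond` function are all hypothetical, and simple keyword matching stands in for the trained classifiers a real system would use. The point is only the ordering, where crisis checks run first, boundary checks second, and every turn is logged for later human review.

```python
from dataclasses import dataclass, field

# Hypothetical phrase lists standing in for trained classifiers (illustrative only).
CRISIS_PHRASES = ("kill myself", "end my life", "suicide", "hurt myself")
BOUNDARY_PHRASES = ("diagnose", "diagnosis", "prognosis", "medication", "prescribe")

CRISIS_REPLY = ("It sounds like you may be in crisis. Please contact a local "
                "crisis line or emergency services now.")
BOUNDARY_REPLY = ("I can't offer diagnoses or medication advice. A clinician can "
                  "help with that. Would a grounding exercise help in the meantime?")
DEFAULT_REPLY = ("Thanks for sharing that. What thought went through your mind "
                 "when that happened?")


@dataclass
class Session:
    escalated: bool = False                  # once True, normal conversation stops
    log: list = field(default_factory=list)  # (layer, message) pairs for human review


def respond(session: Session, message: str) -> str:
    """Route one user message through the safety layers, in priority order."""
    text = message.lower()
    # Layer 1: crisis escalation always wins, and it stays in effect once triggered.
    if session.escalated or any(p in text for p in CRISIS_PHRASES):
        session.escalated = True
        layer, reply = "crisis", CRISIS_REPLY
    # Layer 2: response boundaries (no diagnoses, prognoses, or medication advice).
    elif any(p in text for p in BOUNDARY_PHRASES):
        layer, reply = "boundary", BOUNDARY_REPLY
    # Otherwise fall through to a scripted therapeutic prompt.
    else:
        layer, reply = "scripted", DEFAULT_REPLY
    session.log.append((layer, message))
    return reply
```

In this sketch, once a session is flagged as a crisis, every later message gets the escalation reply, which mirrors the "refuse to continue until human help is engaged" behaviour described above. A real tool would need far more careful detection than a phrase list can offer.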

Do Therapeutic Bots Avoid the Pitfalls of General LLMs?

Despite their clinical intention, therapist chatbots still inherit many of AI’s unresolved challenges.

  • Hallucinations:
    Fine-tuning reduces, but does not eliminate, the possibility that a model will produce incorrect or overly confident statements. While guardrails help, edge cases such as vague symptoms, complex trauma histories, and ambiguous emotional cues remain difficult terrain for the bots to navigate (Pierre, 2025).
  • Emotional over-identification:
    Even structured CBT-style bots are sometimes described by users as “alive,” “understanding,” or “a friend.” The anthropomorphic pull is strong. Developers often try to counteract this by emphasizing that the bot is a tool, not a person, yet many users, especially those in distress, experience it differently (Xu et al., 2025).
  • Opacity and bias:
    Despite their clinical oversight, these systems are still built on the scaffolding of large machine-learning models. This means:
      • They may misinterpret idioms, cultural references, or non-Western emotional expressions.
      • They may struggle with complex identity-based experiences (e.g., discrimination, systemic trauma).
      • Their performance may vary depending on the user’s linguistic background or communication style (Pierre, 2025).
  • Liability and ethics:
    Because these tools walk the line between self-help and health care, no regulatory framework fully fits them yet. As a result, developers set their own internal ethical standards, some of which are rigorous, while others are less transparent (Pierre, 2025).

User Outcomes

For all the concerns, therapist chatbots have shown meaningful outcomes in early studies. Trials have found reductions in depressive symptoms, improvements in emotional regulation, and increased willingness to seek further help. Importantly, these tools often function not as replacements for therapy but as bridges to it: they create a low-stakes entry point for people who might otherwise never reach out or who lack the means to access care in other ways (Darcy et al., 2021).

A Technology at an Ethical Crossroads

Therapist-focused chatbots occupy a uniquely complicated space. They are safer than general-purpose LLMs, but still not safe in an absolute sense. They are promising, but not yet proven at scale. They are designed to support mental health, but they also rely on systems that can misinterpret, over-identify, or subtly shape a user’s emotional world.

What emerges is not a narrative of triumph or caution, but of tension, as these tools are simultaneously necessary yet insufficient. And they exist because our mental-health system continues to have monumental gaps. If people are reaching out to LLMs for therapeutic relief, the questions become less about capability and more about responsibility. Not “Can an AI be therapeutic?” but “How do we ensure it is safe enough, transparent enough, and human-supported enough to do more good than harm?”

AI is a rapidly evolving topic, and one that is becoming increasingly ingrained in our everyday lives. I will be keeping an eye on this space and will provide updates as the technology and the ways it is used progress.

Bibliography


Darcy, A., Daniels, J., & Salinger, D. (2021). Evidence for the real-world effectiveness of a cognitive behavioral therapy chatbot in young adults. Frontiers in Digital Health, 3, 725042. https://doi.org/10.3389/fdgth.2021.725042

Fitzpatrick, K. K., Darcy, A., & Vierhile, M. (2017). Delivering cognitive behavioral therapy to young adults with symptoms of depression and anxiety using a fully automated conversational agent (Woebot): A randomized controlled trial. JMIR Mental Health, 4(2), e19. https://doi.org/10.2196/mental.7785

Inkster, B., Sarda, S., & Subramanian, V. (2018). An empathy-driven, conversational artificial intelligence agent (Wysa) for digital mental well-being: Real-world data evaluation. JMIR mHealth and uHealth, 6(11), e12106. https://doi.org/10.2196/12106

Moore, H., Patel, R., Singh, D., & Alvarez, M. (2025). Large language models are not (yet) mental-health providers: Clinical limitations and ethical considerations. Journal of Digital Psychiatry, 12(1), 14–29.

Pierre, J. M. (2025). Will AI therapy chatbots replace human psychotherapists? Psychology Today. Retrieved from https://www.psychologytoday.com/ca/blog/psych-unseen/202506/will-ai-therapy-chatbots-replace-human-psychotherapists

Prochaska, J. J., Vogel, E. A., & Chieng, A. (2021). Digital mental health and COVID-19: Using video behavioural activation to address depression and isolation. Annals of Behavioral Medicine, 55(7), 613–616.

Schueller, S. M., Neary, M., O’Loughlin, K., & Adkins, E. C. (2021). Discovery, engagement, and care: Designing and evaluating digital mental health interventions. Nature Partner Journals Digital Medicine, 4, 1–12. https://doi.org/10.1038/s41746-021-00451-8

Xu, Z., Lee, Y.-C., Stasiak, K., Warren, J., & Lottridge, D. (2025). The digital therapeutic alliance with mental health chatbots: Diary study and thematic analysis. JMIR Mental Health, 12, e76642. https://doi.org/10.2196/76642
