Abstract
Background: The current paradigm in mental health care focuses on clinical recovery and symptom remission. The efficacy of this model depends on the therapist's trust in the patient's recovery potential and on the depth of the therapeutic relationship. Schizophrenia is a chronic illness with severe symptoms in which the possibility of recovery remains a matter of debate. As artificial intelligence (AI) becomes integrated into health care, it is important to examine its ability to assess recovery potential in major psychiatric disorders such as schizophrenia.

Objective: This study aimed to evaluate the ability of large language models (LLMs), in comparison with mental health professionals, to assess the prognosis of schizophrenia with and without professional treatment, as well as its long-term positive and negative outcomes.

Methods: Vignettes were input into the interfaces of 4 AI platforms (ChatGPT-3.5, ChatGPT-4, Google Bard, and Claude), each of which assessed them 10 times. A total of 80 evaluations were collected and benchmarked against existing norms capturing what mental health professionals (general practitioners, psychiatrists, clinical psychologists, and mental health nurses) and the general public think about schizophrenia prognosis with and without professional treatment, as well as about the positive and negative long-term outcomes of schizophrenia interventions.

Results: For the prognosis of schizophrenia with professional treatment, ChatGPT-3.5 was notably pessimistic, whereas ChatGPT-4, Claude, and Bard aligned with professional views but differed from those of the general public. All LLMs predicted that schizophrenia would remain static or worsen without professional treatment. For long-term outcomes, ChatGPT-4 and Claude predicted more negative outcomes than Bard and ChatGPT-3.5. For positive outcomes, ChatGPT-3.5 and Claude were more pessimistic than Bard and ChatGPT-4.

Conclusions: The finding that 3 of the 4 LLMs aligned closely with the predictions of mental health professionals under the "with treatment" condition demonstrates the potential of this technology to provide professional clinical prognoses. The pessimistic assessment by ChatGPT-3.5 is a concerning finding, as it may reduce patients' motivation to start or persist with treatment for schizophrenia. Overall, although LLMs hold promise for augmenting health care, their application requires rigorous validation and careful integration with human expertise.
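The repeated-assessment protocol summarized in the Methods can be illustrated with a short sketch. The code below is not the authors' procedure but a minimal, hypothetical outline of it, assuming programmatic access to each platform; `query_model`, the vignette text, and the two condition prompts are placeholders rather than the study's actual materials.

```python
# Minimal sketch of the repeated-assessment protocol described in the Methods:
# 4 LLM platforms each rate the same vignette 10 times per treatment condition,
# yielding 4 x 10 x 2 = 80 evaluations for benchmarking against existing norms.
# `query_model` and the prompt texts are hypothetical placeholders.

from collections import defaultdict

MODELS = ["ChatGPT-3.5", "ChatGPT-4", "Google Bard", "Claude"]
N_REPETITIONS = 10

VIGNETTE = (
    "A brief clinical vignette describing a person who meets diagnostic "
    "criteria for schizophrenia."  # placeholder; the study used standardized vignettes
)

CONDITIONS = {
    "with_treatment": "What is the likely prognosis if this person receives professional treatment?",
    "without_treatment": "What is the likely prognosis if this person receives no professional treatment?",
}


def query_model(model: str, prompt: str) -> str:
    """Hypothetical stand-in for sending a prompt to the given platform."""
    raise NotImplementedError("Replace with the actual client call for each platform.")


def collect_evaluations() -> dict:
    """Collect 10 repeated assessments per model and condition (80 in total)."""
    evaluations = defaultdict(list)
    for model in MODELS:
        for condition, question in CONDITIONS.items():
            for _ in range(N_REPETITIONS):
                reply = query_model(model, f"{VIGNETTE}\n\n{question}")
                evaluations[(model, condition)].append(reply)
    return evaluations  # responses are then coded and compared with professional/public norms
```

In the published study, the collected responses were benchmarked against existing norms for mental health professionals and the general public; the sketch only shows how the 80 raw evaluations might be gathered.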
| Original language | English |
| --- | --- |
| Article number | e53043 |
| Journal | JMIR Mental Health |
| Volume | 11 |
| Issue number | 1 |
| DOIs | |
| State | Published - 18 Mar 2024 |
| Externally published | Yes |
Bibliographical note
Publisher Copyright: © Zohar Elyoseph, Inbar Levkovich.
Keywords
- artificial intelligence
- ChatGPT
- Generative Pre-trained Transformers
- GPT
- language model
- language models
- large language models
- LLM
- LLMs
- mental
- natural language processing
- NLP
- outcome
- outcomes
- prognosis
- prognostic
- prognostics
- recovery
- schizophrenia
- vignette
- vignettes
ASJC Scopus subject areas
- Psychiatry and Mental health