
dc.contributor.author: Pugh, Samuel L.
dc.contributor.author: Chandler, Chelsea
dc.contributor.author: Cohen, Alex S.
dc.contributor.author: Diaz-Asper, Catherine
dc.contributor.author: Elvevåg, Brita
dc.contributor.author: Foltz, Peter W.
dc.date.accessioned: 2024-11-05T14:18:20Z
dc.date.available: 2024-11-05T14:18:20Z
dc.date.issued: 2024-08-03
dc.description.abstract: Natural Language Processing (NLP) methods have shown promise for the assessment of formal thought disorder, a hallmark feature of schizophrenia in which disturbances to the structure, organization, or coherence of thought can manifest as disordered or incoherent speech. We investigated the suitability of modern Large Language Models (LLMs - e.g., GPT-3.5, GPT-4, and Llama 3) to predict expert-generated ratings for three dimensions of thought disorder (coherence, content, and tangentiality) assigned to speech samples collected from both patients with a diagnosis of schizophrenia (n = 26) and healthy control participants (n = 25). In addition to (1) evaluating the accuracy of LLM-generated ratings relative to human experts, we also (2) investigated the degree to which the LLMs produced consistent ratings across multiple trials, and we (3) sought to understand the factors that impacted the consistency of LLM-generated output. We found that machine-generated ratings of the level of thought disorder in speech matched favorably those of expert humans, and we identified a tradeoff between accuracy and consistency in LLM ratings. Unlike traditional NLP methods, LLMs were not always consistent in their predictions, but these inconsistencies could be mitigated with careful parameter selection and ensemble methods. We discuss implications for NLP-based assessment of thought disorder and provide recommendations of best practices for integrating these methods in the field of psychiatry. [en_US]
dc.identifier.citation: Pugh, Chandler, Cohen, Diaz-Asper, Elvevåg, Foltz. Assessing dimensions of thought disorder with large language models: The tradeoff of accuracy and consistency. Psychiatry Research. 2024;341 [en_US]
dc.identifier.cristinID: FRIDAID 2298270
dc.identifier.doi: 10.1016/j.psychres.2024.116119
dc.identifier.issn: 0165-1781
dc.identifier.issn: 1872-7123
dc.identifier.uri: https://hdl.handle.net/10037/35455
dc.language.iso: eng [en_US]
dc.publisher: Elsevier [en_US]
dc.relation.journal: Psychiatry Research
dc.rights.accessRights: openAccess [en_US]
dc.rights.holder: Copyright 2024 The Author(s) [en_US]
dc.rights.uri: https://creativecommons.org/licenses/by/4.0 [en_US]
dc.rights: Attribution 4.0 International (CC BY 4.0) [en_US]
dc.title: Assessing dimensions of thought disorder with large language models: The tradeoff of accuracy and consistency [en_US]
dc.type.version: publishedVersion [en_US]
dc.type: Journal article [en_US]
dc.type: Tidsskriftartikkel [en_US]
dc.type: Peer reviewed [en_US]


Except where otherwise noted, this item's license is described as Attribution 4.0 International (CC BY 4.0)