Enabling Early Health Care Intervention by Detecting Depression in Users of Web-Based Forums using Language Models: Longitudinal Analysis and Evaluation.

Author information

Owen David, Antypas Dimosthenis, Hassoulas Athanasios, Pardiñas Antonio F, Espinosa-Anke Luis, Collados Jose Camacho

Affiliations

School of Computer Science and Informatics, Cardiff University, Cardiff, United Kingdom.

Centre for Medical Education, School of Medicine, Cardiff University, Cardiff, United Kingdom.

Publication information

JMIR AI. 2023 Mar 24;2:e41205. doi: 10.2196/41205.

DOI:10.2196/41205

PMID:37525646

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7614849/

Abstract

BACKGROUND

Major depressive disorder is a common mental disorder affecting 5% of adults worldwide. Early contact with health care services is critical for achieving accurate diagnosis and improving patient outcomes. Key symptoms of major depressive disorder (depression hereafter) such as cognitive distortions are observed in verbal communication, which can also manifest in the structure of written language. Thus, the automatic analysis of text outputs may provide opportunities for early intervention in settings where written communication is rich and regular, such as social media and web-based forums.

OBJECTIVE

The objective of this study was 2-fold. We sought to gauge the effectiveness of different machine learning approaches to identify users of the mass web-based forum Reddit, who eventually disclose a diagnosis of depression. We then aimed to determine whether the time between a forum post and a depression diagnosis date was a relevant factor in performing this detection.

METHODS

A total of 2 Reddit data sets containing posts belonging to users with and without a history of depression diagnosis were obtained. The intersection of these data sets provided users with an estimated date of depression diagnosis. This derived data set was used as an input for several machine learning classifiers, including transformer-based language models (LMs).
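As a rough illustration of how intersecting two data sets can yield users with both a post history and an estimated diagnosis date, the following is a minimal sketch. The dictionary layout, user names, dates, and the helper `intersect_users` are all hypothetical and are not taken from the study.

```python
from datetime import date

# Hypothetical inputs: one data set maps users to their posts, the other
# maps users to an estimated diagnosis date. All values are made up.
posts_by_user = {
    "user_a": ["post 1", "post 2"],
    "user_b": ["post 3"],
    "user_c": ["post 4"],
}
diagnosis_date_by_user = {
    "user_a": date(2020, 5, 1),
    "user_c": date(2019, 11, 15),
    "user_d": date(2021, 1, 10),
}


def intersect_users(posts, dates):
    """Keep only users found in both data sets, pairing posts with a date."""
    shared = sorted(posts.keys() & dates.keys())
    return {
        user: {"posts": posts[user], "diagnosis_date": dates[user]}
        for user in shared
    }


derived = intersect_users(posts_by_user, diagnosis_date_by_user)
```

Users that appear in only one of the two inputs (here `user_b` and `user_d`) are dropped, so every remaining record can be placed on a timeline relative to its estimated diagnosis date.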

RESULTS

Bidirectional Encoder Representations from Transformers (BERT) and MentalBERT transformer-based LMs proved the most effective in distinguishing forum users with a known depression diagnosis from those without. They each obtained a mean F1-score of 0.64 across the experimental setups used for binary classification. The results also suggested that the final 12 to 16 weeks (about 3-4 months) of posts before a depressed user's estimated diagnosis date are the most indicative of their illness, with data before that period not helping the models detect more accurately. Furthermore, in the 4- to 8-week period before the user's estimated diagnosis date, their posts exhibited more negative sentiment than any other 4-week period in their post history.
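To make the time-window and scoring ideas concrete, here is a minimal sketch that assigns a post to a 4-week window counted back from the estimated diagnosis date and computes a binary F1-score. The function names, the window convention (index 0 is the most recent 4 weeks), and the positive-class F1 definition are assumptions for illustration; the study may have bucketed posts or averaged scores differently.

```python
from datetime import date


def window_index(post_date, diagnosis_date, window_weeks=4):
    """Return which window before diagnosis a post falls in.

    Index 0 covers the last `window_weeks` weeks before the diagnosis date,
    index 1 the window before that, and so on. Posts dated after the
    diagnosis date return None.
    """
    delta_days = (diagnosis_date - post_date).days
    if delta_days < 0:
        return None
    return delta_days // (7 * window_weeks)


def f1_score(y_true, y_pred):
    """Binary F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# A post 30 days before a hypothetical diagnosis date lands in window 1,
# which corresponds to the 4- to 8-week period.
example_window = window_index(date(2020, 4, 1), date(2020, 5, 1))
example_f1 = f1_score([1, 1, 0, 0], [1, 0, 1, 0])
```

Under this convention, the 12- to 16-week period mentioned above would be window index 3, so restricting a classifier's input to indices 0 through 3 is one way to test whether older posts add any signal.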

CONCLUSIONS

Transformer-based LMs may be used on data from web-based social media forums to identify users at risk for psychiatric conditions such as depression. Language features picked up by these classifiers might predate depression onset by weeks to months, enabling proactive mental health care interventions to support those at risk for this condition.
