Ahmed Usman, Mukhiya Suresh Kumar, Srivastava Gautam, Lamo Yngve, Lin Jerry Chun-Wei
Electrical Engineering and Mathematical Sciences, Western Norway University of Applied Sciences, Bergen, Norway.
Department of Mathematics and Computer Science, Brandon University, Brandon, MB, Canada.
Front Psychol. 2021 Mar 30;12:642347. doi: 10.3389/fpsyg.2021.642347. eCollection 2021.
With the increasing prevalence of Internet usage, Internet-Delivered Psychological Treatment (IDPT) has become a valuable tool for developing improved treatments for mental disorders. IDPT is complicated and labor-intensive because emotions in mental health overlap. Creating a usable learning application for IDPT requires diverse labeled datasets containing an adequate set of linguistic properties from which to extract word representations and segmentations of emotions. In medical applications, it is challenging to refine such datasets because emotion-aware labeling is time-consuming. Other known issues include vocabulary size per class, data source, method of creation, and the baseline for human-level performance. This paper focuses on the application of personalized mental health interventions using Natural Language Processing (NLP) and attention-based in-depth entropy active learning. The objective of this research is to increase the number of trainable instances using a semantic clustering mechanism. For this purpose, we propose a method based on synonym expansion by semantic vectors. Semantic vectors, built from the semantic information in the context in which each word appears, are clustered. The resulting similarity metrics are used to select a subset of the unlabeled text. The proposed method separates this unlabeled text and includes it in the next cycle of the active learning mechanism, and the model is retrained on the new training points. The cycle continues until it reaches an optimal solution and has converted all the unlabeled text into the training set. Our in-depth experimental results show that synonym expansion with semantic vectors helps enhance training accuracy without harming the results. The bidirectional Long Short-Term Memory (LSTM) architecture with an attention mechanism achieved a Receiver Operating Characteristic (ROC) score of 0.85 on the blind test set. The learned embedding is then used to visualize each activated word's contribution to each symptom and to assess qualitative agreement with a psychiatrist. Our method improves the detection rate of depression symptoms from online forum text by making use of the unlabeled forum texts.
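The selection cycle described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the way entropy and similarity are combined, the function names (`entropy`, `cosine`, `select_batch`, `active_learning_loop`), the parameters `k` and `min_sim`, and the idea of one seed vector per symptom are all assumptions made for illustration. The sketch keeps only unlabeled items that are close to a seed vector, ranks them by the entropy of the current model's prediction, moves a batch into the training set, and repeats.

```python
import math

def entropy(probs):
    """Shannon entropy (in nats) of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def select_batch(pool, seed_vectors, predict_proba, k=2, min_sim=0.5):
    """Return up to k (index, label) pairs from the pool.

    An item qualifies when its best cosine similarity to a seed vector
    is at least min_sim (assumed threshold); qualifying items are ranked
    by prediction entropy, most uncertain first.
    """
    candidates = []
    for idx, vec in enumerate(pool):
        label, sim = max(
            ((name, cosine(vec, seed)) for name, seed in seed_vectors.items()),
            key=lambda pair: pair[1],
        )
        if sim >= min_sim:
            candidates.append((entropy(predict_proba(vec)), idx, label))
    candidates.sort(reverse=True)
    return [(idx, label) for _, idx, label in candidates[:k]]

def active_learning_loop(unlabeled, seed_vectors, predict_proba, k=2, min_sim=0.5):
    """Move unlabeled vectors into the training set, one batch per cycle."""
    pool = list(unlabeled)
    training = []
    while pool:
        batch = select_batch(pool, seed_vectors, predict_proba, k, min_sim)
        if not batch:
            break  # nothing left that is similar enough to any seed
        training.extend((pool[idx], label) for idx, label in batch)
        chosen = {idx for idx, _ in batch}
        pool = [vec for i, vec in enumerate(pool) if i not in chosen]
        # A real system would retrain the model on `training` here,
        # so that predict_proba changes between cycles.
    return training, pool

# Toy usage with hypothetical two-dimensional "semantic vectors".
seeds = {"sad": [1.0, 0.0], "sleep": [0.0, 1.0]}
texts = [[0.9, 0.1], [0.1, 0.9], [-1.0, -1.0]]
training, remaining = active_learning_loop(texts, seeds, lambda vec: [0.5, 0.5])
```

In this toy run the first two vectors are assigned to the nearest seed and the third stays in the pool because it is dissimilar to both seeds. The abstract describes continuing until all unlabeled text has been converted, so the early stop here is a simplification of the sketch.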