Baghdadi Nadiah A, Malki Amer, Magdy Balaha Hossam, AbdulAzeem Yousry, Badawy Mahmoud, Elhosseini Mostafa
Nursing Management and Education Department, College of Nursing, Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia.
College of Computer Science and Engineering, Taibah University, Yanbu, Saudi Arabia.
PeerJ Comput Sci. 2022 Aug 23;8:e1070. doi: 10.7717/peerj-cs.1070. eCollection 2022.
Many people worldwide suffer from mental illnesses such as major depressive disorder (MDD), which affect their thoughts, behavior, and quality of life. Suicide, a risk when such illnesses go untreated, is regarded as the second leading cause of death among teenagers. Twitter is a platform where users express their emotions and thoughts on many subjects, and many studies, including this one, suggest using social media data to track depression and other mental illnesses. Even though Arabic is widely spoken and has a complex syntax, depression detection methods have not been applied to the language. This study first scrapes and annotates a dataset of Arabic tweets, then proposes a complete framework for classifying input tweets into two classes (such as Normal or Suicide). The article also proposes an Arabic tweet preprocessing algorithm that compares lemmatization, stemming, and various lexical analysis methods. Experiments are conducted on Twitter data scraped from the Internet and labeled by five annotators. Performance is reported on the proposed dataset using recent Bidirectional Encoder Representations from Transformers (BERT) and Universal Sentence Encoder (USE) models. The reported metrics are balanced accuracy, specificity, F1-score, intersection over union (IoU), ROC, Youden Index, negative predictive value (NPV), and a weighted sum metric (WSM). The best WSM is 80.2% for the USE models and 95.26% for the Arabic BERT models.