Bootstrapping semi-supervised annotation method for potential suicidal messages.

Authors

Acuña Caicedo Roberto Wellington, Gómez Soriano José Manuel, Melgar Sasieta Héctor Andrés

Affiliations

Information Technology Undergraduate Program, Universidad Estatal del Sur de Manabí, Jipijapa, Manabí, SENESCYT Scholarship Holder, Ecuador.

Department of Engineering, Computer Engineering Section, Graduate School, Pontificia Universidad Católica del Perú, Lima, Peru.

Publication information

Internet Interv. 2022 Feb 28;28:100519. doi: 10.1016/j.invent.2022.100519. eCollection 2022 Apr.

DOI:10.1016/j.invent.2022.100519
PMID:35281704
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8913319/
Abstract

Suicide is a tragedy that deeply affects families, communities, and countries. Based on the standardized worldwide suicide rate per number of inhabitants, approximately 903,450 suicides and 18,069,000 suicide attempts are projected for 2022, affecting people regardless of age, country, race, beliefs, social or economic status, or sex. Because users of social networks post their suicidal intentions, research has begun on detecting such messages and encouraging their authors not to take their own lives. This study aimed to define a semi-supervised method, based on a bootstrapping technique, to populate the Life Corpus by automatically detecting and classifying suicide- and depression-related texts extracted from social networks and forums, starting from an initial set of supervised samples. The experiments used two classifiers: a Support Vector Machine (SVM) with Bag of Words (BoW) features, with and without Term Frequency/Inverse Document Frequency (TF-IDF) term weighting and with and without stopwords; and Rasa with its default feature extraction system. The experiments were run on five data collections: Life, Reddit, Life+Reddit, Life_en, and Life_en + Reddit. With the semi-supervised method, the Life Corpus grew from 102 to 273 samples using texts from the social network Reddit; the Life+Reddit+BoW_Embeddings combination with the SVM classifier achieved a macro F1 of 0.80. The added texts were then evaluated manually by annotators, with a Cohen's Kappa level of agreement of 0.86.
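The bootstrapping idea in the abstract can be illustrated with a minimal sketch: train on the labeled seed samples, label the unlabeled pool, keep only confident predictions, and retrain. This is not the authors' implementation. The abstract does not state the acceptance criterion, so the confidence `threshold`, the `max_rounds` limit, and all function and class names here are hypothetical. To stay self-contained, the sketch swaps in a small bag-of-words Naive Bayes classifier in place of the SVM and Rasa classifiers used in the study.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer producing bag-of-words tokens."""
    return text.lower().split()


class NaiveBayes:
    """Tiny multinomial Naive Bayes (a stand-in, not the paper's SVM/Rasa)."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.classes for w in self.word_counts[c]}
        return self

    def predict_proba(self, text):
        """Return {label: posterior probability}, using Laplace smoothing."""
        total_docs = sum(self.doc_counts.values())
        log_scores = {}
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / total_docs)
            for w in tokenize(text):
                if w in self.vocab:  # ignore words never seen in training
                    score += math.log(
                        (self.word_counts[c][w] + 1)
                        / (total_words + len(self.vocab))
                    )
            log_scores[c] = score
        top = max(log_scores.values())  # subtract max for numerical stability
        exp = {c: math.exp(s - top) for c, s in log_scores.items()}
        norm = sum(exp.values())
        return {c: v / norm for c, v in exp.items()}


def bootstrap(seed_texts, seed_labels, unlabeled, threshold=0.8, max_rounds=5):
    """Grow the labeled set by repeatedly adding confident predictions.

    threshold and max_rounds are illustrative assumptions, not paper values.
    """
    texts, labels = list(seed_texts), list(seed_labels)
    pool = list(unlabeled)
    for _ in range(max_rounds):
        model = NaiveBayes().fit(texts, labels)
        accepted, remaining = [], []
        for text in pool:
            probs = model.predict_proba(text)
            label = max(probs, key=probs.get)
            if probs[label] >= threshold:
                accepted.append((text, label))
            else:
                remaining.append(text)
        if not accepted:  # nothing confident enough: stop early
            break
        for text, label in accepted:
            texts.append(text)
            labels.append(label)
        pool = remaining
    return texts, labels, pool


if __name__ == "__main__":
    # Toy data for illustration only; not drawn from the Life Corpus.
    seed_texts = [
        "i feel hopeless and alone",
        "i feel hopeless tonight",
        "great football match today",
        "lovely weather for football",
    ]
    seed_labels = ["risk", "risk", "neutral", "neutral"]
    unlabeled = ["hopeless and alone again", "football match tomorrow", "xyz"]
    texts, labels, pool = bootstrap(seed_texts, seed_labels, unlabeled)
    print(len(texts), labels[len(seed_labels):], pool)
```

Texts that never reach the threshold stay in the returned pool rather than being forced into a class. In the study, the automatically added texts were additionally checked by human annotators, a step this sketch does not model.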


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e0f/8913319/66365d2cc1d8/gr3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e0f/8913319/f34dd0dbf659/gr1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9e0f/8913319/ca80a6f5e689/gr2.jpg
