Psychiatric symptom recognition without labeled data using distributional representations of phrases and on-line knowledge.

Affiliations

School of Biomedical Informatics, The University of Texas Health Science Center at Houston, Houston, TX 77030, USA.

St. John's School, Houston, TX 77019, USA.

Publication information

J Biomed Inform. 2017 Nov;75S:S129-S137. doi: 10.1016/j.jbi.2017.06.014. Epub 2017 Jun 15.

DOI:10.1016/j.jbi.2017.06.014
PMID:28624644
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5705397/
Abstract

OBJECTIVE

Mental health is becoming an increasingly important topic in healthcare. Psychiatric symptoms, which consist of subjective descriptions of the patient's experience, as well as the nature and severity of mental disorders, are critical to support the phenotypic classification for personalized prevention, diagnosis, and intervention of mental disorders. However, few automated approaches have been proposed to extract psychiatric symptoms from clinical text, mainly due to (a) the lack of annotated corpora, which are time-consuming and costly to build, and (b) the inherent linguistic difficulties that symptoms present as they are not well-defined clinical concepts like diseases. The goal of this study is to investigate techniques for recognizing psychiatric symptoms in clinical text without labeled data. Instead, external knowledge in the form of publicly available "seed" lists of symptoms is leveraged using unsupervised distributional representations.

MATERIALS AND METHODS

First, psychiatric symptoms are collected from three online repositories of consumer healthcare knowledge (MedlinePlus, Mayo Clinic, and the American Psychiatric Association) for use as seed terms. Candidate symptoms are automatically extracted from psychiatric notes using phrasal syntax patterns; the 2016 CEGS N-GRID challenge data serves as the psychiatric note corpus. Second, three corpora (psychiatric notes, psychiatric forum data, and MIMIC II) are used to generate distributional representations with paragraph2vec. Finally, semantic similarity between the distributional representations of the seed symptoms and candidate symptoms is calculated to assess the relevance of a phrase. Experiments were performed on a set of psychiatric notes from the CEGS N-GRID 2016 Challenge.
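The final step described above (comparing candidate phrases with seed symptoms in a shared vector space) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the three-dimensional vectors are invented toy values standing in for paragraph2vec output, and the choices to score a candidate by its maximum cosine similarity to any seed and to cut off at a threshold of 0.8 are assumptions, since the abstract does not say how similarities are aggregated or thresholded. The names `score_candidates` and `select_symptoms` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def score_candidates(candidate_vecs, seed_vecs):
    """Score each candidate by its highest similarity to any seed (assumed aggregation)."""
    return {
        phrase: max(cosine(vec, seed) for seed in seed_vecs.values())
        for phrase, vec in candidate_vecs.items()
    }


def select_symptoms(candidate_vecs, seed_vecs, threshold=0.8):
    """Return (phrase, score) pairs at or above the assumed threshold, best first."""
    scores = score_candidates(candidate_vecs, seed_vecs)
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [(phrase, score) for phrase, score in ranked if score >= threshold]


# Toy vectors invented for illustration; real ones would come from a trained
# distributional model such as paragraph2vec.
SEED_VECS = {
    "depressed mood": [0.9, 0.1, 0.0],
    "trouble sleeping": [0.1, 0.9, 0.0],
}
CANDIDATE_VECS = {
    "feeling down": [0.8, 0.2, 0.1],
    "cannot fall asleep": [0.2, 0.8, 0.1],
    "left knee": [0.0, 0.1, 0.9],
}

selected = select_symptoms(CANDIDATE_VECS, SEED_VECS)
for phrase, score in selected:
    print(f"{phrase}: {score:.3f}")
```

With these toy values, "feeling down" and "cannot fall asleep" are kept even though they share no words with the seed terms, while "left knee" is rejected. That mirrors the kind of no-word-overlap match the abstract reports, though only on made-up data.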

RESULTS & CONCLUSION

Our method demonstrates good performance at extracting symptoms from an unseen corpus, including symptoms with no word overlap with the provided seed terms. Semantic similarity based on the distributional representation outperformed baseline methods. Our experiment yielded two interesting results. First, distributional representations built from social media data outperformed those built from clinical data. Second, the distributional representation model built from sentences resulted in better representations of phrases than the model built from phrases alone.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fc63/5705397/d88362ceeac2/nihms890851f1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fc63/5705397/5c2c0c3caebd/nihms890851f2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fc63/5705397/d056b8638018/nihms890851f3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fc63/5705397/9ecf59aea43f/nihms890851f4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fc63/5705397/446e8afd2ad0/nihms890851f5.jpg
