Disease causality extraction based on lexical semantics and document-clause frequency from biomedical literature.

Authors

Lee Dong-Gi, Shin Hyunjung

Affiliation

Department of Industrial Engineering, Ajou University, 206 Worldcup-ro, Yeongtong-gu, Suwon, 16499, South Korea.

Publication

BMC Med Inform Decis Mak. 2017 May 18;17(Suppl 1):53. doi: 10.1186/s12911-017-0448-y.

DOI: 10.1186/s12911-017-0448-y
PMID: 28539124
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5444051/

Abstract

BACKGROUND

Research on human disease networks has recently succeeded in helping to clarify the relationships among diseases. In most disease networks, however, the relationship between diseases is represented simply as an association, which makes it difficult to identify prior diseases and their influence on posterior diseases. In this paper, we propose a causal disease network that captures disease causality through text mining of the biomedical literature.

METHODS

To identify causality between diseases, the proposed method combines two schemes. The first, lexicon-based causality term strength, assigns a causal strength to each of a variety of causality terms based on lexicon analysis. The second, frequency-based causality strength, determines the direction and strength of causality from document and clause frequencies in the literature.
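The two schemes can be pictured with a small sketch. The abstract does not give the formulas, so everything concrete below is an assumption made for illustration: the term weights in `TERM_STRENGTH`, the `Clause` record, the toy clauses, and the scoring rule (sum of term weights over matching clauses, scaled by the share of documents containing such a clause) are hypothetical stand-ins, not the authors' definitions.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical lexicon: causality term -> assumed strength in [0, 1].
TERM_STRENGTH = {"causes": 1.0, "leads to": 0.8, "is associated with": 0.3}


@dataclass(frozen=True)
class Clause:
    """One extracted clause of the form '<cause> <term> <effect>' (assumed shape)."""
    doc_id: str
    cause: str
    term: str
    effect: str


def causal_strength(clauses, n_docs):
    """Score every directed pair (cause, effect).

    Assumed rule, not taken from the paper:
    (sum of term strengths over clauses) * (documents with the pair / n_docs).
    """
    term_sum = defaultdict(float)
    docs = defaultdict(set)
    for c in clauses:
        pair = (c.cause, c.effect)
        term_sum[pair] += TERM_STRENGTH.get(c.term, 0.0)  # unknown terms add nothing
        docs[pair].add(c.doc_id)
    return {pair: term_sum[pair] * len(docs[pair]) / n_docs for pair in term_sum}


def direction(scores, a, b):
    """Return the higher-scoring direction between a and b, or None on a tie."""
    ab = scores.get((a, b), 0.0)
    ba = scores.get((b, a), 0.0)
    if ab == ba:
        return None
    return (a, b) if ab > ba else (b, a)


# Toy input: three clauses drawn from a corpus of four documents.
clauses = [
    Clause("d1", "obesity", "causes", "diabetes"),
    Clause("d2", "obesity", "leads to", "diabetes"),
    Clause("d2", "diabetes", "is associated with", "obesity"),
]
scores = causal_strength(clauses, n_docs=4)
print(direction(scores, "obesity", "diabetes"))
```

In this toy setup a strong term seen in several documents outweighs a weak term seen once, which is the intuition behind combining term strength with document and clause frequency.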

RESULTS

We applied the proposed method to 6,617,833 PubMed articles and chose 195 diseases to construct a causal disease network. From all possible pairs of disease nodes in the network, 1,011 causal pairs involving 149 diseases were extracted. The resulting network was compared with that of a previous study. The proposed method outperformed the existing one in both coverage and quality: it identified 2.7 times more causalities and showed a higher correlation with associated diseases.
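For a sense of scale, the size of the candidate space can be worked out from the numbers above. This is a back-of-the-envelope sketch that assumes "all possible pairs" means pairs of distinct diseases; the abstract does not say whether the pairs were counted as ordered or unordered, so both counts are shown.

```python
n_diseases = 195
causal_pairs = 1011

# Ordered pairs: A -> B and B -> A count separately, self-pairs excluded.
ordered_pairs = n_diseases * (n_diseases - 1)
# Unordered pairs: each pair of distinct diseases counted once.
unordered_pairs = ordered_pairs // 2

share_of_ordered = causal_pairs / ordered_pairs

print(ordered_pairs, unordered_pairs)
print(f"{share_of_ordered:.1%} of ordered pairs were extracted as causal")
```

Under the ordered-pair assumption, the extracted causal pairs are a small fraction of the candidates.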

CONCLUSIONS

The novelty of this research is twofold: the proposed method circumvents the time and cost limitations of examining all possible causalities in biological experiments, and it advances text mining by defining the concept of causality term strength.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ce8/5444051/1023d121eec2/12911_2017_448_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ce8/5444051/0a6e0f1c2322/12911_2017_448_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ce8/5444051/e5b3744a305c/12911_2017_448_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ce8/5444051/2035d90ae125/12911_2017_448_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0ce8/5444051/9df1983a6291/12911_2017_448_Fig5_HTML.jpg

Similar articles

1
Disease causality extraction based on lexical semantics and document-clause frequency from biomedical literature.
BMC Med Inform Decis Mak. 2017 May 18;17(Suppl 1):53. doi: 10.1186/s12911-017-0448-y.
2
Extracting causal relations from the literature with word vector mapping.
Comput Biol Med. 2019 Dec;115:103524. doi: 10.1016/j.compbiomed.2019.103524. Epub 2019 Nov 27.
3
Exploiting syntactic and semantics information for chemical-disease relation extraction.
Database (Oxford). 2016 Apr 14;2016. doi: 10.1093/database/baw048. Print 2016.
4
Causality modeling for directed disease network.
Bioinformatics. 2016 Sep 1;32(17):i437-i444. doi: 10.1093/bioinformatics/btw439.
5
An effective neural model extracting document level chemical-induced disease relations from biomedical literature.
J Biomed Inform. 2018 Jul;83:1-9. doi: 10.1016/j.jbi.2018.05.001. Epub 2018 May 8.
6
An Unsupervised Graph Based Continuous Word Representation Method for Biomedical Text Mining.
IEEE/ACM Trans Comput Biol Bioinform. 2016 Jul-Aug;13(4):634-42. doi: 10.1109/TCBB.2015.2478467. Epub 2015 Sep 14.
7
Uyghur Text Matching in Graphic Images for Biomedical Semantic Analysis.
Neuroinformatics. 2018 Oct;16(3-4):445-455. doi: 10.1007/s12021-017-9350-0.
8
Detecting causality from online psychiatric texts using inter-sentential language patterns.
BMC Med Inform Decis Mak. 2012 Jul 18;12:72. doi: 10.1186/1472-6947-12-72.
9
Knowledge based word-concept model estimation and refinement for biomedical text mining.
J Biomed Inform. 2015 Feb;53:300-7. doi: 10.1016/j.jbi.2014.11.015. Epub 2014 Dec 12.
10
Analysis of biological processes and diseases using text mining approaches.
Methods Mol Biol. 2010;593:341-82. doi: 10.1007/978-1-60327-194-3_16.

Cited by

1
A study on large-scale disease causality discovery from biomedical literature.
BMC Med Inform Decis Mak. 2025 Mar 18;25(1):136. doi: 10.1186/s12911-025-02893-0.
2
Causal relationships between diseases mined from the literature improve the use of polygenic risk scores.
Bioinformatics. 2024 Nov 1;40(11). doi: 10.1093/bioinformatics/btae639.
3
Exploring novel disease-disease associations based on multi-view fusion network.
Comput Struct Biotechnol J. 2023 Feb 24;21:1807-1819. doi: 10.1016/j.csbj.2023.02.038. eCollection 2023.
4
A Word-Granular Adversarial Attacks Framework for Causal Event Extraction.
Entropy (Basel). 2022 Jan 24;24(2):169. doi: 10.3390/e24020169.
5
Inference on chains of disease progression based on disease networks.
PLoS One. 2019 Jun 28;14(6):e0218871. doi: 10.1371/journal.pone.0218871. eCollection 2019.
6
Evolution of Translational Bioinformatics: lessons learned from TBC 2016.
BMC Med Genomics. 2017 May 24;10(Suppl 1):32. doi: 10.1186/s12920-017-0262-5.

References

1
Causality modeling for directed disease network.
Bioinformatics. 2016 Sep 1;32(17):i437-i444. doi: 10.1093/bioinformatics/btw439.
2
A coupling approach of a predictor and a descriptor for breast cancer prognosis.
BMC Med Genomics. 2014;7 Suppl 1(Suppl 1):S4. doi: 10.1186/1755-8794-7-S1-S4. Epub 2014 May 8.
3
Knowledge boosting: a graph-based integration approach with multi-omics data and genomic knowledge for cancer clinical outcome prediction.
J Am Med Inform Assoc. 2015 Jan;22(1):109-20. doi: 10.1136/amiajnl-2013-002481. Epub 2014 Jul 7.
4
Human symptoms-disease network.
Nat Commun. 2014 Jun 26;5:4212. doi: 10.1038/ncomms5212.
5
dRiskKB: a large-scale disease-disease risk relationship knowledge base constructed from biomedical text.
BMC Bioinformatics. 2014 Apr 12;15:105. doi: 10.1186/1471-2105-15-105.
6
Text mining effectively scores and ranks the literature for improving chemical-gene-disease curation at the comparative toxicogenomics database.
PLoS One. 2013 Apr 17;8(4):e58201. doi: 10.1371/journal.pone.0058201. Print 2013.
7
The expanded human disease network combining protein-protein interaction information.
Eur J Hum Genet. 2011 Jul;19(7):783-8. doi: 10.1038/ejhg.2011.30. Epub 2011 Mar 9.
8
Event extraction for systems biology by text mining the literature.
Trends Biotechnol. 2010 Jul;28(7):381-90. doi: 10.1016/j.tibtech.2010.04.005. Epub 2010 Jun 1.
9
Biomedical text mining and its applications.
PLoS Comput Biol. 2009 Dec;5(12):e1000597. doi: 10.1371/journal.pcbi.1000597. Epub 2009 Dec 24.
10
Human disease-drug network based on genomic expression profiles.
PLoS One. 2009 Aug 6;4(8):e6536. doi: 10.1371/journal.pone.0006536.