Causal relationships between diseases mined from the literature improve the use of polygenic risk scores.

Affiliations

Computer, Electrical and Mathematical Sciences & Engineering, King Abdullah University of Science and Technology, Thuwal 23955, Saudi Arabia.

Department of Physiology, Development & Neuroscience, University of Cambridge, Cambridge CB2 3EG, United Kingdom.

Publication information

Bioinformatics. 2024 Nov 1;40(11). doi: 10.1093/bioinformatics/btae639.

DOI: 10.1093/bioinformatics/btae639
PMID: 39460944
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11639291/
Abstract

MOTIVATION

Identifying causal relations between diseases allows for the study of shared pathways, biological mechanisms, and inter-disease risks. Such causal relations can facilitate the identification of potential disease precursors and candidates for drug re-purposing. However, computational methods often lack access to these causal relations. Few approaches have been developed to automatically extract causal relationships between diseases from unstructured text, but they are often only focused on a small number of diseases, lack validation of the extracted causal relations, or do not make their data available.

RESULTS

We automatically mined statements asserting a causal relation between diseases from the scientific literature by leveraging lexical patterns. Following automated mining of causal relations, we mapped the diseases to the International Classification of Diseases (ICD) identifiers to allow the direct application to clinical data. We provide quantitative and qualitative measures to evaluate the mined causal relations and compare to UK Biobank diagnosis data as a completely independent data source. The validated causal associations were used to create a directed acyclic graph that can be used by causal inference frameworks. We demonstrate the utility of our causal network by performing causal inference using the do-calculus, using relations within the graph to construct and improve polygenic risk scores, and disentangle the pleiotropic effects of variants.
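The pipeline described above (pattern-based mining, mapping to ICD identifiers, assembling a directed acyclic graph) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the two lexical patterns, the four-entry ICD-10 lookup, and the function names `mine_edges` and `topological_order` are hypothetical and are not taken from the paper, whose actual patterns and mappings are not given in the abstract.

```python
import re
from graphlib import CycleError, TopologicalSorter

# Hypothetical lexical patterns (not the paper's); named groups fix the direction.
PATTERNS = [
    re.compile(r"^(?P<cause>.+?) (?:causes|leads to|results in) (?P<effect>.+)$"),
    re.compile(r"^(?P<effect>.+?) (?:is caused by|is due to) (?P<cause>.+)$"),
]

# Toy disease-name to ICD-10 lookup; a real mapping would be far larger.
ICD10 = {
    "obesity": "E66",
    "type 2 diabetes": "E11",
    "hypertension": "I10",
    "chronic kidney disease": "N18",
}


def mine_edges(sentences):
    """Return (cause_code, effect_code) pairs for sentences matching a pattern."""
    edges = []
    for sentence in sentences:
        text = sentence.strip().rstrip(".").lower()
        for pattern in PATTERNS:
            match = pattern.match(text)
            if match is None:
                continue
            cause = ICD10.get(match["cause"])
            effect = ICD10.get(match["effect"])
            # Keep only statements where both diseases map to an ICD code.
            if cause and effect:
                edges.append((cause, effect))
            break
    return edges


def topological_order(edges):
    """Order nodes so causes precede effects; raises CycleError if edges are cyclic."""
    predecessors = {}
    for cause, effect in edges:
        predecessors.setdefault(effect, set()).add(cause)
        predecessors.setdefault(cause, set())
    return list(TopologicalSorter(predecessors).static_order())


# Made-up example sentences, not drawn from the mined corpus.
SENTENCES = [
    "Obesity leads to type 2 diabetes.",
    "Chronic kidney disease is caused by hypertension.",
    "Type 2 diabetes results in chronic kidney disease.",
    "Obesity causes hypertension.",
    "Smoking causes lung cancer.",  # dropped: not in the toy ICD lookup
    "The weather was mild.",        # dropped: no causal pattern
]

edges = mine_edges(SENTENCES)
order = topological_order(edges)
print(edges)
print(order)
```

A successful topological sort confirms that the mined edges form a directed acyclic graph, which is the property causal inference frameworks require; a `CycleError` would indicate contradictory statements that need to be resolved before the graph is used.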

AVAILABILITY AND IMPLEMENTATION

The data are available through https://github.com/bio-ontology-research-group/causal-relations-between-diseases.

Figure: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/211e/11639291/d5f74ea9d540/btae639f1.jpg

