A knowledge graph approach to predict and interpret disease-causing gene interactions.

Author affiliations

Interuniversity Institute of Bioinformatics in Brussels, Université Libre de Bruxelles - Vrije Universiteit Brussel, Brussels, Belgium.

Machine Learning Group, Université Libre de Bruxelles, Brussels, Belgium.

Publication information

BMC Bioinformatics. 2023 Aug 29;24(1):324. doi: 10.1186/s12859-023-05451-5.

DOI:10.1186/s12859-023-05451-5
PMID:37644440
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10463539/
Abstract

BACKGROUND

Understanding the impact of gene interactions on disease phenotypes is increasingly recognised as a crucial aspect of genetic disease research. This trend is reflected by the growing amount of clinical research on oligogenic diseases, where disease manifestations are influenced by combinations of variants on a few specific genes. Although statistical machine-learning methods have been developed to identify relevant genetic variant or gene combinations associated with oligogenic diseases, they rely on abstract features and black-box models, posing challenges to interpretability for medical experts and impeding their ability to comprehend and validate predictions. In this work, we present a novel, interpretable predictive approach based on a knowledge graph that not only provides accurate predictions of disease-causing gene interactions but also offers explanations for these results.

RESULTS

We introduce BOCK, a knowledge graph constructed to explore disease-causing genetic interactions, integrating curated information on oligogenic diseases from clinical cases with relevant biomedical networks and ontologies. Using this graph, we developed a novel predictive framework based on heterogenous paths connecting gene pairs. This method trains an interpretable decision set model that not only accurately predicts pathogenic gene interactions, but also unveils the patterns associated with these diseases. A unique aspect of our approach is its ability to offer, along with each positive prediction, explanations in the form of subgraphs, revealing the specific entities and relationships that led to each pathogenic prediction.
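The path-based idea described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy graph, the node and relation names, and the hand-written rule are all hypothetical, and the rule is fixed by hand here rather than learned as a decision set. The sketch enumerates short paths between two genes in a typed graph, abstracts each path into a type-level pattern (a metapath), applies a rule over metapath counts, and returns the matching concrete paths as the explanation subgraph.

```python
from collections import Counter, defaultdict

# Toy heterogeneous graph; every name below is a hypothetical placeholder.
NODE_TYPES = {
    "GENE_A": "Gene", "GENE_B": "Gene", "GENE_C": "Gene",
    "pathway_1": "Pathway", "phenotype_1": "Phenotype",
}
EDGES = [
    ("GENE_A", "participates_in", "pathway_1"),
    ("GENE_B", "participates_in", "pathway_1"),
    ("GENE_A", "associated_with", "phenotype_1"),
    ("GENE_B", "associated_with", "phenotype_1"),
    ("GENE_C", "associated_with", "phenotype_1"),
]

def build_adjacency(edges):
    """Index typed edges in both directions (the graph is treated as undirected)."""
    adj = defaultdict(list)
    for src, rel, dst in edges:
        adj[src].append((rel, dst))
        adj[dst].append((rel, src))
    return adj

def simple_paths(adj, source, target, max_len):
    """Yield simple paths [node, rel, node, ...] with at most max_len edges."""
    stack = [(source, [source], {source})]
    while stack:
        node, path, seen = stack.pop()
        for rel, nxt in adj[node]:
            if nxt in seen:
                continue
            new_path = path + [rel, nxt]
            if nxt == target:
                yield new_path
            elif len(new_path) // 2 < max_len:  # number of edges so far
                stack.append((nxt, new_path, seen | {nxt}))

def metapath(path, node_types):
    """Replace each node by its type, keeping relation names as they are."""
    return tuple(node_types[x] if i % 2 == 0 else x for i, x in enumerate(path))

def path_features(adj, node_types, gene_a, gene_b, max_len=2):
    """Count metapaths between two genes and keep the concrete paths behind them."""
    counts, examples = Counter(), defaultdict(list)
    for p in simple_paths(adj, gene_a, gene_b, max_len):
        mp = metapath(p, node_types)
        counts[mp] += 1
        examples[mp].append(p)
    return counts, examples

# A hand-written stand-in for a learned decision set: each rule is a
# conjunction of (metapath, minimum count) conditions.
RULES = [[
    (("Gene", "participates_in", "Pathway", "participates_in", "Gene"), 1),
    (("Gene", "associated_with", "Phenotype", "associated_with", "Gene"), 1),
]]

def predict_with_explanation(counts, examples, rules):
    """Predict positive if any rule fires; explain with the paths that satisfied it."""
    for rule in rules:
        if all(counts[mp] >= k for mp, k in rule):
            return True, [p for mp, _ in rule for p in examples[mp]]
    return False, []

adj = build_adjacency(EDGES)
counts, examples = path_features(adj, NODE_TYPES, "GENE_A", "GENE_B")
positive, explanation = predict_with_explanation(counts, examples, RULES)
```

In this toy setting the pair GENE_A/GENE_B shares both a pathway and a phenotype, so the rule fires and the two connecting paths form the explanation, whereas GENE_A/GENE_C shares only a phenotype and is predicted negative. Keeping the concrete paths next to the metapath counts is what lets a positive prediction be traced back to specific entities and relationships.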

CONCLUSION

Our method, built with interpretability in mind, leverages heterogenous path information in knowledge graphs to predict pathogenic gene interactions and generate meaningful explanations. This not only broadens our understanding of the molecular mechanisms underlying oligogenic diseases, but also presents a novel application of knowledge graphs in creating more transparent and insightful predictors for genetic research.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e3c/10463539/0b867cbbf7af/12859_2023_5451_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e3c/10463539/17bbc142677e/12859_2023_5451_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e3c/10463539/c44a1f09608c/12859_2023_5451_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e3c/10463539/64d9ccf82f4c/12859_2023_5451_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e3c/10463539/389b990d991b/12859_2023_5451_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6e3c/10463539/9dc509ddeaa3/12859_2023_5451_Fig6_HTML.jpg

Similar articles

1
A knowledge graph approach to predict and interpret disease-causing gene interactions.
BMC Bioinformatics. 2023 Aug 29;24(1):324. doi: 10.1186/s12859-023-05451-5.
2
Explaining protein-protein interactions with knowledge graph-based semantic similarity.
Comput Biol Med. 2024 Mar;170:108076. doi: 10.1016/j.compbiomed.2024.108076. Epub 2024 Feb 1.
3
Exploiting semantic patterns over biomedical knowledge graphs for predicting treatment and causative relations.
J Biomed Inform. 2018 Jun;82:189-199. doi: 10.1016/j.jbi.2018.05.003. Epub 2018 May 12.
4
KGML-xDTD: a knowledge graph-based machine learning framework for drug treatment prediction and mechanism description.
Gigascience. 2022 Dec 28;12. doi: 10.1093/gigascience/giad057. Epub 2023 Aug 21.
5
Multi-domain knowledge graph embeddings for gene-disease association prediction.
J Biomed Semantics. 2023 Aug 14;14(1):11. doi: 10.1186/s13326-023-00291-x.
6
Knowledge Graph Embeddings for ICU readmission prediction.
BMC Med Inform Decis Mak. 2023 Jan 19;23(1):12. doi: 10.1186/s12911-022-02070-7.
7
A Mobile App That Addresses Interpretability Challenges in Machine Learning-Based Diabetes Predictions: Survey-Based User Study.
JMIR Form Res. 2023 Nov 13;7:e50328. doi: 10.2196/50328.
8
Explaining decisions of graph convolutional neural networks: patient-specific molecular subnetworks responsible for metastasis prediction in breast cancer.
Genome Med. 2021 Mar 11;13(1):42. doi: 10.1186/s13073-021-00845-7.
9
KDGene: knowledge graph completion for disease gene prediction using interactional tensor decomposition.
Brief Bioinform. 2024 Mar 27;25(3). doi: 10.1093/bib/bbae161.
10
Combining biomedical knowledge graphs and text to improve predictions for drug-target interactions and drug-indications.
PeerJ. 2022 Apr 4;10:e13061. doi: 10.7717/peerj.13061. eCollection 2022.

Articles citing this article

1
Leveraging Social Determinants of Health in Alzheimer's Research Using LLM-Augmented Literature Mining and Knowledge Graphs.
AMIA Jt Summits Transl Sci Proc. 2025 Jun 10;2025:491-500. eCollection 2025.
2
Knowledge graph and its application in the study of neurological and mental disorders.
Front Psychiatry. 2025 Mar 18;16:1452557. doi: 10.3389/fpsyt.2025.1452557. eCollection 2025.
3
KG2ML: Integrating Knowledge Graphs and Positive Unlabeled Learning for Identifying Disease-Associated Genes.
medRxiv. 2025 Mar 17:2025.03.17.25323906. doi: 10.1101/2025.03.17.25323906.
4
Unified Clinical Vocabulary Embeddings for Advancing Precision Medicine.
medRxiv. 2024 Dec 10:2024.12.03.24318322. doi: 10.1101/2024.12.03.24318322.
5
DOME Registry: implementing community-wide recommendations for reporting supervised machine learning in biology.
Gigascience. 2024 Jan 2;13. doi: 10.1093/gigascience/giae094.
6
Prioritization of oligogenic variant combinations in whole exomes.
Bioinformatics. 2024 Mar 29;40(4). doi: 10.1093/bioinformatics/btae184.
7
Strategies for dissecting the complexity of neurodevelopmental disorders.
Trends Genet. 2024 Feb;40(2):187-202. doi: 10.1016/j.tig.2023.10.009. Epub 2023 Nov 8.
8
AgeAnnoMO: a knowledgebase of multi-omics annotation for animal aging.
Nucleic Acids Res. 2024 Jan 5;52(D1):D822-D834. doi: 10.1093/nar/gkad884.

References cited in this article

1
Integrating and formatting biomedical data as pre-calculated knowledge graph embeddings in the Bioteque.
Nat Commun. 2022 Sep 9;13(1):5304. doi: 10.1038/s41467-022-33026-0.
2
Paralog knockout profiling identifies DUSP4 and DUSP6 as a digenic dependence in MAPK pathway-driven cancers.
Nat Genet. 2021 Dec;53(12):1664-1672. doi: 10.1038/s41588-021-00967-z. Epub 2021 Dec 2.
3
Identifying digenic disease genes via machine learning in the Undiagnosed Diseases Network.
Am J Hum Genet. 2021 Oct 7;108(10):1946-1963. doi: 10.1016/j.ajhg.2021.08.010. Epub 2021 Sep 15.
4
Causal interactions from proteomic profiles: Molecular data meet pathway knowledge.
Patterns (N Y). 2021 May 12;2(6):100257. doi: 10.1016/j.patter.2021.100257. eCollection 2021 Jun 11.
5
The Human Phenotype Ontology in 2021.
Nucleic Acids Res. 2021 Jan 8;49(D1):D1207-D1217. doi: 10.1093/nar/gkaa1043.
6
The InterPro protein families and domains database: 20 years on.
Nucleic Acids Res. 2021 Jan 8;49(D1):D344-D354. doi: 10.1093/nar/gkaa977.
7
Mutations in cause an autosomal-recessive form of hypertrophic cardiomyopathy.
Heart. 2020 Sep;106(17):1342-1348. doi: 10.1136/heartjnl-2020-316913. Epub 2020 May 25.
8
MyoMiner: explore gene co-expression in normal and pathological muscle.
BMC Med Genomics. 2020 May 11;13(1):67. doi: 10.1186/s12920-020-0712-3.
9
Heterogeneous networks integration for disease-gene prioritization with node kernels.
Bioinformatics. 2020 May 1;36(9):2649-2656. doi: 10.1093/bioinformatics/btaa008.
10
To Embed or Not: Network Embedding as a Paradigm in Computational Biology.
Front Genet. 2019 May 1;10:381. doi: 10.3389/fgene.2019.00381. eCollection 2019.