Predicting candidate genes from phenotypes, functions and anatomical site of expression.

Affiliations

Computational Bioscience Research Center (CBRC), Computer, Electrical & Mathematical Sciences and Engineering (CEMSE) Division, King Abdullah University of Science and Technology (KAUST), Thuwal 23955, Saudi Arabia.

Computer Science Department, College of Computers and Information Technology, Taif University, Taif 26571, Saudi Arabia.

Publication information

Bioinformatics. 2021 May 5;37(6):853-860. doi: 10.1093/bioinformatics/btaa879.

DOI:10.1093/bioinformatics/btaa879
PMID:33051643
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8248315/
Abstract

MOTIVATION

Over the past years, many computational methods have been developed to incorporate information about phenotypes into the disease-gene prioritization task. These methods generally compute the similarity between a patient's phenotypes and a database of gene-phenotype associations to find the most phenotypically similar match. The main limitation of these methods is their reliance on knowledge about phenotypes associated with particular genes, which is incomplete in humans as well as in many model organisms, such as the mouse and fish. Information about functions of gene products and anatomical site of gene expression is available for more genes and can also be related to phenotypes through ontologies and machine-learning models.
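The similarity-based ranking described above can be illustrated with a minimal sketch. It assumes a plain Jaccard index over phenotype sets as a stand-in for the ontology-aware similarity measures such methods typically use; the gene names, phenotype labels, and function names are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Jaccard index of two phenotype sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def rank_genes(
    patient: Set[str], gene_phenotypes: Dict[str, Set[str]]
) -> List[Tuple[str, float]]:
    """Rank genes by similarity of their phenotypes to the patient's, best first."""
    scored = [(gene, jaccard(patient, phenos)) for gene, phenos in gene_phenotypes.items()]
    # Sort by descending score; break ties alphabetically for reproducibility.
    return sorted(scored, key=lambda item: (-item[1], item[0]))


# Hypothetical toy data: placeholder gene names and phenotype labels.
GENE_PHENOTYPES = {
    "GENE_A": {"seizure", "microcephaly", "hypotonia"},
    "GENE_B": {"hearing_loss", "short_stature"},
    "GENE_C": {"seizure", "hearing_loss"},
}
PATIENT = {"seizure", "hypotonia"}

RANKING = rank_genes(PATIENT, GENE_PHENOTYPES)
for gene, score in RANKING:
    print(f"{gene}\t{score:.3f}")
```

A gene with no entry in `GENE_PHENOTYPES` never appears in the ranking, which is the coverage limitation the paragraph describes.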

RESULTS

We developed a novel graph-based machine-learning method for biomedical ontologies, which is able to exploit axioms in ontologies and other graph-structured data. Using our machine-learning method, we embed genes based on their associated phenotypes, functions of the gene products and anatomical location of gene expression. We then develop a machine-learning model to predict gene-disease associations based on the associations between genes and multiple biomedical ontologies, and this model significantly improves over state-of-the-art methods. Furthermore, we significantly extend phenotype-based gene prioritization methods to all genes that are associated with phenotypes, functions or site of expression.
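One way to picture the graph-based step is the sketch below. The abstract does not say how the graph is turned into embeddings, so this sketch assumes random walks over a graph that mixes gene annotations with ontology axioms; the walks would then serve as "sentences" for a sequence-embedding model, which is omitted here. All gene names, relation labels, and class names are hypothetical placeholders.

```python
import random
from collections import defaultdict
from typing import Dict, List, Tuple

Edge = Tuple[str, str, str]  # (source, relation, target)

# Hypothetical toy data: gene annotations plus two ontology axioms.
EDGES: List[Edge] = [
    ("GENE_A", "has_phenotype", "PHENO:seizure"),
    ("GENE_A", "has_function", "FUNC:ion_transport"),
    ("GENE_B", "expressed_in", "ANAT:cerebral_cortex"),
    ("PHENO:seizure", "subclass_of", "PHENO:nervous_system_abnormality"),
    ("ANAT:cerebral_cortex", "part_of", "ANAT:brain"),
]


def build_graph(edges: List[Edge]) -> Dict[str, List[Tuple[str, str]]]:
    """Map each node to its outgoing (relation, target) pairs."""
    graph = defaultdict(list)
    for source, relation, target in edges:
        graph[source].append((relation, target))
    return dict(graph)


def random_walk(
    graph: Dict[str, List[Tuple[str, str]]],
    start: str,
    max_steps: int,
    rng: random.Random,
) -> List[str]:
    """Walk up to max_steps edges from start, recording relations and nodes."""
    walk = [start]
    node = start
    for _ in range(max_steps):
        neighbours = graph.get(node)
        if not neighbours:
            break  # dead end: the node has no outgoing edges
        relation, node = rng.choice(neighbours)
        walk += [relation, node]
    return walk


GRAPH = build_graph(EDGES)
RNG = random.Random(0)  # fixed seed so the walks are reproducible
WALKS = {
    gene: [random_walk(GRAPH, gene, 4, RNG) for _ in range(3)]
    for gene in ("GENE_A", "GENE_B")
}
```

In this toy graph `GENE_B` has no phenotype annotation, yet its walks still reach ontology classes through its site of expression, which mirrors the idea of covering genes that lack phenotype data.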

AVAILABILITY AND IMPLEMENTATION

Software and data are available at https://github.com/bio-ontology-research-group/DL2Vec.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
