A network-based machine-learning framework to identify both functional modules and disease genes.

Affiliations

School of Computer and Information Technology, Institute of Medical Intelligence, Beijing Jiaotong University, Beijing, 100044, China.

Institute for TCM-X, MOE Key Laboratory of Bioinformatics / Bioinformatics Division, BNRIST, Department of Automation, Tsinghua University, Beijing, 100084, China.

Publication information

Hum Genet. 2021 Jun;140(6):897-913. doi: 10.1007/s00439-020-02253-0. Epub 2021 Jan 7.

DOI:10.1007/s00439-020-02253-0
PMID:33409574
Abstract

Disease gene identification is a critical step towards uncovering the molecular mechanisms of diseases and systematically investigating complex disease phenotypes. Despite considerable efforts to develop powerful computing methods, candidate gene identification remains a severe challenge owing to the connectivity of an incomplete interactome network, which hampers the discovery of true novel candidate genes. We developed a network-based machine-learning framework to identify both functional modules and disease candidate genes. In this framework, we designed a semi-supervised non-negative matrix factorization model to obtain the functional modules related to the diseases and genes. Of note, we proposed a disease gene-prioritizing method called MapGene that integrates the correlations from both functional modules and network closeness. Our framework identified a set of functional modules with highly functional homogeneity and close gene interactions. Experiments on a large-scale benchmark dataset showed that MapGene performs significantly better than the state-of-the-art algorithms. Further analysis demonstrates MapGene can effectively relieve the impact of the incompleteness of interactome networks and obtain highly reliable rankings of candidate genes. In addition, disease cases on Parkinson's disease and diabetes mellitus confirmed the generalization of MapGene for novel candidate gene identification. This work proposed, for the first time, an integrated computing framework to predict both functional modules and disease candidate genes. The methodology and results support that our framework has the potential to help discover underlying functional modules and reliable candidate genes in human disease.
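
The abstract names two ingredients: a non-negative matrix factorization that groups genes and diseases into modules, and a ranking that combines module-level correlation with network closeness. The following is a minimal sketch of that general idea only, not the authors' semi-supervised model or the MapGene scoring. It uses plain unsupervised NMF with multiplicative updates on a made-up gene-by-disease matrix, a made-up interaction network, and a hypothetical equal-weight combination (`alpha = 0.5`); the function names `nmf` and `closeness_to_seeds` are illustrative.

```python
import numpy as np


def nmf(V, k, n_iter=500, seed=0, eps=1e-9):
    """Plain NMF via multiplicative updates, so that V is approximately W @ H."""
    rng = np.random.default_rng(seed)
    n, m = V.shape
    W = rng.random((n, k)) + eps
    H = rng.random((k, m)) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H


def closeness_to_seeds(A, seeds):
    """Hypothetical closeness proxy: fraction of a gene's neighbors that are seeds."""
    seed_mask = np.zeros(A.shape[0])
    seed_mask[list(seeds)] = 1.0
    degree = A.sum(axis=1)
    return (A @ seed_mask) / np.maximum(degree, 1.0)


# Toy gene-by-disease associations (made-up data): 6 genes, 4 diseases.
V = np.array([
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 1],
], dtype=float)

# Toy undirected gene-gene interaction network (made-up edges).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (2, 3)]
A = np.zeros((6, 6))
for i, j in edges:
    A[i, j] = A[j, i] = 1.0

# Step 1: factorize and assign each gene to its strongest module.
W, H = nmf(V, k=2)
gene_modules = W.argmax(axis=1)

# Step 2: rank unlabeled genes for disease 1 (column index 1).
disease = 1
seeds = np.flatnonzero(V[:, disease] > 0)        # known genes: 0 and 1
candidates = np.flatnonzero(V[:, disease] == 0)  # genes 2, 3, 4, 5

module_score = (W @ H)[:, disease]               # reconstructed association
network_score = closeness_to_seeds(A, seeds)     # neighborhood evidence
alpha = 0.5                                      # assumed weighting
combined = alpha * module_score + (1 - alpha) * network_score

ranked = candidates[np.argsort(-combined[candidates])]
print("gene modules:", gene_modules)
print("candidate ranking for disease 1:", ranked)
```

In this toy setting, gene 2 shares a module with the known genes and is also adjacent to them in the network, so both signals point the same way. The combination matters most when the network is incomplete and one of the two signals is missing for a candidate.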
