Candidate gene prioritization by network analysis of differential expression using machine learning approaches.

Affiliation

Department of Electrical Engineering (ESAT-SCD) Katholieke Universiteit Leuven, 3001 Leuven, Belgium.

Publication information

BMC Bioinformatics. 2010 Sep 14;11:460. doi: 10.1186/1471-2105-11-460.

DOI: 10.1186/1471-2105-11-460
PMID: 20840752
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2945940/
Abstract

BACKGROUND

Discovering novel disease genes remains challenging for diseases for which no prior knowledge, such as known disease genes or disease-related pathways, is available. Genetic studies frequently yield large lists of candidate genes, of which only a few can be followed up for further investigation. We recently developed a computational method for constitutional genetic disorders that identifies the most promising candidate genes by replacing prior knowledge with experimental data on differential gene expression between affected and healthy individuals. To improve the performance of this prioritization strategy, we have extended our previous work by applying different machine learning approaches that identify promising candidate genes by determining whether a gene is surrounded by highly differentially expressed genes in a functional association or protein-protein interaction network.
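The core idea, scoring a candidate by how strongly its network neighborhood is differentially expressed, can be illustrated with a minimal Python sketch. The toy network, the gene names, the expression values, and the helper names `neighborhood_score` and `prioritize` are all hypothetical. Averaging over direct neighbors is only the simplest variant of the idea and is not claimed to be the authors' implementation.

```python
from statistics import mean

# Hypothetical toy network: gene -> set of directly interacting genes (undirected).
network = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

# Hypothetical absolute differential-expression scores (affected vs. healthy).
diff_expr = {"A": 2.1, "B": 1.8, "C": 0.2, "D": 0.1, "E": 0.05}


def neighborhood_score(gene):
    """Mean differential expression of the gene's direct neighbors."""
    neighbors = network.get(gene, set())
    if not neighbors:
        return 0.0
    return mean(diff_expr.get(n, 0.0) for n in neighbors)


def prioritize(candidates):
    """Sort candidates from most to least promising by neighborhood score."""
    return sorted(candidates, key=neighborhood_score, reverse=True)


ranking = prioritize(["C", "D", "E"])
print(ranking)
```

In this toy example gene C is only weakly differentially expressed itself, yet it ranks first among the candidates because its neighbors A and B are strongly differentially expressed.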

RESULTS

We propose three strategies for scoring disease candidate genes that rely on network-based machine learning approaches: kernel ridge regression, heat kernel, and Arnoldi kernel approximation. For comparison, a local measure based on the expression of the direct neighbors is also computed. We benchmarked these strategies on 40 publicly available knockout experiments in mice, and assessed performance against a standard procedure in genetics that ranks candidate genes solely on their differential expression levels (Simple Expression Ranking). All four strategies (the three network-based approaches plus the local measure) could outperform this standard procedure. The best results were obtained with Heat Kernel Diffusion Ranking, which placed the knockout gene at an average position of 8 out of 100 genes with an AUC of 92.3%, an error reduction of 52.8% relative to the standard procedure, which ranked the knockout gene on average at position 17 with an AUC of 83.7%. This error reduction is consistent with 1 − AUC falling from 16.3% to 7.7%, since (16.3 − 7.7) / 16.3 ≈ 52.8%.
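As a rough illustration of heat kernel diffusion on a network, the sketch below spreads differential-expression scores over a toy graph and ranks candidates by the diffused score. The abstract does not give the exact formulation, so everything here is an assumption: the unnormalized graph Laplacian L = D − A, the diffusion strength `beta`, the truncated Taylor series used to approximate exp(−beta·L) applied to the score vector, and all gene names and values.

```python
# Hypothetical toy network: gene -> set of directly interacting genes (undirected).
network = {
    "A": {"B", "C"},
    "B": {"A", "C"},
    "C": {"A", "B", "D"},
    "D": {"C", "E"},
    "E": {"D"},
}

# Hypothetical absolute differential-expression scores.
diff_expr = {"A": 2.1, "B": 1.8, "C": 0.2, "D": 0.1, "E": 0.05}


def laplacian_times(net, vec):
    """Return L @ vec for the graph Laplacian L = D - A of an undirected network."""
    return {
        gene: len(nbrs) * vec[gene] - sum(vec[n] for n in nbrs)
        for gene, nbrs in net.items()
    }


def heat_diffusion(net, signal, beta=0.5, terms=40):
    """Approximate exp(-beta * L) @ signal with a truncated Taylor series.

    The k-th series term is obtained from the previous one as
    term_k = (-beta / k) * L @ term_{k-1}.
    """
    result = dict(signal)
    term = dict(signal)
    for k in range(1, terms + 1):
        lap = laplacian_times(net, term)
        term = {gene: -beta / k * lap[gene] for gene in net}
        for gene in net:
            result[gene] += term[gene]
    return result


scores = heat_diffusion(network, diff_expr)
ranking = sorted(["C", "D", "E"], key=scores.get, reverse=True)
print(ranking)
```

Unlike a direct-neighbor average, diffusion lets signal from A and B reach D and E through C, with the contribution fading over longer paths. With this Laplacian the total amount of signal is conserved, which gives a simple sanity check on the implementation.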

CONCLUSION

In this study, we could identify promising candidate genes using network-based machine learning approaches even when no prior knowledge about the disease or phenotype is available.
