Functional knowledge transfer for high-accuracy prediction of under-studied biological processes.

Affiliation

Department of Computer Science, Princeton University, Princeton, New Jersey, USA.

Publication information

PLoS Comput Biol. 2013;9(3):e1002957. doi: 10.1371/journal.pcbi.1002957. Epub 2013 Mar 14.

DOI:10.1371/journal.pcbi.1002957
PMID:23516347
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3597527/
Abstract

A key challenge in genetics is identifying the functional roles of genes in pathways. Numerous functional genomics techniques (e.g. machine learning) that predict protein function have been developed to address this question. These methods generally build from existing annotations of genes to pathways and thus are often unable to identify additional genes participating in processes that are not already well studied. Many of these processes are well studied in some organism, but not necessarily in an investigator's organism of interest. Sequence-based search methods (e.g. BLAST) have been used to transfer such annotation information between organisms. We demonstrate that functional genomics can complement traditional sequence similarity to improve the transfer of gene annotations between organisms. Our method transfers annotations only when functionally appropriate as determined by genomic data and can be used with any prediction algorithm to combine transferred gene function knowledge with organism-specific high-throughput data to enable accurate function prediction. We show that diverse state-of-the-art machine learning algorithms leveraging functional knowledge transfer (FKT) dramatically improve their accuracy in predicting gene-pathway membership, particularly for processes with little experimental knowledge in an organism. We also show that our method compares favorably to annotation transfer by sequence similarity. Next, we deploy FKT with a state-of-the-art SVM classifier to predict novel genes for 11,000 biological processes across six diverse organisms and expand the coverage of accurate function predictions to processes that are often ignored because of a dearth of annotated genes in an organism. Finally, we perform an in vivo experimental investigation in Danio rerio and confirm the regulatory role of our top predicted novel gene, wnt5b, in leftward cell migration during heart development. FKT is immediately applicable to many bioinformatics techniques and will help biologists systematically integrate prior knowledge from diverse systems to direct targeted experiments in their organism of study.
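
The core idea in the abstract, enlarging the set of known genes for a sparsely annotated process by borrowing annotations from another organism before any classifier is trained, can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function name `transfer_annotations`, the gene and process identifiers, and the precomputed `functional_map` (a stand-in for whatever decides that a transfer is "functionally appropriate") are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


def transfer_annotations(target_annots, source_annots, functional_map):
    """Return target-organism annotations augmented with transferred ones.

    target_annots, source_annots: dict mapping a process name to a set of
        gene identifiers annotated to that process in each organism.
    functional_map: dict mapping a source-organism gene to the set of
        target-organism genes judged functionally equivalent. This mapping
        is an assumed input; how it is derived is outside this sketch.
    """
    augmented = defaultdict(set)

    # Keep every annotation the target organism already has.
    for process, genes in target_annots.items():
        augmented[process].update(genes)

    # Add target genes whose mapped source counterpart is annotated to the
    # same process. Source genes without a mapping contribute nothing.
    for process, genes in source_annots.items():
        for gene in genes:
            augmented[process].update(functional_map.get(gene, ()))

    return dict(augmented)


# Toy data: one process with a single known gene in the target organism.
target = {"process_X": {"tgt_gene_a"}}
source = {"process_X": {"src_gene_a", "src_gene_b", "src_gene_c"}}
mapping = {"src_gene_b": {"tgt_gene_b"}, "src_gene_c": {"tgt_gene_c"}}

training_positives = transfer_annotations(target, source, mapping)
print(sorted(training_positives["process_X"]))
# → ['tgt_gene_a', 'tgt_gene_b', 'tgt_gene_c']
```

The augmented positive set could then serve as training labels for any classifier (the abstract mentions an SVM) alongside organism-specific data. The sketch deliberately leaves the inputs unmodified so the original annotations remain available for evaluation.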


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57e8/3597527/3f8ce67ff3be/pcbi.1002957.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57e8/3597527/feb8f0b36e6e/pcbi.1002957.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57e8/3597527/39df91327b1b/pcbi.1002957.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57e8/3597527/0e64289bb706/pcbi.1002957.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57e8/3597527/d0e7a13d49cf/pcbi.1002957.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/57e8/3597527/776718a0bccd/pcbi.1002957.g006.jpg
