Robust predictions of specialized metabolism genes through machine learning.

Authors

Moore Bethany M, Wang Peipei, Fan Pengxiang, Leong Bryan, Schenck Craig A, Lloyd John P, Lehti-Shiu Melissa D, Last Robert L, Pichersky Eran, Shiu Shin-Han

Affiliations

Department of Plant Biology, Michigan State University, East Lansing, MI 48824.

Ecology, Evolutionary Biology, and Behavior Program, Michigan State University, East Lansing, MI 48824.

Publication information

Proc Natl Acad Sci U S A. 2019 Feb 5;116(6):2344-2353. doi: 10.1073/pnas.1817074116. Epub 2019 Jan 23.

DOI:10.1073/pnas.1817074116
PMID:30674669
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6369796/
Abstract

Plant specialized metabolism (SM) enzymes produce lineage-specific metabolites with important ecological, evolutionary, and biotechnological implications. Using Arabidopsis thaliana as a model, we identified distinguishing characteristics of SM and GM (general metabolism, traditionally referred to as primary metabolism) genes through a detailed study of features including duplication pattern, sequence conservation, transcription, protein domain content, and gene network properties. Analysis of multiple sets of benchmark genes revealed that SM genes tend to be tandemly duplicated, coexpressed with their paralogs, narrowly expressed at lower levels, less conserved, and less well connected in gene networks relative to GM genes. Although the values of each of these features significantly differed between SM and GM genes, any single feature was ineffective at predicting SM from GM genes. Using machine learning methods to integrate all features, a prediction model was established with a true positive rate of 87% and a true negative rate of 71%. In addition, 86% of known SM genes not used to create the machine learning model were predicted. We also demonstrated that the model could be further improved when we distinguished between SM, GM, and junction genes responsible for reactions shared by SM and GM pathways, indicating that topological considerations may further improve the SM prediction model. Application of the prediction model led to the identification of 1,220 genes with previously unknown functions, each assigned a confidence measure called an SM score, providing a global estimate of SM gene content in a plant genome.
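
The true positive and true negative rates quoted in the abstract can be illustrated with a minimal sketch. The abstract does not name the learning algorithm, the decision threshold, or the data format, so everything here is an assumption for illustration: the function name `tpr_tnr`, the toy labels and scores, and the 0.5 cutoff are hypothetical and are not taken from the paper.

```python
from typing import Sequence, Tuple


def tpr_tnr(
    labels: Sequence[int],
    scores: Sequence[float],
    threshold: float = 0.5,  # hypothetical cutoff, not from the paper
) -> Tuple[float, float]:
    """Return (true positive rate, true negative rate).

    labels: 1 for a benchmark SM gene, 0 for a benchmark GM gene.
    scores: a per-gene confidence in [0, 1], standing in for an SM score.
    """
    tp = fn = tn = fp = 0
    for label, score in zip(labels, scores):
        predicted_sm = score >= threshold
        if label == 1 and predicted_sm:
            tp += 1  # SM gene correctly called SM
        elif label == 1:
            fn += 1  # SM gene missed
        elif not predicted_sm:
            tn += 1  # GM gene correctly called GM
        else:
            fp += 1  # GM gene wrongly called SM
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one SM and one GM gene")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data only: four hypothetical SM genes, four hypothetical GM genes.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
tpr, tnr = tpr_tnr(labels, scores)
print(f"TPR = {tpr:.2f}, TNR = {tnr:.2f}")
```

Read this way, the reported 87% is the fraction of benchmark SM genes that the model called SM, and 71% is the fraction of benchmark GM genes that it called GM.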

Figures

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a046/6369796/7d08890a93fd/pnas.1817074116fig01.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a046/6369796/5136af702c13/pnas.1817074116fig02.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a046/6369796/d2cd21f7b2c1/pnas.1817074116fig03.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a046/6369796/099c8a754273/pnas.1817074116fig04.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a046/6369796/a0c3d55c4420/pnas.1817074116fig05.jpg
