Predicting transcriptional responses to heat and drought stress from genomic features using a machine learning approach in rice.

Author information

Smet Dajo, Opdebeeck Helder, Vandepoele Klaas

Affiliations

Department of Plant Biotechnology and Bioinformatics, Ghent University, Ghent, Belgium.

Center for Plant Systems Biology, Vlaams Instituut voor Biotechnologie (VIB), Ghent, Belgium.

Publication information

Front Plant Sci. 2023 Jul 17;14:1212073. doi: 10.3389/fpls.2023.1212073. eCollection 2023.

DOI:10.3389/fpls.2023.1212073

PMID:37528982

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10390317/

Abstract

Plants have evolved various mechanisms to adapt to adverse environmental stresses, such as the modulation of gene expression. Expression of stress-responsive genes is controlled by specific regulators, including transcription factors (TFs), that bind to sequence-specific binding sites, which are key components of cis-regulatory elements and regulatory networks. Our understanding of the underlying regulatory code remains, however, incomplete. Recent studies have shown that, by training machine learning (ML) algorithms on genomic sequence features, it is possible to predict which genes will transcriptionally respond to a specific stress. By identifying the most important features for gene expression prediction, these trained ML models can, in theory, help to further elucidate the regulatory code underlying the transcriptional response to abiotic stress. Here, we trained random forest ML models to predict gene expression in rice (Oryza sativa) in response to heat or drought stress. Apart from thoroughly assessing model performance and robustness across various input training data, we evaluated the importance of promoter and gene body sequence features for training ML models. The use of enriched promoter oligomers, complementing known TF binding sites, allowed us to gain novel insights into DNA motifs contributing to the stress regulatory code. By comparing genomic feature importance scores for drought and heat stress over time, we identified general and stress-specific genomic features contributing to the performance of the learned models, as well as their temporal variation. This study provides a solid foundation for building and interpreting ML models that accurately predict transcriptional responses, and enables novel insights into biological sequence features that are important for abiotic stress responses.
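As a rough illustration of how promoter oligomers might be turned into model inputs, the sketch below encodes each promoter as a binary presence/absence vector over all DNA k-mers. It is a minimal example under stated assumptions, not the authors' pipeline: the function name `kmer_presence_features`, the choice of k = 6, and the toy sequences are all hypothetical, and the enrichment filtering and known TF binding site features described in the abstract are not reproduced here.

```python
from itertools import product


def kmer_presence_features(promoters, k=6):
    """Encode each promoter as a 0/1 vector marking which DNA k-mers occur in it.

    `promoters` maps a (hypothetical) gene ID to its promoter sequence.
    Returns the ordered k-mer vocabulary and a dict of gene ID -> feature row.
    """
    # Enumerate every k-mer in a fixed order so all genes share the same columns.
    vocabulary = ["".join(p) for p in product("ACGT", repeat=k)]
    column = {kmer: i for i, kmer in enumerate(vocabulary)}

    features = {}
    for gene_id, seq in promoters.items():
        seq = seq.upper()
        row = [0] * len(vocabulary)
        # Slide a window of width k along the promoter.
        for start in range(len(seq) - k + 1):
            col = column.get(seq[start:start + k])
            # Windows containing N or other non-ACGT symbols are skipped.
            if col is not None:
                row[col] = 1
        features[gene_id] = row
    return vocabulary, features


# Toy input (made-up sequences, for illustration only).
toy_promoters = {
    "gene_a": "ACGTGGC",
    "gene_b": "TTTTACGTGG",
    "gene_c": "NNNNNNN",
}
vocab, X = kmer_presence_features(toy_promoters, k=6)
shared = vocab.index("ACGTGG")
```

A matrix built this way, paired with labels marking which genes respond to a given stress, could then be passed to a random forest implementation such as scikit-learn's `RandomForestClassifier`, whose `feature_importances_` attribute gives one possible way to rank oligomers.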

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbeb/10390317/faf23ea29711/fpls-14-1212073-g001.jpg

Similar articles

Synergistic regulatory networks mediated by microRNAs and transcription factors under drought, heat and salt stresses in Oryza Sativa spp.

Gene. 2015 Jan 25;555(2):127-39. doi: 10.1016/j.gene.2014.10.054. Epub 2014 Oct 31.

Five novel transcription factors as potential regulators of OsNHX1 gene expression in a salt tolerant rice genotype.

Plant Mol Biol. 2017 Jan;93(1-2):61-77. doi: 10.1007/s11103-016-0547-7. Epub 2016 Oct 20.

Using Network-Based Machine Learning to Predict Transcription Factors Involved in Drought Resistance.

Front Genet. 2021 Jun 24;12:652189. doi: 10.3389/fgene.2021.652189. eCollection 2021.

The cis-regulatory codes of response to combined heat and drought stress in Arabidopsis thaliana.

NAR Genom Bioinform. 2020 Jul 21;2(3):lqaa049. doi: 10.1093/nargab/lqaa049. eCollection 2020 Sep.

Identification of abiotic stress miRNA transcription factor binding motifs (TFBMs) in rice.

Gene. 2013 Nov 15;531(1):15-22. doi: 10.1016/j.gene.2013.08.060. Epub 2013 Aug 28.

Microarray Analysis of Rice d1 (RGA1) Mutant Reveals the Potential Role of G-Protein Alpha Subunit in Regulating Multiple Abiotic Stresses Such as Drought, Salinity, Heat, and Cold.

Front Plant Sci. 2016 Jan 28;7:11. doi: 10.3389/fpls.2016.00011. eCollection 2016.

Integrated ATAC-Seq and RNA-Seq Data Analysis to Reveal OsbZIP14 Function in Rice in Response to Heat Stress.

Int J Mol Sci. 2023 Mar 15;24(6):5619. doi: 10.3390/ijms24065619.

Differential quantitative regulation of specific gene groups and pathways under drought stress in rice.

Genomics. 2019 Dec;111(6):1699-1712. doi: 10.1016/j.ygeno.2018.11.024. Epub 2018 Nov 26.

The transcriptional regulatory network in the drought response and its crosstalk in abiotic stress responses including drought, cold, and heat.

Front Plant Sci. 2014 May 16;5:170. doi: 10.3389/fpls.2014.00170. eCollection 2014.

Cited by

Bisphenol A causes melatonin biosynthesis epigenetic reprogramming of melatonin biosynthesis genes in Arabidopsis thaliana.

Commun Biol. 2025 Jul 30;8(1):1128. doi: 10.1038/s42003-025-08575-x.

Gaining insights into epigenetic memories through artificial intelligence and omics science in plants.

J Integr Plant Biol. 2025 Sep;67(9):2320-2349. doi: 10.1111/jipb.13953. Epub 2025 Jun 24.

Using supervised machine-learning approaches to understand abiotic stress tolerance and design resilient crops.

Philos Trans R Soc Lond B Biol Sci. 2025 May 29;380(1927):20240252. doi: 10.1098/rstb.2024.0252.

Predicting gene expression responses to environment in Arabidopsis thaliana using natural variation in DNA sequence.

bioRxiv. 2025 Mar 4:2024.04.25.591174. doi: 10.1101/2024.04.25.591174.

Deep learning the cis-regulatory code for gene expression in selected model plants.

Nat Commun. 2024 Apr 25;15(1):3488. doi: 10.1038/s41467-024-47744-0.
