Graduate School of Environmental and Life Science, Okayama University, Okayama 700-8530, Japan.
JST, PRESTO, Kawaguchi-Shi, Saitama 332-0012, Japan.
Plant Cell. 2022 May 24;34(6):2174-2187. doi: 10.1093/plcell/koac079.
In the evolutionary history of plants, variation in cis-regulatory elements (CREs) that diversifies gene expression has played a central role in driving the evolution of lineage-specific traits. However, predicting expression behavior from CRE patterns, and thus harnessing CREs properly, remains difficult, mainly because the underlying biological processes are complex. In this study, we used cistrome datasets and explainable convolutional neural network (CNN) frameworks to predict genome-wide expression patterns in tomato (Solanum lycopersicum) fruit from the DNA sequences in gene regulatory regions. By fixing the effects of trans-acting factors using single cell-type spatiotemporal transcriptome data for the response variables, we developed a prediction model for crucial expression patterns in the initiation of tomato fruit ripening. Feature visualization of the CNNs identified nucleotide residues critical to the objective expression pattern in each gene, and their effects were validated experimentally in ripening tomato fruit. This cis-decoding framework will not only contribute to the understanding of the regulatory networks derived from CREs and transcription factor interactions, but also provide a flexible means of designing alleles for optimized expression.
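To make the idea of predicting from sequence and then locating critical nucleotides concrete, the following is a minimal Python sketch, not the model described in the abstract. It assumes a single hand-written convolutional filter in place of a trained CNN, and single-base in silico mutagenesis as a simple stand-in for feature visualization. The motif `CACGTG`, the toy promoter string, and all function names are hypothetical and chosen only for illustration.

```python
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as rows of 4 floats (A, C, G, T); unknown bases become all zeros."""
    return [[1.0 if b == base else 0.0 for base in BASES] for b in seq.upper()]


def conv_scan(encoded, kernel):
    """Slide one filter (len(kernel) x 4) along the sequence without padding."""
    width = len(kernel)
    scores = []
    for start in range(len(encoded) - width + 1):
        total = 0.0
        for i in range(width):
            for j in range(4):
                total += encoded[start + i][j] * kernel[i][j]
        scores.append(total)
    return scores


def predict(seq, kernel):
    """Toy prediction: ReLU followed by global max pooling over one filter."""
    scores = conv_scan(one_hot(seq), kernel)
    return max(0.0, max(scores)) if scores else 0.0


def mutagenesis_importance(seq, kernel):
    """Per position, the largest drop in the prediction over all single-base substitutions."""
    seq = seq.upper()
    reference = predict(seq, kernel)
    importance = []
    for pos, ref_base in enumerate(seq):
        drops = [
            reference - predict(seq[:pos] + alt + seq[pos + 1:], kernel)
            for alt in BASES
            if alt != ref_base
        ]
        importance.append(max(drops))
    return importance


# Hypothetical filter: +1 for a base matching the assumed motif, -1 otherwise.
MOTIF = "CACGTG"
KERNEL = [[1.0 if base == b else -1.0 for base in BASES] for b in MOTIF]

# Hypothetical regulatory sequence containing the motif once.
promoter = "TTACACGTGAAT"
score = predict(promoter, KERNEL)
importance = mutagenesis_importance(promoter, KERNEL)
critical = [i for i, drop in enumerate(importance) if drop > 0]

if __name__ == "__main__":
    print("prediction:", score)
    print("critical positions:", critical)
```

In this toy setting, only the positions covered by the motif change the prediction when mutated, which mirrors in miniature the kind of per-gene, per-nucleotide readout the abstract attributes to feature visualization. A trained network would learn many filters from data rather than use a fixed one.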