Tuning promoter boundaries improves regulatory motif discovery in nonmodel plants: the peach example.

Author information

Ksouri Najla, Castro-Mondragón Jaime A, Montardit-Tarda Francesc, van Helden Jacques, Contreras-Moreira Bruno, Gogorcena Yolanda

Affiliations

Laboratory of Genomics, Genetics and Breeding of Fruits and Grapevine, Estación Experimental de Aula Dei-Consejo Superior de Investigaciones Científicas, Zaragoza, Spain.

Aix-Marseille Univ, INSERM UMR_S 1090, Theory and Approaches of Genome Complexity (TAGC), F-13288 Marseille, France.

Publication information

Plant Physiol. 2021 Apr 2;185(3):1242-1258. doi: 10.1093/plphys/kiaa091.

Abstract

The identification of functional elements encoded in plant genomes is necessary to understand gene regulation. Although much attention has been paid to model species like Arabidopsis (Arabidopsis thaliana), little is known about regulatory motifs in other plants. Here, we describe a bottom-up approach for de novo motif discovery using peach (Prunus persica) as an example. These predictions require pre-computed gene clusters grouped by their expression similarity. After optimizing the boundaries of proximal promoter regions, two motif discovery algorithms from RSAT::Plants (http://plants.rsat.eu) were tested (oligo and dyad analysis). Overall, 18 out of 45 co-expressed modules were enriched in motifs typical of well-known transcription factor (TF) families (bHLH, bZip, BZR, CAMTA, DOF, E2FE, AP2-ERF, Myb-like, NAC, TCP, and WRKY) and a few uncharacterized motifs. Our results indicate that small modules and a promoter window of [-500 bp, +200 bp] relative to the transcription start site (TSS) maximize the number of motifs found and reduce low-complexity signals in peach. The distribution of discovered regulatory sites was unbalanced, as they accumulated around the TSS. This approach was benchmarked by testing two different expression-based clustering algorithms (network-based and hierarchical) and, as control, genes grouped for harboring ChIPseq peaks of the same Arabidopsis TF. The method was also verified on maize (Zea mays), a species with a large genome. In summary, this article presents a glimpse of the peach regulatory components at genome scale and provides a general protocol that can be applied to other species. A Docker software container is released to facilitate the reproduction of these analyses.
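To make the promoter window concrete, the following is a minimal Python sketch of how a [-500 bp, +200 bp] window around a TSS could be turned into strand-aware genomic coordinates and a sequence. It is not the authors' pipeline, which relies on RSAT::Plants; the function names `promoter_window` and `extract_promoter` are hypothetical, and the sketch assumes 1-based inclusive coordinates with the offsets added directly to the TSS position.

```python
from typing import Tuple

# Translation table for complementing DNA (assumption: plain ACGT alphabet).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def promoter_window(tss: int, strand: str,
                    upstream: int = 500, downstream: int = 200) -> Tuple[int, int]:
    """Return 1-based inclusive (start, end) of [-upstream, +downstream] around a TSS.

    On the minus strand, "upstream" lies at higher genomic coordinates,
    so the two offsets swap sides.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream
    elif strand == "-":
        start, end = tss - downstream, tss + upstream
    else:
        raise ValueError(f"strand must be '+' or '-', got {strand!r}")
    # Clip at the start of the chromosome.
    return max(1, start), end


def extract_promoter(chrom_seq: str, tss: int, strand: str,
                     upstream: int = 500, downstream: int = 200) -> str:
    """Slice the promoter window out of a chromosome sequence, 5'->3' on the gene strand."""
    start, end = promoter_window(tss, strand, upstream, downstream)
    end = min(end, len(chrom_seq))  # clip at the end of the chromosome
    seq = chrom_seq[start - 1:end]
    if strand == "-":
        # Reverse-complement so the sequence reads in the gene's orientation.
        seq = seq.translate(_COMPLEMENT)[::-1]
    return seq
```

With these defaults, a plus-strand gene whose TSS is at position 1000 yields the window (500, 1200), while a minus-strand gene at the same TSS yields (800, 1500). Exposing `upstream` and `downstream` as parameters mirrors the idea of tuning the boundaries per species rather than fixing them.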
