Van de Velde Jan, Van Bel Michiel, Vaneechoutte Dries, Vandepoele Klaas
Department of Plant Systems Biology, Vlaams Instituut voor Biotechnologie, B-9052 Ghent, Belgium (J.V.d.V., M.V.B., D.V., K.V.); and Department of Plant Biotechnology and Bioinformatics, Ghent University, B-9052 Ghent, Belgium (J.V.d.V., M.V.B., D.V., K.V.).
Plant Physiol. 2016 Aug;171(4):2586-98. doi: 10.1104/pp.16.00821. Epub 2016 Jun 3.
Transcription factors (TFs) regulate gene expression by binding cis-regulatory elements, whose identification remains challenging because of the large number of nonfunctional TF binding sites. Powerful comparative genomics methods, such as phylogenetic footprinting, can be used to detect conserved noncoding sequences (CNSs), which are functionally constrained and can greatly reduce the number of false-positive elements. In this study, we applied a phylogenetic footprinting approach to identify CNSs in 10 dicot plants, yielding 1,032,291 CNSs associated with 243,187 genes. To annotate CNSs with TF binding sites, we used binding site information for 642 TFs from 35 TF families in Arabidopsis (Arabidopsis thaliana). In three species, the identified CNSs were evaluated using TF chromatin immunoprecipitation sequencing data, resulting in significant overlap for the majority of data sets. To identify ultraconserved CNSs, we included genomes of additional plant families and identified 715 binding sites for 501 genes conserved in dicots, monocots, mosses, and green algae. Additionally, we found that genes that are part of conserved mini-regulons have a higher coherence in their expression profile than other divergent gene pairs. All identified CNSs were integrated in the PLAZA 3.0 Dicots comparative genomics platform (http://bioinformatics.psb.ugent.be/plaza/versions/plaza_v3_dicots/) together with new functionalities facilitating the exploration of conserved cis-regulatory elements and their associated genes. The availability of this data set in a user-friendly platform enables the exploration of functional noncoding DNA to study gene regulation in a variety of plant species, including crops.
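The evaluation of CNSs against chromatin immunoprecipitation sequencing data rests on measuring how many CNSs coincide with TF-bound regions. The abstract does not state how overlap or its significance was computed, so the following is only a minimal sketch of the counting step, assuming half-open (start, end) coordinates on a single chromosome; the function names and the example coordinates are hypothetical and do not come from the study.

```python
from bisect import bisect_right


def merge_intervals(intervals):
    """Merge overlapping or touching half-open intervals (start, end)."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of adding a new one
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def count_overlapping(cns, peaks):
    """Count CNS intervals sharing at least one base with any peak."""
    merged = merge_intervals(peaks)
    starts = [s for s, _ in merged]
    hits = 0
    for start, end in cns:
        # Index of the last merged peak that starts before the CNS ends
        i = bisect_right(starts, end - 1) - 1
        if i >= 0 and merged[i][1] > start:
            hits += 1
    return hits


# Hypothetical coordinates, for illustration only
cns_regions = [(10, 20), (50, 60), (100, 110)]
chip_peaks = [(15, 30), (25, 40), (105, 120)]

n_hits = count_overlapping(cns_regions, chip_peaks)
fraction = n_hits / len(cns_regions)
print(n_hits, round(fraction, 2))
```

Assessing whether such a fraction is significant would additionally need a background model, for example repeating the count with randomly repositioned intervals; that step is left out here because the source does not describe it.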