TSPTFBS 2.0: trans-species prediction of transcription factor binding sites and identification of their core motifs in plants.

Authors

Cheng Huiling, Liu Lifen, Zhou Yuying, Deng Kaixuan, Ge Yuanxin, Hu Xuehai

Affiliation

College of Informatics, Hubei Key Laboratory of Agricultural Bioinformatics, Huazhong Agricultural University, Wuhan, Hubei, China.

Publication information

Front Plant Sci. 2023 May 9;14:1175837. doi: 10.3389/fpls.2023.1175837. eCollection 2023.

DOI:10.3389/fpls.2023.1175837

PMID:37229121

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10203575/

Abstract

INTRODUCTION

Promoter tiling deletion via genome editing is an emerging approach that is gaining popularity in plants. Identifying the precise positions of core motifs within plant gene promoters is in great demand, but these positions remain largely unknown. We previously developed TSPTFBS, a set of 265 transcription factor binding site (TFBS) prediction models, which cannot meet this demand for identifying core motifs.

METHODS

Here, we additionally introduced 104 maize and 20 rice TFBS datasets and used DenseNet to construct models on a large-scale dataset covering a total of 389 plant TFs. More importantly, we combined three biological interpretability methods (DeepLIFT, tiling deletion, and mutagenesis) to identify the potential core motifs of any given genomic region.
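
To make the two perturbation-based methods concrete, the following is a minimal sketch of in silico tiling deletion and saturation mutagenesis. It is not the authors' implementation: `toy_score` is a hypothetical stand-in for a trained DenseNet model, `TOY_MOTIF` is an illustrative hexamer, and masking a window with `N` (rather than physically removing it) is an assumption made here to keep the sequence length fixed. DeepLIFT is not sketched because it requires a trained network.

```python
from typing import Callable, List, Tuple

BASES = "ACGT"
TOY_MOTIF = "CACGTG"  # hypothetical stand-in motif, not taken from the paper


def toy_score(seq: str) -> float:
    """Stand-in for a trained model: best fraction of bases matching TOY_MOTIF."""
    k = len(TOY_MOTIF)
    best = 0.0
    for i in range(len(seq) - k + 1):
        matches = sum(a == b for a, b in zip(seq[i:i + k], TOY_MOTIF))
        best = max(best, matches / k)
    return best


def tiling_deletion(seq: str, score: Callable[[str], float],
                    window: int = 6, step: int = 1) -> List[Tuple[int, float]]:
    """Mask each window with N and record (start, drop in score)."""
    ref = score(seq)
    drops = []
    for start in range(0, len(seq) - window + 1, step):
        masked = seq[:start] + "N" * window + seq[start + window:]
        drops.append((start, ref - score(masked)))
    return drops


def saturation_mutagenesis(seq: str, score: Callable[[str], float]) -> List[float]:
    """For each position, the largest score drop over the alternative bases."""
    ref = score(seq)
    effects = []
    for i, original in enumerate(seq):
        drop = max(ref - score(seq[:i] + b + seq[i + 1:])
                   for b in BASES if b != original)
        effects.append(drop)
    return effects


# Toy promoter with the stand-in motif planted at positions 8-13.
promoter = "AAAAAAAA" + TOY_MOTIF + "AAAAAAAA"
deletion_drops = tiling_deletion(promoter, toy_score)
mutation_effects = saturation_mutagenesis(promoter, toy_score)
```

Windows or positions with the largest score drops are candidate core-motif locations. With a real model, `toy_score` would be replaced by a function that encodes the sequence and returns the predicted binding probability.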

RESULTS

DenseNet not only achieved greater predictive performance than baseline methods such as LS-GKM and MEME for the above 389 TFs from Arabidopsis, maize, and rice, but also performed better on trans-species prediction for a total of 15 TFs from six other plant species. Motif analysis based on TF-MoDISco and global importance analysis (GIA) further provided the biological implications of the core motifs identified by the three interpretability methods. Finally, we developed the TSPTFBS 2.0 pipeline, which integrates the 389 DenseNet-based TF binding models and the above three interpretability methods.
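
The idea behind global importance analysis can be sketched as follows: embed a candidate motif at a fixed position in many background sequences and average the change in the model's prediction. This is a simplified illustration, not the paper's pipeline: `toy_score` is a hypothetical stand-in for a trained model, and the uniform random backgrounds, sequence length, and embedding position are assumptions chosen for brevity.

```python
import random
from statistics import mean
from typing import Callable

TOY_MOTIF = "CACGTG"  # hypothetical stand-in motif, not taken from the paper


def toy_score(seq: str) -> float:
    """Stand-in for a trained model: best fraction of bases matching TOY_MOTIF."""
    k = len(TOY_MOTIF)
    best = 0.0
    for i in range(len(seq) - k + 1):
        matches = sum(a == b for a, b in zip(seq[i:i + k], TOY_MOTIF))
        best = max(best, matches / k)
    return best


def global_importance(motif: str, score: Callable[[str], float],
                      n_background: int = 200, length: int = 60,
                      position: int = 27, seed: int = 0) -> float:
    """Mean of score(background with motif embedded) - score(background)."""
    rng = random.Random(seed)  # fixed seed so repeated calls share backgrounds
    deltas = []
    for _ in range(n_background):
        background = "".join(rng.choice("ACGT") for _ in range(length))
        embedded = (background[:position] + motif
                    + background[position + len(motif):])
        deltas.append(score(embedded) - score(background))
    return mean(deltas)


gia_motif = global_importance(TOY_MOTIF, toy_score)
print(f"global importance of {TOY_MOTIF}: {gia_motif:.3f}")
```

Because the seed is fixed, calling `global_importance` with a control pattern reuses the same backgrounds, so the two averages can be compared directly.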

DISCUSSION

TSPTFBS 2.0 is implemented as a user-friendly web server (http://www.hzau-hulab.com/TSPTFBS/). It can provide important references for choosing editing targets in any given plant promoter and has great potential to provide reliable editing targets for genetic screen experiments in plants.
