Achterberg Tristan, de Jong Anne
Department of Molecular Genetics, Groningen, Biomolecular Sciences and Biotechnology Institute, University of Groningen, Nijenborgh 7, 9747 AG Groningen, the Netherlands.
NAR Genom Bioinform. 2025 Jan 7;7(1):lqae188. doi: 10.1093/nargab/lqae188. eCollection 2025 Mar.
σ54 serves as an unconventional sigma factor with a distinct mechanism of transcription initiation, which depends on the involvement of a transcription activator. σ54 is indispensable for orchestrating the transcription of genes crucial to nitrogen regulation, flagella biosynthesis, motility, chemotaxis and various other essential cellular processes. Currently, no comprehensive tools are available to determine σ54 promoters and regulons in bacterial genomes. Here, we report a σ54 promoter prediction method, ProPr54, based on a convolutional neural network trained on a set of 446 validated σ54 binding sites derived from 33 bacterial species. Model performance was tested and compared with respect to bacterial intergenic regions, demonstrating robust applicability. ProPr54 exhibits high performance when tested on various bacterial species, substantially outperforming other available σ54 regulon identification methods. Furthermore, analysis of bacterial genomes that have no experimentally validated σ54 binding sites demonstrates that the model generalizes. ProPr54 is the first reliable method for predicting σ54 binding sites, making it a valuable tool to support experimental studies on σ54. In conclusion, ProPr54 offers a reliable, broadly applicable tool for predicting σ54 promoters and regulon genes in bacterial genome sequences. A web server is freely accessible at http://propr54.molgenrug.nl.
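The abstract does not state how sequences are presented to the network. As a minimal sketch, assuming the common practice of one-hot encoding fixed-width windows taken from an intergenic region, the input preparation for a convolutional model might look like the following. The function names, the window width and the encoding scheme are illustrative assumptions, not details of ProPr54.

```python
# Hypothetical illustration only; this is NOT the ProPr54 implementation.
BASES = "ACGT"


def one_hot(seq):
    """Encode a DNA string as a list of 4-element rows in A, C, G, T order.

    Symbols outside ACGT (for example N) become an all-zero row.
    """
    return [[1 if base == b else 0 for b in BASES] for base in seq.upper()]


def sliding_windows(seq, width, step=1):
    """Yield (start, subsequence) for every full-width window in seq."""
    for start in range(0, len(seq) - width + 1, step):
        yield start, seq[start:start + width]


def encode_region(seq, width, step=1):
    """One-hot encode every window of a region (assumed intergenic)."""
    return [(start, one_hot(win)) for start, win in sliding_windows(seq, width, step)]
```

Each encoded window could then be scored by a trained classifier; the window width would have to match whatever input length that model expects, which is not given here.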