Nikolaichik Yevgeny, Damienikan Aliaksandr U
Department of Molecular Biology, Belarusian State University, Minsk, Belarus.
PeerJ. 2016 May 24;4:e2056. doi: 10.7717/peerj.2056. eCollection 2016.
The majority of bacterial genome annotations are currently automated and based on a 'gene by gene' approach. Regulatory signals and operon structures are rarely taken into account, which often results in incomplete and even incorrect gene function assignments. Here we present SigmoID, a cross-platform (OS X, Linux and Windows) open-source application that aims to simplify the identification of transcription regulatory sites (promoters, transcription factor binding sites and terminators) in bacterial genomes and to assist in correcting annotations in accordance with regulatory information. SigmoID combines a user-friendly graphical interface to well-known command line tools with a genome browser for visualising regulatory elements in genomic context. Integrated access to online databases with regulatory information (RegPrecise and RegulonDB) and web-based search engines speeds up genome analysis and simplifies correction of genome annotation. We demonstrate some features of SigmoID by constructing a series of regulatory protein binding site profiles for two groups of bacteria: Soft Rot Enterobacteriaceae (Pectobacterium and Dickeya spp.) and Pseudomonas spp. Furthermore, we inferred over 900 transcription factor binding sites and alternative sigma factor promoters in the annotated genome of Pectobacterium atrosepticum. These regulatory signals control putative transcription units covering about 40% of the P. atrosepticum chromosome. Reviewing the annotation in cases where it did not fit the regulatory information allowed us to correct product and gene names for over 300 loci.