Arun P V Parvati Sai, Prakash Jogadhenu S S
Department of Plant Sciences, School of Life Sciences, University of Hyderabad, Hyderabad, 500046, India.
Department of Biotechnology and Bioinformatics, School of Life Sciences, University of Hyderabad, P. O. Central University, Hyderabad, 500046, India.
3 Biotech. 2016 Jun;6(1):74. doi: 10.1007/s13205-016-0363-4. Epub 2016 Feb 16.
UpCoT is a pipeline tool that automates the series of steps involved in predicting cis-regulatory elements. UpCoT first identifies orthologs of each gene in the target genome using bidirectional best BLAST hits against the reference genomes, then identifies potential orthologous transcriptional units using intergenic distance. Finally, it generates FASTA files containing the upstream sequences of the orthologous transcriptional units for each gene in the target genome. The inputs to UpCoT are protein sequence files (*.faa), genome sequence files (*.fna), and gene coordinate files (*.ptt) for the target and reference genomes. The clustered upstream DNA sequences can be passed to motif prediction tools such as MEME, BioProspector, Gibbs Motif Sampler, or MDscan to predict conserved DNA elements. We tested the performance of UpCoT using the genome of Synechocystis sp. PCC 6803 as the target and 13 different cyanobacterial genomes as references. The clustered upstream sequences generated by UpCoT for groES, ycf24, and nirA were used for cis-regulatory element prediction, and the results were consistent with the experimentally identified cis-regulatory elements. Therefore, UpCoT is a reliable, automated pipeline package for predicting orthologs, orthologous transcriptional units, and orthologous upstream sequences of a selected prokaryotic genome. UpCoT can be downloaded from http://jssplab.uohyd.ac.in/upcot/.
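The three steps described in the abstract (bidirectional best hits, grouping genes into transcriptional units by intergenic distance, and locating the upstream region) can be illustrated with a minimal Python sketch. This is not UpCoT's code: the function names, the tuple layouts, the 50 bp gap threshold, and the 300 bp upstream length are all assumptions chosen for illustration, since the abstract does not state the actual cutoffs or file-parsing details.

```python
def best_hits(rows):
    """Map each query to its highest-scoring subject.

    rows: iterable of (query, subject, bitscore) tuples, for example
    parsed from tabular BLAST output (layout assumed here).
    """
    best = {}
    for query, subject, score in rows:
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def bidirectional_best_hits(target_vs_ref, ref_vs_target):
    """Keep target/reference pairs that are each other's best hit."""
    forward = best_hits(target_vs_ref)
    reverse = best_hits(ref_vs_target)
    return {t: r for t, r in forward.items() if reverse.get(r) == t}


def transcription_units(genes, max_gap=50):
    """Group adjacent same-strand genes whose intergenic gap is small.

    genes: list of (name, start, end, strand) with 1-based inclusive
    coordinates. max_gap is a hypothetical threshold in base pairs.
    """
    units = []
    for gene in sorted(genes, key=lambda g: g[1]):
        _, start, _, strand = gene
        if units:
            prev = units[-1][-1]
            same_strand = prev[3] == strand
            if same_strand and start - prev[2] - 1 <= max_gap:
                units[-1].append(gene)
                continue
        units.append([gene])
    return units


def upstream_region(unit, length=300):
    """Return (start, end, strand) of the region upstream of a unit.

    For "+" units the region precedes the leftmost gene; for "-" units
    it follows the rightmost gene. The end is not clipped to the genome
    length, which a real implementation would need to handle.
    """
    strand = unit[0][3]
    if strand == "+":
        first_start = unit[0][1]
        return (max(1, first_start - length), first_start - 1, "+")
    last_end = unit[-1][2]
    return (last_end + 1, last_end + length, "-")
```

In this sketch a gene pair counts as orthologous only when the best hit holds in both search directions, and the upstream region is taken once per transcriptional unit rather than once per gene, which mirrors the idea of clustering upstream sequences of orthologous transcriptional units before motif prediction.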