Witte Paz Mathias, Vogel Thomas, Nieselt Kay
Institute for Bioinformatics and Medical Informatics, Department of Computer Science, University of Tübingen, Sand 14, Tübingen 72076, Germany.
NAR Genom Bioinform. 2024 Dec 18;6(4):lqae168. doi: 10.1093/nargab/lqae168. eCollection 2024 Dec.
RNA-seq and its 5'-enrichment methods for prokaryotes have enabled the precise identification of transcription start sites (TSSs), improving gene expression analysis. Computational methods are applied to these data to identify TSSs and classify them based on proximal annotated genes. Some TSSs cannot be classified at all (orphan TSSs), while others are found on the reverse strand of known genes (antisense TSSs) but are not associated with the direct transcription of any known gene. Here, we introduce TSS-Captur, a novel pipeline that uses computational approaches to characterize genomic regions starting from experimentally confirmed but unclassified TSSs. By analyzing TSS data, TSS-Captur characterizes unclassified signals, complementing prokaryotic genome annotation tools. TSS-Captur categorizes extracted transcripts as either messenger RNA for genes with coding potential or non-coding RNA (ncRNA) for non-translated genes. Additionally, it predicts the transcription termination site for each putative transcript. For ncRNA genes, the secondary structure is computed. Moreover, all putative promoter regions are analyzed to identify enriched motifs. An interactive report allows seamless data exploration. We validated TSS-Captur with a dataset and characterized unlabeled ncRNAs in [specific species]. TSS-Captur is available both as a web application and as a command-line tool.
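The distinction between classified, antisense, and orphan TSSs described in the abstract can be illustrated with a minimal Python sketch. This is not TSS-Captur's implementation: the `Gene` record, the `classify_tss` function, the "upstream" and "internal" labels, and both window sizes are hypothetical choices made only for illustration, and the abstract does not state which thresholds the upstream classification tools use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Gene:
    start: int   # 1-based, inclusive, start <= end
    end: int     # 1-based, inclusive
    strand: str  # "+" or "-"


# Assumed window sizes (hypothetical values, not taken from the paper).
UPSTREAM_WINDOW = 300   # max distance from a TSS to the 5' end of a gene it may drive
ANTISENSE_WINDOW = 100  # flank around a gene still counted as antisense


def classify_tss(pos: int, strand: str, genes: list[Gene]) -> str:
    """Return 'upstream', 'internal', 'antisense', or 'orphan' for one TSS."""
    antisense = False
    for g in genes:
        if g.strand == strand:
            # Same strand: the TSS lies inside the gene or just upstream of it.
            if g.start <= pos <= g.end:
                return "internal"
            # Distance from the TSS to the gene's 5' end, in the direction
            # of transcription (positive means the TSS is upstream).
            dist = g.start - pos if strand == "+" else pos - g.end
            if 0 < dist <= UPSTREAM_WINDOW:
                return "upstream"
        elif g.start - ANTISENSE_WINDOW <= pos <= g.end + ANTISENSE_WINDOW:
            # Opposite strand, overlapping or flanking a known gene.
            antisense = True
    # Only TSSs with no same-strand association reach this point.
    return "antisense" if antisense else "orphan"


if __name__ == "__main__":
    annotation = [Gene(1000, 2000, "+"), Gene(5000, 6000, "-")]
    for pos, strand in [(900, "+"), (1500, "-"), (3500, "+")]:
        print(pos, strand, classify_tss(pos, strand, annotation))
```

In this sketch, only the TSSs labeled "antisense" or "orphan" would be the kind of unclassified signal that a pipeline such as TSS-Captur takes as its starting point.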