Key Laboratory of Aquatic Genomics, Ministry of Agriculture, CAFS Key Laboratory of Aquatic Genomics and Beijing Key Laboratory of Fishery Biotechnology, Chinese Academy of Fishery Sciences, Beijing, 100141, China.
College of Fisheries and Life Science, Shanghai Ocean University, Shanghai, 201306, China.
BMC Genomics. 2018 Mar 2;19(1):175. doi: 10.1186/s12864-018-4567-3.
Obtaining complete gene structures is a major goal of genome assembly. However, some gene regions remain fragmented in both low-quality and high-quality assemblies, so new approaches are needed to recover them. Genomes are widely transcribed, generating messenger and non-coding RNAs. These widespread transcripts can be used to scaffold genomes and complete transcribed regions.
We present P_RNA_scaffolder, a fast and accurate tool that uses paired-end RNA-sequencing reads to scaffold genomes. The tool aims to improve the completeness of both protein-coding and non-coding genes. When it was applied to scaffold human contigs, the structures of both protein-coding genes and circular RNAs were almost completely recovered and were equivalent to those in a complete genome, especially for long proteins and long circular RNAs. In tests on various species, P_RNA_scaffolder exhibited higher speed and efficiency than the existing state-of-the-art scaffolders. The tool also improved the contiguity of genome assemblies generated by current mate-pair scaffolding and by third-generation single-molecule sequencing assembly.
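The general idea of scaffolding with paired-end RNA-seq reads can be illustrated with a minimal sketch. This is not the P_RNA_scaffolder implementation, whose algorithm the abstract does not describe; it only shows the assumed core step of counting read pairs whose two mates align to different contigs. The function name `count_contig_links`, the `min_support` threshold, and the contig names are hypothetical, and a real scaffolder would also have to handle contig orientation, ordering, and intron-spanning alignments.

```python
from collections import Counter


def count_contig_links(pairs, min_support=2):
    """Count read pairs that bridge two different contigs.

    pairs: iterable of (contig_of_mate1, contig_of_mate2); None marks an
           unaligned mate. In practice these would be parsed from an
           alignment file, which is outside the scope of this sketch.
    min_support: hypothetical threshold; links backed by fewer pairs
                 are discarded as likely noise.
    """
    links = Counter()
    for contig1, contig2 in pairs:
        # Skip pairs with an unaligned mate or both mates on one contig.
        if contig1 is None or contig2 is None or contig1 == contig2:
            continue
        # Store the contig pair in a fixed order so A-B and B-A are merged.
        links[tuple(sorted((contig1, contig2)))] += 1
    return {pair: n for pair, n in links.items() if n >= min_support}


# Toy input: two pairs bridge ctgA and ctgB, one pair bridges ctgB and ctgC.
pairs = [
    ("ctgA", "ctgB"),
    ("ctgB", "ctgA"),
    ("ctgA", "ctgA"),
    ("ctgB", "ctgC"),
    (None, "ctgC"),
]
supported = count_contig_links(pairs)
print(supported)
```

With the toy input, only the ctgA and ctgB link reaches the assumed support threshold, so it is the only candidate join that a downstream ordering step would consider.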
P_RNA_scaffolder can improve the contiguity of genome assemblies and benefit gene prediction. This tool is available at http://www.fishbrowser.org/software/P_RNA_scaffolder .
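Assembly contiguity is commonly summarized with the N50 statistic, although the abstract does not state which metric the authors used. The sketch below assumes N50 and uses made-up sequence lengths purely to show how joining two contigs into one scaffold raises the value; the function name `n50` and the numbers are illustrative only.

```python
def n50(lengths):
    """Return the N50: the length L such that sequences of length >= L
    together cover at least half of the total assembly length."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty assembly


# Toy lengths (arbitrary units), not data from the paper.
contigs_before = [40, 30, 20, 10]
# Same assembly after the 40 and 30 contigs are joined into one scaffold
# (gap padding is ignored in this sketch).
scaffolds_after = [70, 20, 10]

print(n50(contigs_before), n50(scaffolds_after))
```

In this toy case the total length is unchanged, but the N50 rises from 30 to 70 because one join places more than half of the assembly in a single sequence.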