Faculty of Computer Engineering, Federal University of Pará campus Tucuruí (CAMTUC-UFPA), Pará, Brazil.
Federal Rural University of Amazonia campus Tomé-Açu (UFRA), Pará, Brazil.
PLoS One. 2018 Oct 26;13(10):e0206000. doi: 10.1371/journal.pone.0206000. eCollection 2018.
The availability of biological information in public databases has increased exponentially. To ensure the accuracy of this information and avoid disseminating errors, researchers have adopted several methods and refinements; for example, several automated tools are available for annotation. However, manual curation remains important for verifying and enriching biological information. Additionally, genome finishing is complex, which has led to increased deposition of draft genomes. Because their genomic content is incomplete, using drafts introduces bias into other omics analyses. Complete genomes are affected as well: for example, genomes generated by reference-based assembly may omit new products present in the newly sequenced genome, and errors or bias can arise during the assembly process. We therefore developed ImproveAssembly, a tool that identifies new products missing from genomic sequences and can be applied to both complete and draft genomes. The identified products can improve the annotation of complete and draft genomes while significantly reducing bias when the information is used in other omics analyses.