Achieving Accurate Sequence and Annotation Data for Caulobacter vibrioides CB13.

Author information

Berrios Louis, Ely Bert

Affiliation

Department of Biological Sciences, University of South Carolina, Columbia, SC, 29208, USA.

Publication information

Curr Microbiol. 2018 Dec;75(12):1642-1648. doi: 10.1007/s00284-018-1572-3. Epub 2018 Sep 26.

Abstract

Annotated sequence data are instrumental in nearly all realms of biology. However, the advent of next-generation sequencing has rapidly facilitated an imbalance between accurate sequence data and accurate annotation data. To increase the annotation accuracy of the Caulobacter vibrioides CB13b1a (CB13) genome, we compared the PGAP and RAST annotations of the CB13 genome. A total of 64 unique genes were identified in the PGAP annotation that were either completely or partially absent in the RAST annotation, and a total of 16 genes were identified in the RAST annotation that were not included in the PGAP annotation. Moreover, PGAP identified 73 frameshifted genes and 22 genes with an internal stop. In contrast, RAST annotated the larger segment of these frameshifted genes without indicating that a change in reading frame may have occurred. The RAST annotation did not include any genes with internal stop codons, since it chose start codons that were after the internal stop. To confirm the discrepancies between the two annotations and verify the accuracy of the CB13 genome sequence data, we re-sequenced and re-annotated the entire genome and obtained an identical sequence, except in a small number of homopolymer regions. A genome sequence comparison between the two versions allowed us to determine the correct number of bases in each homopolymer region, which eliminated frameshifts for 31 genes annotated as frameshifted genes and removed 24 pseudogenes from the PGAP annotation. Both annotation systems correctly identified genes that were missed by the other system. In addition, PGAP identified conserved gene fragments that represented the beginning of genes, but it employed no corrective method to adjust the reading frame of frameshifted genes or the start sites of genes harboring an internal stop codon. In doing so, the PGAP annotation identified a large number of pseudogenes, which may reflect evolutionary history but likely do not produce gene products. These results demonstrate that re-sequencing and annotation comparisons can be used to increase the accuracy of genomic data and the corresponding gene annotation.
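
As an illustration of the kind of annotation comparison described in the abstract, the following is a minimal sketch, not the authors' method. It assumes each annotation has already been reduced to a list of gene coordinates, and it matches genes between two annotations by strand and stop-end coordinate, so that genes sharing a stop but differing in start site are reported separately from genes found in only one annotation. The `Gene` type, the function names, and the toy coordinates are all hypothetical.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    start: int   # leftmost coordinate on the reference (hypothetical toy data)
    end: int     # rightmost coordinate on the reference
    strand: str  # "+" or "-"


def stop_key(gene: Gene) -> tuple[str, int]:
    """Key a gene by its strand and the coordinate of its 3' (stop) end."""
    return (gene.strand, gene.end if gene.strand == "+" else gene.start)


def compare_annotations(a: list[Gene], b: list[Gene]) -> dict[str, list]:
    """Classify genes as identical, same stop but different start, or unique."""
    a_by_stop = {stop_key(g): g for g in a}
    b_by_stop = {stop_key(g): g for g in b}
    result: dict[str, list] = {
        "identical": [], "different_start": [], "only_a": [], "only_b": [],
    }
    for key, gene in a_by_stop.items():
        other = b_by_stop.get(key)
        if other is None:
            result["only_a"].append(gene)
        elif other == gene:
            result["identical"].append(gene)
        else:
            # Same stop end, so the two calls differ only in the chosen start.
            result["different_start"].append((gene, other))
    result["only_b"] = [g for k, g in b_by_stop.items() if k not in a_by_stop]
    return result


# Toy example: annotation_a and annotation_b stand in for two gene call sets.
annotation_a = [Gene(100, 400, "+"), Gene(500, 900, "+"), Gene(1000, 1300, "-")]
annotation_b = [Gene(100, 400, "+"), Gene(650, 900, "+"), Gene(1500, 1800, "-")]
report = compare_annotations(annotation_a, annotation_b)
for category, genes in report.items():
    print(category, len(genes))
```

Matching on the stop end is a simplifying assumption: a frameshifted gene may end at a different stop codon in each annotation, in which case this sketch would list it as unique to each set and the pair would need manual inspection.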
