A post-assembly genome-improvement toolkit (PAGIT) to obtain annotated genomes from contigs.

Affiliation

Wellcome Trust Sanger Institute, Wellcome Trust Genome Campus, Cambridge, UK.

Publication information

Nat Protoc. 2012 Jun 7;7(7):1260-84. doi: 10.1038/nprot.2012.068.

Abstract

Genome projects now produce draft assemblies within weeks owing to advanced high-throughput sequencing technologies. For milestone projects such as Escherichia coli or Homo sapiens, teams of scientists were employed to manually curate and finish the genomes to a high standard. This is no longer feasible for most projects, and genome quality is generally much lower as a result. This protocol describes software (PAGIT) that is used to improve the quality of draft genomes. It offers flexible functionality to close gaps in scaffolds, correct base errors in the consensus sequence and exploit reference genomes (if available) to improve scaffolding and generate annotations. The protocol is most accessible for bacterial and small eukaryotic genomes (up to 300 Mb), such as those of pathogenic bacteria, malaria parasites and parasitic worms. Applying PAGIT to an E. coli assembly takes ∼24 h: it doubles the average contig size and annotates over 4,300 gene models.
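
The abstract reports improvement in terms of average contig size. As a rough illustration of how such a figure could be checked before and after an improvement run, the sketch below computes the contig count, mean contig length and N50 from FASTA-formatted text. It is not part of PAGIT; the function names (`contig_lengths`, `n50`, `summarize`) and the toy sequences are assumptions made for this example only.

```python
def contig_lengths(fasta_text):
    """Return the length of every sequence in FASTA-formatted text."""
    lengths = []
    current = None  # length of the record being read; None before the first header
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty assembly


def summarize(fasta_text):
    """Collect simple contiguity statistics for one assembly."""
    lengths = contig_lengths(fasta_text)
    mean_bp = sum(lengths) / len(lengths) if lengths else 0
    return {
        "contigs": len(lengths),
        "total_bp": sum(lengths),
        "mean_bp": mean_bp,
        "n50_bp": n50(lengths),
    }


# Hypothetical toy assemblies: four short contigs, then the same bases in two longer ones.
draft = ">c1\nACGT\n>c2\nACGT\n>c3\nAC\n>c4\nAC\n"
improved = ">s1\nACGTACGT\n>s2\nACAC\n"
print(summarize(draft))
print(summarize(improved))
```

Comparing the two summaries for a real draft and its improved version gives a quick check on whether the mean contig length has changed by the expected factor; N50 is included because it is less sensitive to many very short contigs than the mean.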




