Integrating genome assemblies with MAIA.

Author information

Department of Mediamatics, Delft University of Technology, Delft, The Netherlands.

Publication information

Bioinformatics. 2010 Sep 15;26(18):i433-9. doi: 10.1093/bioinformatics/btq366.

Abstract

MOTIVATION

De novo assembly of a eukaryotic genome from next-generation sequencing data is still a challenging task. Over the past few years several assemblers have been developed, each often suited to one specific type of sequencing data. The number of known genomes is expanding rapidly, which makes it possible to use multiple reference genomes in assembly projects. We introduce an assembly integrator that makes use of all available data, i.e. multiple de novo assemblies and mappings against multiple related genomes, by optimizing a weighted combination of criteria.
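The abstract does not specify which criteria are combined or how they are weighted. Purely as an illustration of what "a weighted combination of criteria" can look like, the following minimal Python sketch scores hypothetical candidate contig joins by a weighted sum of evidence values and picks the highest-scoring one. The class `CandidateJoin`, the criterion names `de_novo_overlap` and `reference_support`, and all numbers are assumptions made for this example; they are not taken from MAIA, which is distributed as a Matlab package.

```python
from dataclasses import dataclass, field


@dataclass
class CandidateJoin:
    """A hypothetical candidate adjacency between two contigs."""
    left: str
    right: str
    # Hypothetical evidence values per criterion, assumed to lie in [0, 1].
    evidence: dict = field(default_factory=dict)


def weighted_score(candidate, weights):
    """Weighted sum of the evidence values; a missing criterion counts as 0."""
    return sum(w * candidate.evidence.get(name, 0.0)
               for name, w in weights.items())


def best_join(candidates, weights):
    """Return the candidate with the highest weighted score."""
    return max(candidates, key=lambda c: weighted_score(c, weights))


# Illustrative data only: two competing joins for contig "c1".
candidates = [
    CandidateJoin("c1", "c2", {"de_novo_overlap": 0.9, "reference_support": 0.2}),
    CandidateJoin("c1", "c3", {"de_novo_overlap": 0.4, "reference_support": 1.0}),
]
equal_weights = {"de_novo_overlap": 0.5, "reference_support": 0.5}
chosen = best_join(candidates, equal_weights)
print(chosen.left, "->", chosen.right)
```

In this toy setup the choice depends on the weights: with equal weights the join supported by the reference mapping wins, whereas putting all weight on `de_novo_overlap` selects the other join. This only shows why the weighting matters when different sources of evidence disagree; it is not a description of the published algorithm.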

RESULTS

The algorithm was applied to the de novo sequencing of the Saccharomyces cerevisiae CEN.PK 113-7D strain. Using Solexa and 454 read data, two de novo and three comparative assemblies were constructed and subsequently integrated, yielding 29 contigs that cover more than 12 Mbp, a drastic improvement over the individual assemblies.

AVAILABILITY

MAIA is available as a Matlab package and can be downloaded from http://bioinformatics.tudelft.nl.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7299/2935414/a13e380eb05b/btq366f1.jpg
