
AGORA: Assembly Guided by Optical Restriction Alignment.

Affiliation

Center for Bioinformatics and Computational Biology, University of Maryland-College Park, College Park, MD, USA.

Publication information

BMC Bioinformatics. 2012 Aug 2;13:189. doi: 10.1186/1471-2105-13-189.

DOI:10.1186/1471-2105-13-189
PMID:22856673
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3431216/
Abstract

BACKGROUND

Genome assembly is difficult due to repeated sequences within the genome, which create ambiguities and cause the final assembly to be broken up into many separate sequences (contigs). Long range linking information, such as mate-pairs or mapping data, is necessary to help assembly software resolve repeats, thereby leading to a more complete reconstruction of genomes. Prior work has used optical maps for validating assemblies and scaffolding contigs, after an initial assembly has been produced. However, optical maps have not previously been used within the genome assembly process. Here, we use optical map information within the popular de Bruijn graph assembly paradigm to eliminate paths in the de Bruijn graph which are not consistent with the optical map and help determine the correct reconstruction of the genome.
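To make the repeat ambiguity concrete, here is a minimal sketch of a de Bruijn graph built from a toy sequence. It is an illustration only, not AGORA's implementation: the function names `de_bruijn` and `branching_nodes` and the toy sequence are assumptions introduced for this example.

```python
from collections import defaultdict


def de_bruijn(seq: str, k: int) -> dict[str, set[str]]:
    """Build a de Bruijn graph: nodes are (k-1)-mers, each k-mer adds one edge."""
    graph: defaultdict[str, set[str]] = defaultdict(set)
    for i in range(len(seq) - k + 1):
        kmer = seq[i:i + k]
        graph[kmer[:-1]].add(kmer[1:])  # edge from prefix to suffix
    return dict(graph)


def branching_nodes(graph: dict[str, set[str]]) -> list[str]:
    """Return nodes with more than one successor, i.e. points of ambiguity."""
    return sorted(node for node, succ in graph.items() if len(succ) > 1)


# Hypothetical toy sequence in which the 3-mer "TGC" occurs twice.
toy_genome = "ATGCGTGCA"
graph = de_bruijn(toy_genome, k=3)
print(branching_nodes(graph))
```

The repeated 3-mer makes node `GC` branch to both `CG` and `CA`, so the graph alone cannot say which continuation comes first. Long-range information such as an optical map is what lets an assembler discard the traversals that do not fit.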

RESULTS

We developed a new algorithm called AGORA: Assembly Guided by Optical Restriction Alignment. AGORA is the first algorithm to use optical map information directly within the de Bruijn graph framework to help produce an accurate assembly of a genome that is consistent with the optical map information provided. Our simulations on bacterial genomes show that AGORA is effective at producing assemblies closely matching the reference sequences. Additionally, we show that noise in the optical map can have a strong impact on the final assembly quality for some complex genomes, and we also measure how various characteristics of the starting de Bruijn graph may impact the quality of the final assembly. Lastly, we show that a proper choice of restriction enzyme for the optical map may substantially improve the quality of the final assembly.
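The idea of keeping only reconstructions that agree with the optical map can be sketched as follows. This is a simplified illustration under stated assumptions, not the AGORA algorithm itself: the helpers `digest`, `consistent`, and `filter_candidates` are hypothetical, the cut is placed at the start of each recognition site, and the only noise modeled is a relative sizing error (missing or spurious cut sites are ignored).

```python
def digest(seq: str, site: str) -> list[int]:
    """In silico digest: fragment lengths when cutting at the start of each site."""
    cuts = []
    pos = seq.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = seq.find(site, pos + 1)
    bounds = [0] + cuts + [len(seq)]
    # Drop zero-length fragments (e.g. a site at position 0).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def consistent(fragments: list[int], optical_map: list[int], rel_tol: float) -> bool:
    """True if every fragment is within rel_tol of the matching map fragment."""
    if len(fragments) != len(optical_map):
        return False
    return all(abs(f - m) <= rel_tol * m for f, m in zip(fragments, optical_map))


def filter_candidates(candidates, site, optical_map, rel_tol=0.1):
    """Keep candidate sequences whose digest agrees with the optical map."""
    return [s for s in candidates if consistent(digest(s, site), optical_map, rel_tol)]


# Two hypothetical reconstructions that differ in the order of internal segments.
site = "GAATTC"  # EcoRI recognition sequence
cand_a = "AAAA" + site + "CCCCCCCC" + site + "TT"          # fragments 4, 14, 8
cand_b = "AAAA" + site + "CC" + site + "CCCCCC" + "TT"     # fragments 4, 8, 14
optical_map = [4, 14, 8]  # assumed, idealized fragment sizes in base pairs

kept = filter_candidates([cand_a, cand_b], site, optical_map)
print(kept == [cand_a])
```

Both candidates have the same length and the same number of cut sites, yet only one matches the ordered fragment sizes. Loosening `rel_tol` far enough would let both pass, which gives a rough feel for why map noise and the choice of enzyme (which determines where the cuts fall) matter for how well the map discriminates between paths.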

CONCLUSIONS

Our work shows that optical maps can be used effectively to assemble genomes within the de Bruijn graph assembly framework. Our experiments also provide insights into the characteristics of the mapping data that most affect the performance of our algorithm, indicating the potential benefit of more accurate optical mapping technologies, such as nano-coding.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/90bc/3431216/6fafe121d2a6/1471-2105-13-189-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/90bc/3431216/7f1dc2bc6673/1471-2105-13-189-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/90bc/3431216/706985f836db/1471-2105-13-189-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/90bc/3431216/8f449cdc8b8b/1471-2105-13-189-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/90bc/3431216/f31a241bc570/1471-2105-13-189-5.jpg