AGORA: Assembly Guided by Optical Restriction Alignment.

Affiliation

Center for Bioinformatics and Computational Biology, University of Maryland-College Park, College Park, MD, USA.

Publication information

BMC Bioinformatics. 2012 Aug 2;13:189. doi: 10.1186/1471-2105-13-189.

Abstract

BACKGROUND

Genome assembly is difficult due to repeated sequences within the genome, which create ambiguities and cause the final assembly to be broken up into many separate sequences (contigs). Long range linking information, such as mate-pairs or mapping data, is necessary to help assembly software resolve repeats, thereby leading to a more complete reconstruction of genomes. Prior work has used optical maps for validating assemblies and scaffolding contigs, after an initial assembly has been produced. However, optical maps have not previously been used within the genome assembly process. Here, we use optical map information within the popular de Bruijn graph assembly paradigm to eliminate paths in the de Bruijn graph which are not consistent with the optical map and help determine the correct reconstruction of the genome.
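The idea of discarding graph paths that disagree with the optical map can be illustrated with a toy sketch. This is not AGORA's implementation, which the abstract does not describe. The function names (`digest`, `consistent`, `filter_paths`), the 10% relative sizing tolerance, and the example sequences are all assumptions made for illustration. The sketch also assumes an idealized map with no missing or false cut sites, and it places each cut at the first base of the recognition site rather than at the enzyme's true cut position.

```python
from typing import List


def digest(sequence: str, site: str) -> List[int]:
    """In-silico digest: fragment lengths after cutting at every occurrence of `site`.

    Simplification (assumption): the cut is placed at the first base of the site.
    """
    cuts = []
    pos = sequence.find(site)
    while pos != -1:
        cuts.append(pos)
        pos = sequence.find(site, pos + 1)
    bounds = [0] + cuts + [len(sequence)]
    # Drop zero-length fragments (e.g. when the sequence starts with the site).
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def consistent(fragments: List[int], optical_map: List[int], tolerance: float = 0.1) -> bool:
    """True if the fragments match the map one-to-one within a relative sizing error."""
    if len(fragments) != len(optical_map):
        return False
    return all(abs(f - m) <= tolerance * m for f, m in zip(fragments, optical_map))


def filter_paths(paths: List[str], site: str, optical_map: List[int],
                 tolerance: float = 0.1) -> List[str]:
    """Keep only candidate path sequences whose digest agrees with the optical map."""
    return [p for p in paths if consistent(digest(p, site), optical_map, tolerance)]


# Toy example: two candidate traversals that differ only in the order of two segments.
site = "GAATTC"  # EcoRI recognition sequence
seg_a, seg_b, seg_c = "A" * 12, "C" * 30, "T" * 20
path1 = seg_a + site + seg_b + site + seg_c  # fragments: [12, 36, 26]
path2 = seg_a + site + seg_c + site + seg_b  # fragments: [12, 26, 36]

optical_map = [12, 35, 27]  # hypothetical measured fragment sizes, with small sizing error
surviving = filter_paths([path1, path2], site, optical_map)
```

Here only `path1` survives, because the second fragment of `path2` (26) differs from the mapped size (35) by more than the assumed tolerance. A realistic comparison would need an alignment that tolerates missing cuts, false cuts, and size-dependent error.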

RESULTS

We developed a new algorithm called AGORA: Assembly Guided by Optical Restriction Alignment. AGORA is the first algorithm to use optical map information directly within the de Bruijn graph framework to help produce an accurate assembly of a genome that is consistent with the optical map information provided. Our simulations on bacterial genomes show that AGORA is effective at producing assemblies closely matching the reference sequences. Additionally, we show that noise in the optical map can have a strong impact on the final assembly quality for some complex genomes, and we also measure how various characteristics of the starting de Bruijn graph may impact the quality of the final assembly. Lastly, we show that a proper choice of restriction enzyme for the optical map may substantially improve the quality of the final assembly.

CONCLUSIONS

Our work shows that optical maps can be used effectively to assemble genomes within the de Bruijn graph assembly framework. Our experiments also provide insights into the characteristics of the mapping data that most affect the performance of our algorithm, indicating the potential benefit of more accurate optical mapping technologies, such as nano-coding.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/90bc/3431216/6fafe121d2a6/1471-2105-13-189-1.jpg
