IMOS: improved Meta-aligner and Minimap2 On Spark.

Affiliation

Department of Computer Engineering, Sharif University of Technology, Azadi, Tehran, Iran.

Publication information

BMC Bioinformatics. 2019 Jan 24;20(1):51. doi: 10.1186/s12859-018-2592-5.

Abstract

BACKGROUND

Long reads provide valuable information regarding the sequence composition of genomes. However, they are usually very noisy, which makes aligning them to a reference genome a daunting task. On a single node, it may take days to process a dataset large enough to sequence a human genome. Hence, it is of primary importance to have an aligner that can run on distributed clusters of computers while delivering high accuracy and speed.

RESULTS

In this paper, we present IMOS, an aligner for mapping noisy long reads to a reference genome. It can be used on a single node as well as on distributed nodes. In its single-node mode, IMOS is an Improved version of Meta-aligner (IM) that enhances both accuracy and speed; IM is up to 6x faster than the original Meta-aligner. IMOS is also implemented to run IM and Minimap2 on Apache Spark for deployment on a cluster of nodes. Moreover, multi-node IMOS is faster than SparkBWA when executing both IM (1.5x) and Minimap2 (25x).
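
The multi-node mode described above follows a scatter/gather pattern: reads are split into chunks, each chunk is aligned independently, and the partial results are merged. The following is a minimal sketch of that pattern only, not the IMOS implementation. It uses Python threads as a stand-in for Spark executors, and `parse_fastq`, `split_into_chunks`, `align_chunk`, and `distributed_align` are hypothetical names. The placeholder aligner does exact substring search, which a real long-read aligner such as IM or Minimap2 does not rely on, since noisy reads rarely match exactly.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List, Tuple

Read = Tuple[str, str]  # (read name, sequence)


def parse_fastq(lines: List[str]) -> Iterator[Read]:
    """Yield (name, sequence) pairs from 4-line FASTQ records."""
    for i in range(0, len(lines) - 3, 4):
        name = lines[i][1:].strip()      # drop the leading '@'
        sequence = lines[i + 1].strip()
        yield name, sequence


def split_into_chunks(reads: List[Read], n_chunks: int) -> List[List[Read]]:
    """Round-robin split so that chunks have near-equal size."""
    return [reads[i::n_chunks] for i in range(n_chunks)]


def align_chunk(chunk: List[Read], reference: str) -> List[Tuple[str, int]]:
    """Placeholder aligner (assumption): exact match position, or -1."""
    return [(name, reference.find(seq)) for name, seq in chunk]


def distributed_align(reads: List[Read], reference: str,
                      n_workers: int) -> List[Tuple[str, int]]:
    """Scatter chunks to workers, align each one, then gather the hits."""
    chunks = split_into_chunks(reads, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partial = list(pool.map(lambda c: align_chunk(c, reference), chunks))
    merged = [hit for part in partial for hit in part]
    return sorted(merged)  # sort by read name for a deterministic result
```

Because each chunk is aligned without reference to any other chunk, the work divides cleanly across workers, which is the property that lets this kind of design scale with the number of nodes.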

CONCLUSION

In this paper, we proposed an architecture for mapping long reads to a reference. Owing to its implementation, the speed of IMOS can increase almost linearly with the number of nodes in a cluster. It is also a multi-platform application that runs on Linux, Windows, and macOS.
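
To make "almost linearly" concrete, scaling is commonly summarized by the speedup S(n) = T(1) / T(n) and the parallel efficiency E(n) = S(n) / n, where T(n) is the wall-clock time on n nodes. Perfectly linear scaling gives E(n) = 1. The sketch below computes both quantities from a table of timings. The function names are hypothetical, and the abstract does not state that the authors used exactly these metrics.

```python
from typing import Dict, Tuple


def speedup(t_single: float, t_parallel: float) -> float:
    """Speedup S = T(1) / T(n)."""
    if t_single <= 0 or t_parallel <= 0:
        raise ValueError("timings must be positive")
    return t_single / t_parallel


def efficiency(t_single: float, t_parallel: float, n_nodes: int) -> float:
    """Parallel efficiency E = S / n; 1.0 means perfectly linear scaling."""
    if n_nodes < 1:
        raise ValueError("n_nodes must be at least 1")
    return speedup(t_single, t_parallel) / n_nodes


def scaling_table(timings: Dict[int, float]) -> Dict[int, Tuple[float, float]]:
    """Map node count -> (speedup, efficiency), relative to the 1-node run."""
    t1 = timings[1]
    return {
        n: (speedup(t1, t), efficiency(t1, t, n))
        for n, t in sorted(timings.items())
    }
```

For example, with made-up timings (not measurements from the paper) of 100 s on one node and 40 s on four nodes, `scaling_table({1: 100.0, 4: 40.0})` reports a speedup of 2.5 and an efficiency of 0.625 for the four-node run, which would fall well short of linear scaling.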

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/378f/6345043/06a3ee553136/12859_2018_2592_Fig1_HTML.jpg
