pmTM-align: scalable pairwise and multiple structure alignment with Apache Spark and OpenMP.

Affiliations

School of Software Engineering, Huazhong University of Science and Technology, Wuhan, 430074, China.

School of Life Science, Huazhong University of Science and Technology, Wuhan, China.

Publication information

BMC Bioinformatics. 2020 Sep 29;21(1):426. doi: 10.1186/s12859-020-03757-2.

Abstract

BACKGROUND

Structure comparison can provide useful information for identifying functional and evolutionary relationships between proteins. With the dramatic increase of protein structure data in the Protein Data Bank, computation time quickly becomes the bottleneck for large-scale structure comparisons. To handle multiple structure alignment tasks more efficiently, we propose pmTM-align, a parallel protein structure alignment approach based on mTM-align/TM-align. pmTM-align works in two stages: the pairwise structure alignments are handled with Spark, and the phylogenetic tree-based multiple structure alignment runs on a single computer with OpenMP.
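The pairwise stage parallelizes well because n structures give n(n-1)/2 independent comparisons. The following is a minimal sketch of that idea only, not the authors' implementation: a thread pool stands in for Spark, and `toy_pair_score` is a hypothetical placeholder that compares short strings rather than computing a real TM-score. All function names and the sample data are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def toy_pair_score(a, b):
    """Hypothetical stand-in for a pairwise structure alignment score.

    It returns the fraction of positions at which two strings agree.
    A real pipeline would call a structure aligner such as TM-align here.
    """
    matches = sum(x == y for x, y in zip(a, b))
    return matches / max(len(a), len(b))


def all_pairs_scores(structures, workers=4):
    """Score every unordered pair of structures in parallel.

    `structures` maps a name to a toy string. Each pair is an independent
    task, so the pairs can be handed to any pool of workers.
    """
    names = sorted(structures)
    pairs = list(combinations(names, 2))  # n * (n - 1) / 2 tasks

    def score(pair):
        first, second = pair
        return toy_pair_score(structures[first], structures[second])

    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(score, pairs))  # map preserves input order
    return dict(zip(pairs, scores))


# Toy input: made-up secondary-structure-like strings, not real proteins.
structures = {"A": "HHHEEELLL", "B": "HHHEEELLH", "C": "EEEHHHLLL"}
scores = all_pairs_scores(structures)
```

The resulting dictionary of pairwise scores is the kind of input a tree-building step could consume. How pmTM-align actually partitions and schedules the pairs in Spark is not described in the abstract.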

RESULTS

Experiments with the SABmark dataset showed that parallelization, together with data structure optimization, provided considerable speedup for mTM-align. The Spark-based structure alignments achieved near-ideal scalability on large datasets, and the OpenMP-based construction of the phylogenetic tree accelerated the incremental alignment of multiple structures and the metrics computation by a factor of about 2 to 5.
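For reference, speedup and scalability are conventionally quantified as below. The symbols $T_1$ (run time on one worker), $T_p$ (run time on $p$ workers), $S$ and $E$ are introduced here for illustration; the abstract does not state which definitions the authors used.

```latex
S(p) = \frac{T_1}{T_p}, \qquad E(p) = \frac{S(p)}{p} = \frac{T_1}{p \, T_p}
```

Under these definitions, ideal scalability corresponds to $S(p) = p$, that is $E(p) = 1$, and a speedup factor of about 2 to 5 means $T_1 / T_p$ falls roughly in that range.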

CONCLUSIONS

pmTM-align enables scalable pairwise and multiple structure alignment and responds faster on medium- to large-sized input data than existing alignment tools such as mTM-align.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/0de2/7526426/f8fab730c157/12859_2020_3757_Fig1_HTML.jpg