Suppr超能文献

基于 Tuning 迭代空间切片的分块多核代码实现 Nussinov 的 RNA 折叠。

Tuning iteration space slicing based tiled multi-core code implementing Nussinov's RNA folding.

机构信息

West Pomeranian University of Technology, Faculty of Computer Science, Zolnierska 49, Szczecin, 71-210, Poland.

出版信息

BMC Bioinformatics. 2018 Jan 15;19(1):12. doi: 10.1186/s12859-018-2008-6.

Abstract

BACKGROUND

RNA folding is an ongoing compute-intensive task of bioinformatics. Parallelization and improving code locality for this kind of algorithms is one of the most relevant areas in computational biology. Fortunately, RNA secondary structure approaches, such as Nussinov's recurrence, involve mathematical operations over affine control loops whose iteration space can be represented by the polyhedral model. This allows us to apply powerful polyhedral compilation techniques based on the transitive closure of dependence graphs to generate parallel tiled code implementing Nussinov's RNA folding. Such techniques are within the iteration space slicing framework - the transitive dependences are applied to the statement instances of interest to produce valid tiles. The main problem at generating parallel tiled code is defining a proper tile size and tile dimension which impact parallelism degree and code locality.

RESULTS

To choose the best tile size and tile dimension, we first construct parallel parametric tiled code (parameters are variables defining tile size). With this purpose, we first generate two nonparametric tiled codes with different fixed tile sizes but with the same code structure and then derive a general affine model, which describes all integer factors available in expressions of those codes. Using this model and known integer factors present in the mentioned expressions (they define the left-hand side of the model), we find unknown integers in this model for each integer factor available in the same fixed tiled code position and replace in this code expressions, including integer factors, with those including parameters. Then we use this parallel parametric tiled code to implement the well-known tile size selection (TSS) technique, which allows us to discover in a given search space the best tile size and tile dimension maximizing target code performance.

CONCLUSIONS

For a given search space, the presented approach allows us to choose the best tile size and tile dimension in parallel tiled code implementing Nussinov's RNA folding. Experimental results, received on modern Intel multi-core processors, demonstrate that this code outperforms known closely related implementations when the length of RNA strands is bigger than 2500.

摘要

背景

RNA 折叠是生物信息学中一项持续的计算密集型任务。此类算法的并行化和提高代码局部性是计算生物学中最相关的领域之一。幸运的是,RNA 二级结构方法(如 Nussinov 的递归)涉及到对仿射控制循环的数学运算,其迭代空间可以用多面体模型表示。这使得我们能够应用基于依赖图传递闭包的强大多面体编译技术来生成并行平铺代码,以实现 Nussinov 的 RNA 折叠。此类技术属于迭代空间切片框架 - 传递依赖关系应用于感兴趣的语句实例,以生成有效的平铺。在生成并行平铺代码时,主要问题是定义适当的平铺大小和平铺维度,这会影响并行度和代码局部性。

结果

为了选择最佳的平铺大小和平铺维度,我们首先构建了具有并行参数化平铺代码(参数是定义平铺大小的变量)。为此,我们首先生成两个具有不同固定平铺大小但具有相同代码结构的非参数化平铺代码,然后推导出一个通用的仿射模型,该模型描述了那些代码表达式中所有可用的整数因子。使用这个模型和已知的在提到的表达式中存在的整数因子(它们定义了模型的左侧),我们为每个在相同固定平铺代码位置可用的整数因子在这个模型中找到未知的整数,并将这个代码中的表达式(包括整数因子)替换为包含参数的表达式。然后,我们使用这个并行参数化平铺代码来实现著名的平铺大小选择(TSS)技术,该技术允许我们在给定的搜索空间中发现最佳的平铺大小和平铺维度,以最大化目标代码的性能。

结论

对于给定的搜索空间,所提出的方法允许我们在实现 Nussinov 的 RNA 折叠的并行平铺代码中选择最佳的平铺大小和平铺维度。在 RNA 链长度大于 2500 时,在现代 Intel 多核处理器上获得的实验结果表明,与已知的紧密相关的实现相比,该代码具有更好的性能。

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/68f1/5769393/df6010759546/12859_2018_2008_Fig1_HTML.jpg

文献AI研究员

20分钟写一篇综述,助力文献阅读效率提升50倍。

立即体验

用中文搜PubMed

大模型驱动的PubMed中文搜索引擎

马上搜索

文档翻译

学术文献翻译模型,支持多种主流文档格式。

立即体验