IEEE/ACM Trans Comput Biol Bioinform. 2018 Sep-Oct;15(5):1611-1624. doi: 10.1109/TCBB.2017.2731339. Epub 2017 Jul 24.
Multifractal analysis has made it possible to quantify genetic variability and non-linear stability along the human genome sequence. It has implications for explaining several genetic diseases caused by chromosome abnormalities, among other genetic particularities. The multifractal analysis of a genome is carried out by dividing the complete DNA sequence into smaller fragments and calculating the generalized dimension spectrum of each fragment using the chaos game representation and the box-counting method. This is a time-consuming process because it involves processing large data sets in floating-point representation. To reduce the computation time, we designed an application-specific processor, here called the multifractal processor, which is based on our proposed hardware-oriented algorithm for efficiently calculating the generalized dimension spectrum of DNA sequences. The multifractal processor was implemented on a low-cost SoC-FPGA and was verified by processing a complete human genome. Its execution time and numeric results were compared with those of a software implementation executed on a 20-core workstation, achieving a speedup of 2.6x and an average error of 0.0003 percent.
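The pipeline named in the abstract (chaos game representation followed by box counting) can be illustrated with a minimal floating-point sketch in Python. This is not the authors' hardware-oriented algorithm, which the abstract does not detail. The corner assignment for A, C, G, T, the grid levels `ks`, and all function names are assumptions chosen for illustration. For each fragment, the sketch estimates D_q as the least-squares slope of log(sum of p_i^q)/(q - 1) against log(eps), where p_i is the fraction of CGR points in box i and eps = 2^-k is the box side.

```python
import math
from collections import Counter

# Assumed corner assignment (one common CGR convention, not taken from the paper).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Map a DNA string to chaos game representation points in the unit square."""
    x, y = 0.5, 0.5  # start at the centre of the square
    points = []
    for base in seq.upper():
        corner = CORNERS.get(base)
        if corner is None:  # skip N and other ambiguous symbols
            continue
        # Move halfway from the current point to the corner of this base.
        x = (x + corner[0]) / 2.0
        y = (y + corner[1]) / 2.0
        points.append((x, y))
    return points


def box_probabilities(points, k):
    """Fraction of points in each occupied box of a 2^k x 2^k grid."""
    n = 1 << k
    counts = Counter(
        (min(int(x * n), n - 1), min(int(y * n), n - 1)) for x, y in points
    )
    total = len(points)
    return [c / total for c in counts.values()]


def least_squares_slope(xs, ys):
    """Slope of the ordinary least-squares line through (xs, ys)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    den = sum((x - mx) ** 2 for x in xs)
    return num / den


def generalized_dimension(points, q, ks=(1, 2, 3, 4)):
    """Estimate D_q by box counting over the grid levels in ks (assumed range)."""
    xs, ys = [], []
    for k in ks:
        probs = box_probabilities(points, k)
        xs.append(math.log(2.0 ** -k))  # log of the box side eps
        if abs(q - 1.0) < 1e-9:
            # q = 1 is the limiting case: the information dimension.
            ys.append(sum(p * math.log(p) for p in probs))
        else:
            ys.append(math.log(sum(p ** q for p in probs)) / (q - 1.0))
    return least_squares_slope(xs, ys)


def dimension_spectrum(seq, qs=range(-5, 6)):
    """Generalized dimension spectrum {q: D_q} for one sequence fragment."""
    points = cgr_points(seq)
    return {q: generalized_dimension(points, q) for q in qs}
```

Calling `dimension_spectrum(fragment)` on each fragment of a chromosome would give one spectrum per fragment. The repeated logarithms and powers over many fragments are the kind of floating-point workload the abstract describes as time-consuming.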