Karabiber Fethullah
Department of Computer Engineering, Yildiz Technical University, 34220, Istanbul, Turkey.
J Bioinform Comput Biol. 2013 Oct;11(5):1350011. doi: 10.1142/S021972001350011X. Epub 2013 Jul 29.
Aligning peaks in electropherograms or chromatograms obtained from experimental techniques such as capillary electrophoresis remains a significant challenge. Accurate alignment is critical for correct interpretation of data from various classes of nucleic acid analysis technologies, including conventional DNA sequencing and new RNA structure probing technologies. An automated alignment algorithm based on dynamic programming was developed to align multiple-peak time-series data both globally and locally. The algorithm relies on a new peak similarity measure and on additional features such as time penalties, global constraints, and minimum-similarity scores, and it produces rapid, highly accurate comparisons of complex time-series datasets. As a demonstrative case study, the algorithm was applied to capillary electrophoresis data from a Selective 2'-Hydroxyl Acylation analyzed by Primer Extension (SHAPE) evaluation of RNA secondary structure, where it yielded robust analysis of challenging SHAPE probing data. Experimental results show that the peak alignment algorithm efficiently corrects the retention time variation caused by fluorescent tags on fragments and by differences between capillaries. The tools can be readily adapted for the analysis of other biological datasets in which peak retention times vary.
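The abstract names the ingredients of the method (dynamic programming, a peak similarity measure, time penalties, a global constraint, and a minimum-similarity score) but does not give their definitions. The following is a minimal sketch of how such a global peak alignment could look, not the published algorithm. Everything specific in it is an assumption made for illustration: the Gaussian time term and amplitude ratio in `peak_similarity`, the fixed `gap_penalty`, the index band used as the global constraint, and all parameter values. Peaks are assumed to be `(time, amplitude)` tuples with positive amplitudes.

```python
import math


def peak_similarity(p, q, time_scale=5.0):
    """Hypothetical similarity in (0, 1]; equals 1 for identical peaks.

    The Gaussian term acts as a time penalty; the ratio term penalizes
    amplitude mismatch. The measure used in the paper is not specified here.
    """
    (t1, a1), (t2, a2) = p, q
    time_term = math.exp(-((t1 - t2) / time_scale) ** 2)
    amp_term = min(a1, a2) / max(a1, a2)
    return time_term * amp_term


def align_peaks(ref, sample, gap_penalty=0.2, band=3, min_similarity=0.1):
    """Globally align two peak lists; return matched (ref_idx, sample_idx) pairs."""
    n, m = len(ref), len(sample)
    neg_inf = float("-inf")
    score = [[neg_inf] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    score[0][0] = 0.0

    for i in range(n + 1):
        for j in range(m + 1):
            if i == 0 and j == 0:
                continue
            if abs(i - j) > band:  # global constraint: stay near the diagonal
                continue
            best, step = neg_inf, None
            if i > 0 and j > 0 and score[i - 1][j - 1] > neg_inf:
                s = peak_similarity(ref[i - 1], sample[j - 1])
                if s >= min_similarity:  # reject implausible matches
                    best, step = score[i - 1][j - 1] + s, "match"
            if i > 0 and score[i - 1][j] - gap_penalty > best:
                best, step = score[i - 1][j] - gap_penalty, "skip_ref"
            if j > 0 and score[i][j - 1] - gap_penalty > best:
                best, step = score[i][j - 1] - gap_penalty, "skip_sample"
            score[i][j], move[i][j] = best, step

    if score[n][m] == neg_inf:
        raise ValueError("no alignment exists within the band constraint")

    # Trace back from the bottom-right cell to recover the matched pairs.
    pairs, i, j = [], n, m
    while i > 0 or j > 0:
        step = move[i][j]
        if step == "match":
            pairs.append((i - 1, j - 1))
            i, j = i - 1, j - 1
        elif step == "skip_ref":
            i -= 1
        else:
            j -= 1
    pairs.reverse()
    return pairs
```

In this sketch, a reference trace with peaks near times 10 and 30 aligned against a sample that has an extra peak near time 20 would match the outer peaks and leave the extra one unmatched, because its similarity to either reference peak falls below `min_similarity`. A local variant would additionally allow the alignment to start and end anywhere, as in Smith-Waterman style scoring.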