Classification of genomic signals using dynamic time warping.

Publication information

BMC Bioinformatics. 2013;14 Suppl 10(Suppl 10):S1. doi: 10.1186/1471-2105-14-S10-S1. Epub 2013 Aug 12.

Abstract

BACKGROUND

DNA classification methods most commonly compare differences between symbolic DNA records, which requires a global multiple sequence alignment. This approach is often inappropriate: it introduces a number of imprecisions and requires additional user intervention to align similar segments exactly. When DNA is represented as a signal, similar segments are characterized by a similar curve shape. Alignment of genomic signals can therefore adjust whole sections, not only individual symbols. Dynamic time warping (DTW) is suitable for this purpose and can replace multiple alignment of symbolic sequences in applications such as phylogenetic analysis.

METHODS

The proposed method has three main parts. The first part converts the symbolic representation of a DNA sequence, a string of A, C, G, and T symbols, into a signal representation: the cumulated phase of the complex components defined for each symbol. The second part adjusts signal size using standard signal preprocessing methods: median filtering, detrending, and resampling. The final part, necessary for comparing genomic signals, aligns their position and length by dynamic time warping (DTW).
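The three parts can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not give the complex value assigned to each nucleotide, so the mapping below (A = 1+j, C = -1-j, G = -1+j, T = 1-j) is one commonly used choice; the filter width, the linear detrending, the linear-interpolation resampling, the absolute-difference local cost, and all function names are likewise assumptions made for the example.

```python
import cmath
import math
import statistics

# ASSUMPTION: complex value per nucleotide; the abstract does not specify it.
COMPLEX_MAP = {"A": 1 + 1j, "C": -1 - 1j, "G": -1 + 1j, "T": 1 - 1j}
PHASE = {symbol: cmath.phase(z) for symbol, z in COMPLEX_MAP.items()}


def cumulated_phase(seq):
    """Part 1: running sum of the per-symbol phases (in radians)."""
    total, signal = 0.0, []
    for symbol in seq.upper():
        total += PHASE[symbol]
        signal.append(total)
    return signal


def median_filter(signal, width=3):
    """Part 2a: sliding median; the window is truncated at the edges."""
    half = width // 2
    return [statistics.median(signal[max(0, i - half): i + half + 1])
            for i in range(len(signal))]


def detrend(signal):
    """Part 2b: subtract the least-squares straight line (assumed linear trend)."""
    n = len(signal)
    if n < 2:
        return [0.0] * n
    mean_x, mean_y = (n - 1) / 2, sum(signal) / n
    sxx = sum((i - mean_x) ** 2 for i in range(n))
    sxy = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(signal))
    slope = sxy / sxx
    return [y - (mean_y + slope * (i - mean_x)) for i, y in enumerate(signal)]


def resample(signal, length):
    """Part 2c: linear interpolation onto `length` evenly spaced points."""
    n = len(signal)
    if n == 1 or length == 1:
        return [signal[0]] * length
    out = []
    for k in range(length):
        pos = k * (n - 1) / (length - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(signal[lo] * (1 - frac) + signal[hi] * frac)
    return out


def dtw_distance(x, y):
    """Part 3: classic O(len(x) * len(y)) DTW with absolute-difference cost."""
    n, m = len(x), len(y)
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            step = abs(x[i - 1] - y[j - 1])
            cost[i][j] = step + min(cost[i - 1][j],      # x advances alone
                                    cost[i][j - 1],      # y advances alone
                                    cost[i - 1][j - 1])  # both advance
    return cost[n][m]


def signal_distance(seq_a, seq_b, length=200):
    """Hypothetical helper chaining the three parts for two DNA strings."""
    prepared = [resample(detrend(median_filter(cumulated_phase(s))), length)
                for s in (seq_a, seq_b)]
    return dtw_distance(*prepared)
```

Calling `signal_distance("ACGTTGCA", "ACGTGCA", length=50)` returns a non-negative number that is zero when the two prepared signals can be warped onto each other exactly. The quadratic DTW table is adequate for short signals; longer genomic signals would need a windowed or otherwise constrained variant.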

RESULTS

DTW was applied to a set of genomic signals and evaluated by constructing a dendrogram using cluster analysis. The resulting tree was compared with a classical phylogenetic tree reconstructed using multiple alignment. The classification of genomic signals using DTW is evolutionarily closer to the phylogeny of the organisms. The method is also more resistant to errors in the sequences and less dependent on the number of input sequences.
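Dendrogram construction from pairwise distances can be illustrated with a small agglomerative clustering sketch. The abstract does not state which linkage rule was used, so average linkage is an assumption here, and the function name and the nested-tuple tree format are hypothetical. The input is any symmetric matrix of pairwise distances, for example DTW distances between genomic signals.

```python
from itertools import combinations


def average_linkage_tree(labels, dist):
    """Build a nested-tuple tree by repeatedly merging the closest clusters.

    labels: list of leaf names; dist: symmetric matrix (list of lists)
    where dist[i][j] is the distance between labels[i] and labels[j].
    ASSUMPTION: average linkage; the source does not name the linkage rule.
    Expects at least one label.
    """
    # Each cluster is (subtree, indices of the leaves it contains).
    clusters = [(label, [i]) for i, label in enumerate(labels)]

    def linkage(a, b):
        # Mean of all pairwise leaf distances between the two clusters.
        return (sum(dist[i][j] for i in a[1] for j in b[1])
                / (len(a[1]) * len(b[1])))

    while len(clusters) > 1:
        a, b = min(combinations(clusters, 2), key=lambda pair: linkage(*pair))
        clusters = [c for c in clusters if c is not a and c is not b]
        clusters.append(((a[0], b[0]), a[1] + b[1]))
    return clusters[0][0]
```

With three leaves where the first two are much closer to each other than to the third, the first two are merged first and the third joins at the root. The resulting tree can then be compared against a tree obtained from multiple sequence alignment.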

CONCLUSIONS

Classification of genomic signals using dynamic time warping is an adequate alternative to phylogenetic analysis based on alignment of symbolic DNA sequences; in addition, it is a robust, quick, and more precise technique.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/61f3/3750471/fe23351e109f/1471-2105-14-S10-S1-1.jpg
