Bielińska-Wąż Dorota
Instytut Fizyki, Uniwersytet Mikołaja Kopernika, Grudziądzka 5, 87-100 Toruń, Poland.
J Math Chem. 2011;49(10):2345. doi: 10.1007/s10910-011-9890-8. Epub 2011 Aug 28.
New approaches to detailed similarity/dissimilarity analysis of DNA sequences are formulated. Several corrections are proposed that enrich the information obtainable from alignment methods. The corrections take into account the distributions of the aligned bases along the sequences, which standard alignment methods neglect. As a consequence, different aspects of similarity, for example asymmetry of the gene structure, may be studied either with new similarity measures associated with the four-component spectral representation of DNA sequences or with alignment methods that include the corrections introduced in this paper. The corrections to the alignment methods and the statistical distribution moment-based descriptors derived from the four-component spectral representation are applied to similarity/dissimilarity studies of the β-globin gene across species. These studies are supplemented by detailed similarity studies of the histone H1 and H4 coding sequences. The data are described according to the latest version of the EMBL database. The work also includes a concise review of state-of-the-art graphical representations of DNA sequences.
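As a rough illustration of why the distribution of bases along a sequence can carry information that base composition alone does not, the following Python sketch computes, for each of the four bases, the count, mean position, and variance of the positions at which that base occurs. It is not the four-component spectral representation or the descriptors of the paper, which the abstract does not define; the function name `base_position_moments`, the use of 1-based positions, and the choice of these two moments are assumptions made purely for illustration.

```python
from statistics import fmean, pvariance


def base_position_moments(seq):
    """Return {base: (count, mean_position, variance)} for A, C, G, T.

    Positions are 1-based. Bases that never occur map to (0, None, None).
    Illustrative assumption only: these are not the paper's descriptors.
    """
    positions = {base: [] for base in "ACGT"}
    for index, base in enumerate(seq.upper(), start=1):
        if base in positions:  # silently skip any non-ACGT symbol
            positions[base].append(index)

    moments = {}
    for base, pos in positions.items():
        if pos:
            # Population variance of the positions occupied by this base
            moments[base] = (len(pos), fmean(pos), pvariance(pos))
        else:
            moments[base] = (0, None, None)
    return moments


# Two toy sequences with identical base composition but different layout
s1 = "AACCGGTT"
s2 = "ACGTACGT"
m1 = base_position_moments(s1)
m2 = base_position_moments(s2)
```

Both toy sequences contain each base exactly twice, yet the positional moments differ between them (for example, the occurrences of A are clustered in `s1` and spread out in `s2`), which is the kind of distributional difference a composition-only comparison cannot register.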