Compositional searching of CpG islands in the human genome.

Author information

Luque-Escamilla Pedro Luis, Martínez-Aroza José, Oliver José L, Gómez-Lopera Juan Francisco, Román-Roldán Ramón

Affiliation

Department of Engineering and Mining Mechanics, University of Jaén, Escuela Politécnica Superior, Campus Las Lagunillas s/n, 23071 Jaén, Spain.

Publication information

Phys Rev E Stat Nonlin Soft Matter Phys. 2005 Jun;71(6 Pt 1):061925. doi: 10.1103/PhysRevE.71.061925. Epub 2005 Jun 29.

Abstract

We report on an entropic edge detector based on the local calculation of the Jensen-Shannon divergence, with application to the search for CpG islands. CpG islands are regions of the genome related to gene expression and cell differentiation, and thus to cancer formation. Searching for CpG islands is a major task in genetics and bioinformatics. Some algorithms proposed in the literature are based on moving statistics in a sliding window, but the window size may greatly influence the results. The local use of the Jensen-Shannon divergence is a completely different strategy: the nucleotide composition inside the islands differs from that of their surroundings, so a statistical distance (the Jensen-Shannon divergence) between the compositions of two adjacent windows may be used as a measure of their dissimilarity. Sliding this double window over the entire sequence allows us to segment it compositionally. Those segments must then be fused into larger ones that satisfy certain identification criteria in order to obtain the definitive results. We find that the local use of the Jensen-Shannon divergence is very suitable for processing DNA sequences when searching for compositionally distinct structures such as CpG islands, as compared to other algorithms in the literature.
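The double-window idea described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the window length, the base-2 logarithm, and the length-proportional weights are all assumptions made here for illustration, and the fusion of segments under identification criteria is not shown. The sketch computes the Jensen-Shannon divergence between the nucleotide compositions of two adjacent windows and slides that pair along the sequence, so that peaks in the resulting profile mark candidate compositional boundaries.

```python
import math
from collections import Counter

ALPHABET = "ACGT"  # assumption: only the four standard nucleotides are counted


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete probability distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def composition(segment):
    """Relative frequencies of A, C, G, T in a segment (other symbols ignored)."""
    counts = Counter(segment)
    total = sum(counts[base] for base in ALPHABET)
    if total == 0:
        return [0.0] * len(ALPHABET)
    return [counts[base] / total for base in ALPHABET]


def jensen_shannon(left, right):
    """Jensen-Shannon divergence between the compositions of two windows.

    Weights are proportional to window length (an assumption of this sketch).
    """
    p, q = composition(left), composition(right)
    w_left = len(left) / (len(left) + len(right))
    w_right = 1.0 - w_left
    mixture = [w_left * a + w_right * b for a, b in zip(p, q)]
    return (shannon_entropy(mixture)
            - w_left * shannon_entropy(p)
            - w_right * shannon_entropy(q))


def divergence_profile(sequence, window):
    """Slide two adjacent windows of equal length along the sequence.

    Returns (position, divergence) pairs, where position is the index of the
    boundary between the left and right windows.
    """
    sequence = sequence.upper()
    return [
        (i, jensen_shannon(sequence[i - window:i], sequence[i:i + window]))
        for i in range(window, len(sequence) - window + 1)
    ]


if __name__ == "__main__":
    # Toy sequence: an AT-rich stretch followed by a CG-rich stretch.
    toy = "AT" * 10 + "CG" * 10
    position, value = max(divergence_profile(toy, window=10), key=lambda t: t[1])
    print(position, value)
```

On the toy sequence the profile peaks at the junction between the two stretches. Real genomic data would need a much longer window and a threshold for deciding which peaks count as boundaries, neither of which is specified in the abstract.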
