
The persistence exponent of DNA.

Author information

Douglas Poland

Affiliation

Department of Chemistry, The Johns Hopkins University, Baltimore, MD 21218, USA.

Publication information

Biophys Chem. 2004 Jul 1;110(1-2):59-72. doi: 10.1016/j.bpc.2004.01.003.

Abstract

Using the complete genome of Thermoplasma volcanium as an example, we have examined the distribution functions for the amount of C or G in consecutive, non-overlapping blocks of m bases. We find that these distributions are much broader, by large factors, than those expected for a random distribution of bases. If we plot the widths of the C-G distributions relative to the widths expected for random distributions as a function of the block size used, we obtain a power law with a characteristic exponent. The broadening of the C-G distributions follows from the empirical finding that blocks with a given C-G content tend to be followed by blocks of similar C-G content, which indicates a statistical persistence of composition. The exponent associated with the power law thus measures the strength of persistence in a given DNA. This behavior can be understood using Mandelbrot's model of a fractional Brownian walk. In this model there is a hierarchy of persistence (correlation between blocks) between all parts of the system. The model gives us a way to scale the C-G distributions such that all of these functions collapse onto a master curve. For a fractional Brownian walk, the fractal dimension of the C-G distribution is simply related to the persistence exponent of the power law. The persistence exponent for T. volcanium is found to be gamma = 0.29, while for a 10 million base segment of the human genome we obtain gamma = 0.39, similar to but not identical to the value found for the microbe.
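
The block analysis described in the abstract can be illustrated with a minimal Python sketch. It is not the author's code, and several choices are assumptions made here for illustration: the width of a distribution is taken to be its standard deviation, the random baseline is taken to be the binomial width sqrt(m p (1 - p)) with p the overall C-G fraction in the blocks, and the exponent is estimated by an ordinary least-squares fit in log-log coordinates. The function names are hypothetical.

```python
import math


def block_gc_counts(seq, m):
    """Count C or G bases in consecutive, non-overlapping blocks of m bases.

    Any trailing bases that do not fill a complete block are ignored.
    """
    n_blocks = len(seq) // m
    return [
        sum(1 for base in seq[i * m:(i + 1) * m] if base in "GC")
        for i in range(n_blocks)
    ]


def width_ratio(seq, m):
    """Observed width of the block C-G distribution relative to the random one.

    Assumption: "width" means standard deviation, and the random reference is
    a binomial distribution with the same mean C-G fraction p.
    """
    counts = block_gc_counts(seq, m)
    if not counts:
        raise ValueError("sequence is shorter than one block")
    mean = sum(counts) / len(counts)
    observed_var = sum((c - mean) ** 2 for c in counts) / len(counts)
    p = mean / m
    random_var = m * p * (1.0 - p)
    if random_var == 0.0:
        raise ValueError("sequence contains only C-G or no C-G at all")
    return math.sqrt(observed_var / random_var)


def fit_exponent(seq, block_sizes):
    """Least-squares slope of log(width ratio) against log(block size)."""
    xs = [math.log(m) for m in block_sizes]
    ys = [math.log(width_ratio(seq, m)) for m in block_sizes]
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    denominator = sum((x - x_mean) ** 2 for x in xs)
    return numerator / denominator
```

Under these assumptions, a sequence with independent bases gives a ratio close to 1 at every block size and a fitted slope near 0, while a sequence with persistent composition gives a ratio that grows with m. Calling fit_exponent(genome, [16, 32, 64, 128, 256]) on a sequence string returns the fitted slope; whether that slope matches the gamma reported in the abstract depends on the paper's exact definitions, which are not given here.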

