Escuela de Ingeniería de Sistemas y Computación, Universidad del Valle, Santiago de Cali, Colombia.
BMC Genomics. 2011 Oct 14;12:506. doi: 10.1186/1471-2164-12-506.
Several studies have shown that genomes can be studied via a multifractal formalism. Recently, we used a multifractal approach to study the genetic information content of the Caenorhabditis elegans genome. Here we investigate whether the human genome shows behavior similar to that observed in the nematode.
We report here multifractality in the human genome sequence. This behavior correlates strongly with the presence of Alu elements and, to a lesser extent, with CpG islands and (G+C) content. In contrast, little or no relationship was found for LINE, MIR, MER, and LTR elements or for DNA regions poor in genetic information. Gene functions, clusters of orthologous genes, metabolic pathways, and exons tended to increase in frequency with the range of multifractality, and large gene families were located in genomic regions with varied multifractality. Additionally, a multifractal map and a classification of human chromosomes are proposed.
Based on these findings, we propose a descriptive non-linear model for the structure of the human genome, with some biological implications. This model reveals 1) a multifractal regionalization in which many far-from-equilibrium regions coexist and 2) a non-linear organization with significant molecular and medical genetic implications for understanding the role of Alu elements in genome stability and in the structure of the human genome. Given the role of Alu sequences in gene regulation, genetic diseases, human genetic diversity, adaptation and phylogenetic analyses, these quantifications are especially useful.
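To give a concrete sense of what quantifying multifractality in a DNA sequence can involve, the following is a minimal sketch and not the procedure used in this study, which the abstract does not specify. It assumes a chaos game representation of the sequence and a box-counting estimate of the generalized dimensions D_q; the function names, the corner assignment for A, C, G and T, and the grid levels are all illustrative choices.

```python
import math
from collections import Counter

# Assumed corner assignment for the chaos game representation (illustrative).
CORNERS = {"A": (0.0, 0.0), "C": (0.0, 1.0), "G": (1.0, 1.0), "T": (1.0, 0.0)}


def cgr_points(seq):
    """Map a DNA string to points in the unit square.

    Each base moves the current point halfway toward that base's corner.
    Symbols other than A, C, G, T (for example N) are skipped.
    """
    x, y = 0.5, 0.5
    points = []
    for base in seq.upper():
        if base not in CORNERS:
            continue
        cx, cy = CORNERS[base]
        x, y = (x + cx) / 2.0, (y + cy) / 2.0
        points.append((x, y))
    return points


def renyi_term(points, q, n):
    """Return log(sum p_i**q) / (q - 1) on an n x n grid of boxes.

    For q == 1 the limiting form sum p_i * log(p_i) is used instead.
    """
    counts = Counter(
        (min(int(x * n), n - 1), min(int(y * n), n - 1)) for x, y in points
    )
    total = sum(counts.values())
    probs = [c / total for c in counts.values()]
    if abs(q - 1.0) < 1e-9:
        return sum(p * math.log(p) for p in probs)
    return math.log(sum(p ** q for p in probs)) / (q - 1.0)


def generalized_dimension(points, q, levels=(2, 3, 4, 5)):
    """Estimate D_q as the least-squares slope of renyi_term against log(box size)."""
    xs = [math.log(1.0 / 2 ** k) for k in levels]
    ys = [renyi_term(points, q, 2 ** k) for k in levels]
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    num = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    den = sum((a - mx) ** 2 for a in xs)
    return num / den
```

Under these assumptions, a sequence whose points fill the square uniformly gives D_q close to 2 for every q, whereas a sequence with strongly uneven composition gives D_q values that vary with q. The spread of D_q across q is one common summary of the degree of multifractality.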