Dipartimento di Matematica e Applicazioni, Università di Milano-Bicocca, 20125, Milano, Italy.
Dipartimento di Informatica, Università di Bologna, 40126, Bologna, Italy.
Sci Rep. 2018 Oct 25;8(1):15817. doi: 10.1038/s41598-018-34136-w.
Biologists have long sought a way to explain how statistical properties of genetic sequences emerged and are maintained through evolution. On the one hand, non-random structures at different scales indicate a complex genome organisation. On the other hand, single-strand symmetry has been scrutinised using neutral models in which correlations are either not considered or are irrelevant, contrary to empirical evidence. Different studies investigated these two statistical features separately, reaching minimal consensus despite sustained efforts. Here we unravel previously unknown symmetries in genetic sequences, which are organised hierarchically through scales in which non-random structures are known to be present. These observations are confirmed through the statistical analysis of the human genome and explained through a simple domain model. These results suggest that domain models which account for the cumulative action of mobile elements can explain simultaneously non-random structures and symmetries in genetic sequences.
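As an illustration only, single-strand symmetry is commonly understood as Chargaff's second parity rule: on one strand, a short word and its reverse complement occur with nearly equal frequency. The sketch below is not the authors' analysis. It assumes that reading of the term, and the function names (`reverse_complement`, `kmer_counts`, `symmetry_gap`) and the normalisation of the gap to the range 0 to 1 are hypothetical choices made for this example.

```python
from collections import Counter

# Translation table for the Watson-Crick complement; other symbols
# (for example N) are left unchanged by str.translate.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word: str) -> str:
    """Return the reverse complement of a DNA word."""
    return word.translate(COMPLEMENT)[::-1]


def kmer_counts(seq: str, k: int) -> Counter:
    """Count all overlapping words of length k on a single strand."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def symmetry_gap(seq: str, k: int) -> float:
    """Hypothetical symmetry measure for this sketch.

    Sums |count(w) - count(reverse_complement(w))| over all words w that
    occur directly or as a reverse complement, then divides by twice the
    number of k-mers. 0.0 means perfectly symmetric counts; 1.0 means no
    observed word ever appears as a reverse complement.
    """
    counts = kmer_counts(seq, k)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    words = set(counts) | {reverse_complement(w) for w in counts}
    diff = sum(abs(counts[w] - counts[reverse_complement(w)]) for w in words)
    return diff / (2 * total)
```

On a real chromosome one would expect `symmetry_gap` to be small for short words if the symmetry holds. The hierarchical symmetries described in the abstract concern pairs of words at a given separation and are not captured by this single-word count.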