

The breakdown of the word symmetry in the human genome.

Affiliation

CIDMA - Center for Research and Development in Mathematics and Applications, Department of Mathematics, University of Aveiro, 3810-193 Aveiro, Portugal.

Publication information

J Theor Biol. 2013 Oct 21;335:153-9. doi: 10.1016/j.jtbi.2013.06.032. Epub 2013 Jul 2.

Abstract

Previous studies have suggested that Chargaff's second rule may hold for relatively long words (above 10 nucleotides), but this has not been conclusively shown. In particular, the following questions remain open: Is the phenomenon of symmetry statistically significant? If so, what is the word length above which significance is lost? Can deviations in symmetry due to the finite size of the data be identified? This work addresses these questions by studying word symmetries in the human genome, chromosomes and transcriptome. To rule out finite-length effects, the results are compared with those obtained from random control sequences built to satisfy Chargaff's second parity rule. We use several techniques to evaluate the phenomenon of symmetry, including Pearson's correlation coefficient, total variational distance, a novel word symmetry distance, as well as traditional and equivalence statistical tests. We conclude that word symmetries are statistically significant in the human genome for word lengths up to 6 nucleotides. For longer words, we present evidence that the phenomenon may not be as prevalent as previously thought.
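As a rough illustration of two of the measures named in the abstract, the sketch below counts overlapping words of length k in a single strand and compares the frequency of each word with the frequency of its reverse complement, using Pearson's correlation coefficient and a total variation distance. This is not the authors' code: the function names, the handling of non-ACGT symbols, and the exact definition of the distance (half the sum of absolute frequency differences) are assumptions made for the example, and the paper's novel word symmetry distance and statistical tests are not reproduced here.

```python
from collections import Counter
from itertools import product
from math import sqrt

# Translation table for complementing a DNA word (assumes upper-case ACGT).
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(word):
    """Return the reverse complement of a DNA word."""
    return word.translate(COMPLEMENT)[::-1]


def word_counts(seq, k):
    """Count overlapping words of length k, skipping words with non-ACGT symbols."""
    alphabet = set("ACGT")
    counts = Counter()
    for i in range(len(seq) - k + 1):
        word = seq[i:i + k]
        if set(word) <= alphabet:
            counts[word] += 1
    return counts


def symmetry_measures(seq, k):
    """Compare word frequencies with reverse-complement word frequencies.

    Returns (pearson_r, total_variation_distance). Both definitions are
    illustrative assumptions, not taken from the paper.
    """
    counts = word_counts(seq.upper(), k)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no valid words of length k")

    words = ["".join(p) for p in product("ACGT", repeat=k)]
    x = [counts[w] / total for w in words]
    y = [counts[reverse_complement(w)] / total for w in words]

    # Pearson's correlation coefficient between the two frequency vectors.
    n = len(words)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x > 0 and var_y > 0:
        r = cov / sqrt(var_x * var_y)
    else:
        r = float("nan")  # undefined when all frequencies are equal

    # Total variation distance between the two frequency distributions.
    tvd = 0.5 * sum(abs(a - b) for a, b in zip(x, y))
    return r, tvd
```

A sequence concatenated with its own reverse complement is exactly symmetric, so it should give a correlation of 1 and a distance of 0 for any k, whereas a homopolymer such as "AAAAAA" is maximally asymmetric at k = 1. Real genomic data would fall between these extremes, which is why the abstract compares against random control sequences rather than against perfect symmetry.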

