Inter-dinucleotide distances in the human genome: an analysis of the whole-genome and protein-coding distributions.

Authors

Bastos Carlos A C, Afreixo Vera, Pinho Armando J, Garcia Sara P, Rodrigues João M O S, Ferreira Paulo J S G

Affiliation

Signal Processing Lab, IEETA, University of Aveiro, 3810-193 Aveiro, Portugal.

Publication

J Integr Bioinform. 2011 Sep 15;8(3):172. doi: 10.2390/biecoll-jib-2011-172.

DOI:10.2390/biecoll-jib-2011-172
PMID:21926435
Abstract

We study the inter-dinucleotide distance distributions in the human genome, both in the whole-genome and protein-coding regions. The inter-dinucleotide distance is defined as the distance to the next occurrence of the same dinucleotide. We consider the 16 sequences of inter-dinucleotide distances and two reading frames. Our results show a period-3 oscillation in the protein-coding inter-dinucleotide distance distributions that is absent from the whole-genome distributions. We also compare the distance distribution of each dinucleotide to a reference distribution, that of a random sequence generated with the same dinucleotide abundances, revealing the CG dinucleotide as the one with the highest cumulative relative error for the first 60 distances. Moreover, the distance distribution of each dinucleotide is compared to the distance distribution of all other dinucleotides using the Kullback-Leibler divergence. We find that the distance distribution of a dinucleotide and that of its reversed complement are very similar, hence, the divergence between them is very small. This is an interesting finding that may give evidence of a stronger parity rule than Chargaff's second parity rule.
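
The definitions in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the choice to scan every sequence position (the paper's two reading frames are not modeled), the truncation at `max_distance`, and the `eps` guard against zero probabilities are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def inter_dinucleotide_distances(seq, dinuc):
    """Distances from each occurrence of `dinuc` to its next occurrence.

    Assumption: every position of `seq` is scanned, so overlapping
    occurrences count (relevant for AA, CC, GG, TT).
    """
    positions = [i for i in range(len(seq) - 1) if seq[i:i + 2] == dinuc]
    return [b - a for a, b in zip(positions, positions[1:])]


def distance_distribution(distances, max_distance):
    """Relative frequency of each distance 1..max_distance (illustrative cutoff)."""
    counts = Counter(d for d in distances if 1 <= d <= max_distance)
    total = sum(counts.values())
    if total == 0:
        return [0.0] * max_distance
    return [counts.get(k, 0) / total for k in range(1, max_distance + 1)]


def kl_divergence(p, q, eps=1e-12):
    """Kullback-Leibler divergence D(p || q) in bits.

    Assumption: zero entries of q are replaced by `eps` to keep the sum finite.
    """
    return sum(pi * log2(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0)


def reverse_complement(s):
    """Reverse complement of a DNA string over the alphabet ACGT."""
    return s.translate(str.maketrans("ACGT", "TGCA"))[::-1]


if __name__ == "__main__":
    seq = "ACGTACGACG"  # toy sequence, not real genomic data
    d_cg = inter_dinucleotide_distances(seq, "CG")
    p = distance_distribution(d_cg, max_distance=4)
    # Compare a dinucleotide with its reverse complement, as in the abstract.
    rc = reverse_complement("AC")
    q = distance_distribution(inter_dinucleotide_distances(seq, rc), max_distance=4)
    print(d_cg, p, rc, kl_divergence(p, q))
```

On real data, each of the 16 dinucleotides would get its own distance distribution, and the divergence would be computed for every ordered pair. A small divergence between a dinucleotide and its reverse complement is the pattern the abstract reports.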

Similar articles

1
Inter-dinucleotide distances in the human genome: an analysis of the whole-genome and protein-coding distributions.
J Integr Bioinform. 2011 Sep 15;8(3):172. doi: 10.2390/biecoll-jib-2011-172.
2
[Analysis, identification and correction of some errors of model refseqs appeared in NCBI Human Gene Database by in silico cloning and experimental verification of novel human genes].
Yi Chuan Xue Bao. 2004 May;31(5):431-43.
3
Features of coding and noncoding sequences based on 3-tuple distributions.
Yi Chuan Xue Bao. 2005 Oct;32(10):1018-26.
4
Sequence context analysis of 8.2 million single nucleotide polymorphisms in the human genome.
Gene. 2006 Feb 1;366(2):316-24. doi: 10.1016/j.gene.2005.08.024. Epub 2005 Nov 28.
5
Variations of the mononucleotide and short oligonucleotide distributions in the genomes of various organisms.
J Theor Biol. 1999 Nov 21;201(2):141-56. doi: 10.1006/jtbi.1999.1019.
6
Comprehensive search for intra- and inter-specific sequence polymorphisms among coding envelope genes of retroviral origin found in the human genome: genes and pseudogenes.
BMC Genomics. 2005 Sep 9;6:117. doi: 10.1186/1471-2164-6-117.
7
Whole genome sequencing.
Methods Mol Biol. 2010;628:215-26. doi: 10.1007/978-1-60327-367-1_12.
8
[Correction of five different types of errors of model REFSEQs appeared in NCBI human gene database only by using two novel human genes C17orf32 and ZNF362].
Yi Chuan Xue Bao. 2004 Apr;31(4):325-34.
9
The breakdown of the word symmetry in the human genome.
J Theor Biol. 2013 Oct 21;335:153-9. doi: 10.1016/j.jtbi.2013.06.032. Epub 2013 Jul 2.
10
Inter-STOP symbol distances for the identification of coding regions.
J Integr Bioinform. 2013 Nov 14;10(3):230. doi: 10.2390/biecoll-jib-2013-230.

Cited by

1
Modulation of host gene expression by the zinc finger antiviral protein.
Proc Natl Acad Sci U S A. 2025 Apr;122(13):e2420819122. doi: 10.1073/pnas.2420819122. Epub 2025 Mar 27.
2
Comparative study of encoded and alignment-based methods for virus taxonomy classification.
Sci Rep. 2023 Oct 31;13(1):18662. doi: 10.1038/s41598-023-45461-0.
3
Profound Non-Randomness in Dinucleotide Arrangements within Ultra-Conserved Non-Coding Elements and the Human Genome.
Biology (Basel). 2023 Aug 12;12(8):1125. doi: 10.3390/biology12081125.
4
Abundant CpG-sequences in human genomes inhibit KIR3DL2-expressing NK cells.
PeerJ. 2021 Nov 5;9:e12258. doi: 10.7717/peerj.12258. eCollection 2021.
5
A Markov chain-based feature extraction method for classification and identification of cancerous DNA sequences.
Bioimpacts. 2021;11(2):87-99. doi: 10.34172/bi.2021.16. Epub 2020 Mar 24.
6
Statistical modelling of CG interdistance across multiple organisms.
BMC Bioinformatics. 2018 Oct 15;19(Suppl 10):355. doi: 10.1186/s12859-018-2303-2.