Nullomers and High Order Nullomers in Genomic Sequences.

Author information

Vergni Davide, Santoni Daniele

Affiliations

Istituto per le Applicazioni del Calcolo "Mauro Picone" - CNR, Via dei Taurini 19, 00185, Rome, Italy.

Istituto di Analisi dei Sistemi ed Informatica "Antonio Ruberti" - CNR, Via dei Taurini 19, 00185, Rome, Italy.

Publication information

PLoS One. 2016 Dec 1;11(12):e0164540. doi: 10.1371/journal.pone.0164540. eCollection 2016.

Abstract

A nullomer is an oligomer that does not occur as a subsequence in a given DNA sequence, i.e., it is an absent word of that sequence. The importance of nullomers in several applications, from drug discovery to forensic practice, is currently debated in the literature. Here, we investigated the nature of nullomers, asking whether their absence from genomes has a purely statistical explanation or is a peculiar feature of genomic sequences. We introduced an extension of the notion of nullomer, namely high order nullomers, which are nullomers whose mutated sequences are still nullomers. We studied several aspects of them: comparison with nullomers of random sequences, CpG distribution, and mean helical rise. In agreement with previous results, we found that the number of nullomers in the human genome is much larger than expected by chance. Nevertheless, opposite results were found when considering a random DNA sequence that preserves dinucleotide frequencies. The analysis of CpG frequencies in nullomers and high order nullomers revealed, as expected, a high CpG content, but it also highlighted a strong dependence of CpG frequencies on the dinucleotide position, suggesting that nullomers have their own peculiar structure and are not simply sequences whose CpG frequency is biased. Furthermore, phylogenetic trees were built for eleven species based on both the similarity between dinucleotide frequencies and the number of nullomers that two species share, showing that nullomers are fairly conserved among closely related species. Finally, the study of the mean helical rise of nullomer sequences revealed significantly high mean rise values, reinforcing the hypothesis that those sequences have some peculiar structural features. The results show that nullomers are a consequence of the peculiar structure of DNA (including biased CpG frequency and CpG islands), so that the hypermutability model, even when CpG islands are taken into account, does not seem sufficient to explain the nullomer phenomenon. Finally, high order nullomers could emphasize those features that already make simple nullomers useful in several applications.
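
The two definitions in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, and it rests on assumptions the abstract does not spell out: "mutated sequences" is read here as all single-base substitutions of a word, only the given strand is scanned (no reverse complement), and all 4^k words are enumerated by brute force, which is only practical for small k. The function names are hypothetical.

```python
from itertools import product

ALPHABET = "ACGT"


def observed_kmers(sequence, k):
    """Return the set of length-k words over ACGT that occur in the sequence."""
    seq = sequence.upper()
    valid = set(ALPHABET)
    return {
        seq[i:i + k]
        for i in range(len(seq) - k + 1)
        if set(seq[i:i + k]) <= valid  # skip windows containing N or other symbols
    }


def nullomers(sequence, k):
    """Return all length-k words that never occur in the sequence."""
    all_words = {"".join(p) for p in product(ALPHABET, repeat=k)}
    return all_words - observed_kmers(sequence, k)


def single_mutants(word):
    """Yield every word that differs from `word` by exactly one substitution."""
    for i, base in enumerate(word):
        for alt in ALPHABET:
            if alt != base:
                yield word[:i] + alt + word[i + 1:]


def high_order_nullomers(sequence, k):
    """Return nullomers whose single-substitution variants are all nullomers too.

    Assumption: this reads "mutated sequences" as single-base substitutions only.
    """
    absent = nullomers(sequence, k)
    return {w for w in absent if all(m in absent for m in single_mutants(w))}


if __name__ == "__main__":
    toy = "AAAA"  # toy sequence, not genomic data
    print(sorted(nullomers(toy, 2)))
    print(sorted(high_order_nullomers(toy, 2)))
```

On the toy sequence `AAAA` with k = 2, the only observed word is `AA`, so 15 words are nullomers, but only the 9 words that contain no `A` are high order nullomers under this reading, because every other nullomer is one substitution away from `AA`.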




