Markov chain analysis finds a significant influence of neighboring bases on the occurrence of a base in eucaryotic nuclear DNA sequences both protein-coding and noncoding.

Author information

Blaisdell B E

Publication information

J Mol Evol. 1984;21(3):278-88. doi: 10.1007/BF02102360.

DOI:10.1007/BF02102360
PMID:6443131
Abstract

Sixty-four eucaryotic nuclear DNA sequences, half of them coding and half noncoding, have been examined as expressions of first-, second-, or third-order Markov chains. Standard statistical tests found that most of the sequences required at least second-order Markov chains for their representation, and some required chains of third order. For all 64 sequences the observed one-step second-order transition count matrices were effective in predicting the two-step transition count matrices, and 56 of 64 were effective in predicting the three-step transition count matrices. The departure from random expectation of the observed first- and second-order transition count matrices meant that a considerable sample of eucaryotic nuclear DNA sequences, both protein coding and noncoding, have significant local structure over subsequences of three to five contiguous bases, and that this structure occurs throughout the total length of the sequence. These results suggested that present DNA sequences may have arisen from the duplication, concatenation, and gradual modification of very early short sequences.
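The order tests mentioned in the abstract can be illustrated with a minimal Python sketch. The abstract only says "standard statistical tests", so the likelihood-ratio (G) statistic used here is an assumption, not necessarily the author's procedure. The function names (`transition_counts`, `g_statistic`, `degrees_of_freedom`) are hypothetical, and the input is assumed to be an uppercase string over A, C, G, T.

```python
from collections import Counter
from math import log

ALPHABET = "ACGT"  # assumed four-letter DNA alphabet


def transition_counts(seq, order):
    """Count (context, next_base) pairs; context is the preceding `order` bases."""
    counts = Counter()
    for i in range(order, len(seq)):
        counts[(seq[i - order:i], seq[i])] += 1
    return counts


def g_statistic(seq, order):
    """Likelihood-ratio statistic for H0 'order - 1 suffices' vs H1 'order is needed'.

    Under H0 the next base depends only on the last order - 1 bases of the
    context, so the oldest context base carries no extra information.
    """
    if order < 1:
        raise ValueError("order must be at least 1")
    full = transition_counts(seq, order)
    context_total = Counter()  # n(context)
    short_next = Counter()     # n(shortened context, next base)
    short_total = Counter()    # n(shortened context)
    for (context, nxt), n in full.items():
        short = context[1:]    # drop the oldest base
        context_total[context] += n
        short_next[(short, nxt)] += n
        short_total[short] += n
    g = 0.0
    for (context, nxt), n in full.items():
        short = context[1:]
        # Expected count if the oldest base were irrelevant given the rest.
        expected = context_total[context] * short_next[(short, nxt)] / short_total[short]
        g += 2.0 * n * log(n / expected)
    return g


def degrees_of_freedom(order, alphabet_size=len(ALPHABET)):
    """Nominal degrees of freedom, assuming every context and base can occur."""
    return alphabet_size ** (order - 1) * (alphabet_size - 1) ** 2


# Usage: a strictly periodic toy sequence is fully described by a first-order chain.
toy = "ACGT" * 50
for k in (1, 2):
    print(k, g_statistic(toy, k), degrees_of_freedom(k))
```

For long sequences the statistic is conventionally compared with a chi-square distribution on the stated degrees of freedom. A large value at order k suggests that a chain of order k - 1 is not enough to represent the sequence.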

Similar articles

1
Markov chain analysis finds a significant influence of neighboring bases on the occurrence of a base in eucaryotic nuclear DNA sequences both protein-coding and noncoding.
J Mol Evol. 1984;21(3):278-88. doi: 10.1007/BF02102360.
2
A measure of the similarity of sets of sequences not requiring sequence alignment.
Proc Natl Acad Sci U S A. 1986 Jul;83(14):5155-9. doi: 10.1073/pnas.83.14.5155.
3
A stochastic analysis of three viral sequences.
Mol Biol Evol. 1992 Jul;9(4):666-77. doi: 10.1093/oxfordjournals.molbev.a040741.
4
A prevalent persistent global nonrandomness that distinguishes coding and non-coding eucaryotic nuclear DNA sequences.
J Mol Evol. 1983;19(2):122-33. doi: 10.1007/BF02300750.
5
A representation of DNA primary sequences by random walk.
Math Biosci. 2007 Sep;209(1):282-91. doi: 10.1016/j.mbs.2006.06.004. Epub 2006 Jun 30.
6
Numerical characterization of DNA sequences based on the k-step Markov chain transition probability.
J Comput Chem. 2006 Nov 30;27(15):1830-42. doi: 10.1002/jcc.20471.
7
A method of estimating from two aligned present-day DNA sequences their ancestral composition and subsequent rates of substitution, possibly different in the two lineages, corrected for multiple and parallel substitutions at the same site.
J Mol Evol. 1985;22(1):69-81. doi: 10.1007/BF02105807.
8
Systematic analysis of coding and noncoding DNA sequences using methods of statistical linguistics.
Phys Rev E Stat Phys Plasmas Fluids Relat Interdiscip Topics. 1995 Sep;52(3):2939-50. doi: 10.1103/physreve.52.2939.
9
Analytical expression of the purine/pyrimidine autocorrelation function after and before random mutations.
Math Biosci. 1994 Sep;123(1):103-25. doi: 10.1016/0025-5564(94)90020-5.
10
Bacterial genomes lacking long-range correlations may not be modeled by low-order Markov chains: the role of mixing statistics and frame shift of neighboring genes.
Comput Biol Chem. 2014 Dec;53 Pt A:15-25. doi: 10.1016/j.compbiolchem.2014.08.005. Epub 2014 Aug 30.

Cited by

1
A New Context Tree Inference Algorithm for Variable Length Markov Chain Model with Applications to Biological Sequence Analyses.
J Comput Biol. 2022 Aug;29(8):839-856. doi: 10.1089/cmb.2021.0604. Epub 2022 Apr 22.
2
MLR-OOD: A Markov Chain Based Likelihood Ratio Method for Out-Of-Distribution Detection of Genomic Sequences.
J Mol Biol. 2022 Aug 15;434(15):167586. doi: 10.1016/j.jmb.2022.167586. Epub 2022 Apr 12.
3
Note on DNA Analysis and Redesigning Using Markov Chain.
Genes (Basel). 2022 Mar 21;13(3):554. doi: 10.3390/genes13030554.
4
Confidence intervals for Markov chain transition probabilities based on next generation sequencing reads data.
Quant Biol. 2020 Jul 13;8(2):143-154. doi: 10.1007/s40484-020-0200-y. Epub 2020 May 25.
5
KIMI: Knockoff Inference for Motif Identification from molecular sequences with controlled false discovery rate.
Bioinformatics. 2021 May 5;37(6):759-766. doi: 10.1093/bioinformatics/btaa912.
6
Alignment-Free Sequence Analysis and Applications.
Annu Rev Biomed Data Sci. 2018 Jul;1:93-114. doi: 10.1146/annurev-biodatasci-080917-013431. Epub 2018 Apr 25.
7
A signal processing method for alignment-free metagenomic binning: multi-resolution genomic binary patterns.
Sci Rep. 2019 Feb 15;9(1):2159. doi: 10.1038/s41598-018-38197-9.
8
Optimal choice of word length when comparing two Markov sequences using a χ²-statistic.
BMC Genomics. 2017 Oct 3;18(Suppl 6):732. doi: 10.1186/s12864-017-4020-z.
9
Inference of Markovian properties of molecular sequences from NGS data and applications to comparative genomics.
Bioinformatics. 2016 Apr 1;32(7):993-1000. doi: 10.1093/bioinformatics/btv395. Epub 2015 Jun 30.
10
New developments of alignment-free sequence comparison: measures, statistics and next-generation sequencing.
Brief Bioinform. 2014 May;15(3):343-53. doi: 10.1093/bib/bbt067. Epub 2013 Sep 23.

References

1
Enzymatic synthesis of deoxyribonucleic acid. XI. Further studies on nearest neighbor base sequences in deoxyribonucleic acids.
J Biol Chem. 1962 Jun;237:1961-7.
2
Enzymatic synthesis of deoxyribonucleic acid. VIII. Frequencies of nearest neighbor base sequences in deoxyribonucleic acid.
J Biol Chem. 1961 Mar;236:864-75.
3
Nucleotide sequences of class-switch recombination region of the mouse immunoglobulin gamma 2b-chain gene.
Gene. 1980 Oct;11(1-2):117-27. doi: 10.1016/0378-1119(80)90092-x.
4
Complete nucleotide sequence of the human delta-globin gene.
Cell. 1980 Oct;21(3):639-46. doi: 10.1016/0092-8674(80)90427-4.
5
Human fetal G gamma- and A gamma-globin genes: complete nucleotide sequences suggest that DNA can be exchanged between these duplicated genes.
Cell. 1980 Oct;21(3):627-38. doi: 10.1016/0092-8674(80)90426-2.
6
Some rules in the ordering of nucleotides in the DNA.
Nucleic Acids Res. 1980 Oct 10;8(19):4545-62. doi: 10.1093/nar/8.19.4545.
7
The structure of a human alpha-globin pseudogene and its relationship to alpha-globin gene duplication.
Cell. 1980 Sep;21(2):537-44. doi: 10.1016/0092-8674(80)90491-2.
8
The evolution of genes: the chicken preproinsulin gene.
Cell. 1980 Jun;20(2):555-66. doi: 10.1016/0092-8674(80)90641-8.
9
The universal dinucleotide asymmetry rules in DNA and the amino acid codon choice.
J Mol Evol. 1981;17(4):237-44. doi: 10.1007/BF01732761.
10
Isolation and sequence of the gene for actin in Saccharomyces cerevisiae.
Proc Natl Acad Sci U S A. 1980 Jul;77(7):3912-6. doi: 10.1073/pnas.77.7.3912.