Comparative statistics for DNA and protein sequences: single sequence analysis.

Authors

Karlin S, Ghandour G

Publication information

Proc Natl Acad Sci U S A. 1985 Sep;82(17):5800-4. doi: 10.1073/pnas.82.17.5800.

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC390640/

Abstract

Four categories of data representations are used to help interpret structures and similarities of nucleic acid and protein sequences. The statistical significance of the observed relationships revealed by these representations is assessed by a hierarchy of permutation procedures and by comparisons with theoretical random models. Applications are presented for various DNA sequences, including papovaviruses, Epstein-Barr virus, mitochondrial genomes, and several globin and immunoglobulin genes.
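To make the idea of a permutation procedure concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract does not specify the statistics or the levels of the permutation hierarchy, so the example assumes the simplest case, a shuffle that preserves only letter composition, and uses a hypothetical statistic (`longest_run`, the longest stretch of one repeated letter). The function names, the number of shuffles, and the add-one correction are all illustrative choices.

```python
import random


def longest_run(seq: str) -> int:
    """Length of the longest run of a single repeated letter in seq."""
    best = 0
    current = 0
    previous = None
    for letter in seq:
        # Extend the current run, or start a new run of length 1.
        current = current + 1 if letter == previous else 1
        previous = letter
        best = max(best, current)
    return best


def permutation_p_value(seq, statistic, n_shuffles=1000, seed=0):
    """Estimate how often a composition-preserving shuffle of seq
    scores at least as high as the observed sequence.

    Assumption: shuffling individual letters is only the simplest
    possible null model; it keeps letter counts and nothing else.
    """
    rng = random.Random(seed)
    observed = statistic(seq)
    letters = list(seq)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(letters)  # in-place shuffle keeps letter counts fixed
        if statistic("".join(letters)) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


if __name__ == "__main__":
    example = "AAAAAAAACCCCCCCC"  # made-up sequence, not real data
    print(longest_run(example))
    print(permutation_p_value(example, longest_run))
```

A small estimated value suggests that the observed statistic is unusual for sequences of the same composition. A richer null model (for example, one that also preserves dinucleotide or codon frequencies) would require a different shuffling routine.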
