Applications and statistics for multiple high-scoring segments in molecular sequences.

Authors

Karlin S, Altschul S F

Affiliation

Department of Mathematics, Stanford University, CA 94305.

Publication details

Proc Natl Acad Sci U S A. 1993 Jun 15;90(12):5873-7. doi: 10.1073/pnas.90.12.5873.

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC46825/

Abstract

Score-based measures of molecular-sequence features provide versatile aids for the study of proteins and DNA. They are used by many sequence data base search programs, as well as for identifying distinctive properties of single sequences. For any such measure, it is important to know what can be expected to occur purely by chance. The statistical distribution of high-scoring segments has been described elsewhere. However, molecular sequences will frequently yield several high-scoring segments for which some combined assessment is in order. This paper describes the statistical distribution for the sum of the scores of multiple high-scoring segments and illustrates its application to the identification of possible transmembrane segments and the evaluation of sequence similarity.
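As a rough illustration of the kind of chance assessment the abstract refers to, the sketch below computes (a) the approximate probability that at least one segment reaches a given score and (b) an asymptotic tail probability for the normalized sum of several segment scores. The formulas are the commonly quoted Karlin-Altschul forms and are not stated in this abstract, so they should be checked against the paper before use. The function names, the `search_space` argument, and the values of `lam` and `K` are assumptions for illustration; in practice `lam` and `K` depend on the scoring scheme and the residue composition.

```python
import math


def segment_p_value(score, search_space, lam, K):
    """Approximate P(at least one segment scores >= score) by chance.

    Uses P ~ 1 - exp(-E) with E = K * N * exp(-lam * score), where N is the
    search space (e.g. the product of two sequence lengths in a comparison).
    """
    expected = K * search_space * math.exp(-lam * score)
    # expm1 keeps precision when `expected` is very small.
    return -math.expm1(-expected)


def sum_tail_probability(scores, search_space, lam, K):
    """Asymptotic tail probability for the sum of r normalized segment scores.

    Each score S is normalized to lam * S - ln(K * N); for the sum t of the
    r normalized scores, the tail is approximated by
    exp(-t) * t**(r - 1) / (r! * (r - 1)!).
    This is a large-t approximation (assumed form), not an exact distribution.
    """
    if not scores:
        raise ValueError("at least one segment score is required")
    r = len(scores)
    log_kn = math.log(K * search_space)
    t = sum(lam * s - log_kn for s in scores)
    return math.exp(-t) * t ** (r - 1) / (math.factorial(r) * math.factorial(r - 1))


if __name__ == "__main__":
    # Hypothetical parameters: lam and K are placeholders, not fitted values.
    lam, K = 0.3, 0.1
    n_space = 250 * 4000  # e.g. a 250-residue query against a 4000-residue sequence
    print("single segment:", segment_p_value(50, n_space, lam, K))
    print("three segments:", sum_tail_probability([50, 46, 44], n_space, lam, K))
```

With a single segment (r = 1) the sum formula reduces to E itself, which is close to the single-segment probability whenever E is small; that gives a quick consistency check between the two functions.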

Similar articles

1

Applications and statistics for multiple high-scoring segments in molecular sequences.

Proc Natl Acad Sci U S A. 1993 Jun 15;90(12):5873-7. doi: 10.1073/pnas.90.12.5873.

2

The interaction of bride of sevenless with sevenless is conserved between Drosophila virilis and Drosophila melanogaster.

Proc Natl Acad Sci U S A. 1993 Jun 1;90(11):5047-51. doi: 10.1073/pnas.90.11.5047.

3

Molecular identification of a G protein-coupled receptor family which is expressed in planarians.

Gene. 1997 Aug 11;195(1):55-61. doi: 10.1016/s0378-1119(97)00152-2.

4

Statistical studies of biomolecular sequences: score-based methods.

Philos Trans R Soc Lond B Biol Sci. 1994 Jun 29;344(1310):391-402. doi: 10.1098/rstb.1994.0078.

5

sevenless: Seven found?

Cell. 1990 Apr 6;61(1):15-6. doi: 10.1016/0092-8674(90)90209-w.

6

Methods for assessing the statistical significance of molecular sequence features by using general scoring schemes.

Proc Natl Acad Sci U S A. 1990 Mar;87(6):2264-8. doi: 10.1073/pnas.87.6.2264.

7

The proto-oncogene c-ros codes for a transmembrane tyrosine protein kinase sharing sequence and structural homology with sevenless protein of Drosophila melanogaster.

Oncogene. 1991 Feb;6(2):257-64.

8

Extracellular domain of the boss transmembrane ligand acts as an antagonist of the sev receptor.

Nature. 1993 Feb 25;361(6414):732-6. doi: 10.1038/361732a0.

9

Molecular drift of the bride of sevenless (boss) gene in Drosophila.

Mol Biol Evol. 1993 Sep;10(5):1030-40. doi: 10.1093/oxfordjournals.molbev.a040052.

10

Cloning and characterization of a Drosophila serotonin receptor that activates adenylate cyclase.

Proc Natl Acad Sci U S A. 1990 Nov;87(22):8940-4. doi: 10.1073/pnas.87.22.8940.

Cited by

1

IgG Antibody Responses to Epstein-Barr Virus in Myalgic Encephalomyelitis/Chronic Fatigue Syndrome: Their Effective Potential for Disease Diagnosis and Pathological Antigenic Mimicry.

Medicina (Kaunas). 2024 Jan 15;60(1):161. doi: 10.3390/medicina60010161.

2

Biopharmaceutics 4.0, Advanced Pre-Clinical Development of mRNA-Encoded Monoclonal Antibodies to Immunosuppressed Murine Models.

Vaccines (Basel). 2021 Aug 11;9(8):890. doi: 10.3390/vaccines9080890.

3

Neutralizing Antibody Responses Induced by HIV-1 Envelope Glycoprotein SOSIP Trimers Derived from Elite Neutralizers.

J Virol. 2020 Nov 23;94(24). doi: 10.1128/JVI.01214-20.

4

A Scoring Algorithm for the Automated Analysis of Glycosaminoglycan MS/MS Data.

J Am Soc Mass Spectrom. 2019 Dec;30(12):2692-2703. doi: 10.1007/s13361-019-02338-9. Epub 2019 Oct 31.

5

Identifying Resistance in Strawberry Through Disease Screening of Multiple Populations and Image Based Phenotyping.

Front Plant Sci. 2019 Jul 18;10:924. doi: 10.3389/fpls.2019.00924. eCollection 2019.

6

Identification of powdery mildew resistance QTL in strawberry (Fragaria × ananassa).

Theor Appl Genet. 2018 Sep;131(9):1995-2007. doi: 10.1007/s00122-018-3128-0. Epub 2018 Jul 3.

7

AOX1-Subfamily Gene Members in Olea europaea cv. "Galega Vulgar"-Gene Characterization and Expression of Transcripts during IBA-Induced in Vitro Adventitious Rooting.

Int J Mol Sci. 2018 Feb 17;19(2):597. doi: 10.3390/ijms19020597.

8

Inferring joint sequence-structural determinants of protein functional specificity.

Elife. 2018 Jan 16;7:e29880. doi: 10.7554/eLife.29880.

9

GHOST: global hepatitis outbreak and surveillance technology.

BMC Genomics. 2017 Dec 6;18(Suppl 10):916. doi: 10.1186/s12864-017-4268-3.

10

Tracing the epidemic history of HIV-1 CRF01_AE clusters using near-complete genome sequences.

Sci Rep. 2017 Jun 22;7(1):4024. doi: 10.1038/s41598-017-03820-8.

References

1

Identification of protein coding regions by database similarity search.

Nat Genet. 1993 Mar;3(3):266-72. doi: 10.1038/ng0393-266.

2

The ovalbumin gene family: structure of the X gene and evolution of duplicated split genes.

Cell. 1980 Jul;20(3):625-37. doi: 10.1016/0092-8674(80)90309-8.

3

Identification of common molecular subsequences.

J Mol Biol. 1981 Mar 25;147(1):195-7. doi: 10.1016/0022-2836(81)90087-5.

4

Random sequences.

J Mol Biol. 1983 Jan 15;163(2):171-6. doi: 10.1016/0022-2836(83)90002-5.

5

Aligning amino acid sequences: comparison of commonly used methods.

J Mol Evol. 1984;21(2):112-25. doi: 10.1007/BF02100085.

6

Tests for comparing related amino-acid sequences. Cytochrome c and cytochrome c 551.

J Mol Biol. 1971 Oct 28;61(2):409-24. doi: 10.1016/0022-2836(71)90390-1.

7

The statistical distribution of nucleic acid similarities.

Nucleic Acids Res. 1985 Jan 25;13(2):645-56. doi: 10.1093/nar/13.2.645.

8

Significance of nucleotide sequence alignments: a method for random sequence permutation that preserves dinucleotide and codon usage.

Mol Biol Evol. 1985 Nov;2(6):526-38. doi: 10.1093/oxfordjournals.molbev.a040370.

9

On the PAM matrix model of protein evolution.

Mol Biol Evol. 1985 Sep;2(5):434-47. doi: 10.1093/oxfordjournals.molbev.a040360.

10

The significance of protein sequence similarities.

Comput Appl Biosci. 1988 Mar;4(1):67-71. doi: 10.1093/bioinformatics/4.1.67.
