On the statistical significance of nucleic acid similarities.

Author information

Lipman D J, Wilbur W J, Smith T F, Waterman M S

Publication information

Nucleic Acids Res. 1984 Jan 11;12(1 Pt 1):215-26. doi: 10.1093/nar/12.1part1.215.

Abstract

When evaluating sequence similarities among nucleic acids by the usual methods, statistical significance is often found when the biological significance of the similarity is dubious. We demonstrate that the known statistical properties of nucleic acid sequences strongly affect the statistical distribution of similarity values when calculated by standard procedures. We propose a series of models which account for some of these known statistical properties. The utility of the method is demonstrated in evaluating high relative similarity scores in four specific cases in which there is little biological context by which to judge the similarities. In two of the cases we identify the statistical properties which are responsible for the apparent similarity. In the other two cases the statistical significance of the similarity persists even when the known statistical properties of sequences are modelled. For one of these cases biological significance is likely while the other case remains an enigma.
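The idea that the choice of null model changes the apparent significance of a similarity score can be illustrated with a small Monte Carlo sketch. This is not the authors' procedure: the scoring scheme (match +1, mismatch -1, linear gap -2), the block-shuffling null model, and the function names `local_score`, `shuffle_blocks`, and `z_score` are all assumptions chosen for illustration. With `k=1` the shuffle preserves only base composition; a larger `k` (for example 3) also preserves the content of each non-overlapping block, a crude stand-in for local properties such as codon usage.

```python
import random
import statistics


def local_score(a, b, match=1, mismatch=-1, gap=-2):
    """Best local alignment score (Smith-Waterman recurrence, linear gaps).

    The scoring values are illustrative assumptions, not taken from the paper.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cell = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(cell)
            best = max(best, cell)
        prev = cur
    return best


def shuffle_blocks(seq, k, rng):
    """Permute non-overlapping blocks of length k (hypothetical null model).

    k=1 keeps only base composition; larger k keeps each block intact.
    """
    blocks = [seq[i:i + k] for i in range(0, len(seq), k)]
    rng.shuffle(blocks)
    return "".join(blocks)


def z_score(a, b, k=1, trials=200, seed=0):
    """Observed score in standard deviations above the shuffled-null mean."""
    rng = random.Random(seed)
    observed = local_score(a, b)
    null = [
        local_score(shuffle_blocks(a, k, rng), shuffle_blocks(b, k, rng))
        for _ in range(trials)
    ]
    mu = statistics.mean(null)
    sd = statistics.pstdev(null)
    if sd == 0:
        return 0.0 if observed == mu else float("inf")
    return (observed - mu) / sd
```

Comparing `z_score(a, b, k=1)` with `z_score(a, b, k=3)` for the same pair of sequences gives a rough sense of how much of an apparent similarity survives once the null model retains more of the sequences' local structure.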
