The statistical distribution of nucleic acid similarities.

Authors

Smith T F, Waterman M S, Burks C

Publication information

Nucleic Acids Res. 1985 Jan 25;13(2):645-56. doi: 10.1093/nar/13.2.645.

DOI: 10.1093/nar/13.2.645
PMID: 3871073
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC341021/

Abstract

All pairs of a large set of known vertebrate DNA sequences were searched by computer for most similar segments. Analysis of this data shows that the computed similarity scores are distributed proportionally to the logarithm of the product of the lengths of the sequences involved. This distribution is closely related to recent results of Erdos and others on the longest run of heads in coin tossing. A simple rule is derived for determination of statistical significance of the similarity scores and to assist in relating statistical and biological significance.
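
The connection to the longest run of heads can be illustrated with a small simulation. The sketch below is not the authors' procedure: it only checks, by Monte Carlo, that the mean longest head run in n fair coin tosses grows roughly like log2(n), which is the kind of logarithmic growth the abstract describes for similarity scores as a function of the product of sequence lengths. The function names `longest_head_run` and `mean_longest_run`, the trial count, and the seeds are illustrative assumptions.

```python
import math
import random


def longest_head_run(n, p=0.5, rng=random):
    """Length of the longest run of heads in n independent tosses, P(head) = p."""
    best = current = 0
    for _ in range(n):
        if rng.random() < p:
            current += 1
            if current > best:
                best = current
        else:
            current = 0
    return best


def mean_longest_run(n, trials=2000, p=0.5, seed=0):
    """Monte Carlo estimate of the expected longest head run (hypothetical helper)."""
    rng = random.Random(seed)  # fixed seed so repeated calls are reproducible
    total = sum(longest_head_run(n, p, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # Print n, the simulated mean longest run, and log2(n) for comparison.
    for n in (256, 1024, 4096):
        print(n, round(mean_longest_run(n), 2), round(math.log2(n), 2))
```

If the logarithmic picture holds, each fourfold increase in n should raise the simulated mean by about 2, tracking log2(n) up to a roughly constant offset. For two sequences, the analogous quantity would be the longest matching segment, with n replaced by the product of the two lengths.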

Similar articles

1. The statistical distribution of nucleic acid similarities.
   Nucleic Acids Res. 1985 Jan 25;13(2):645-56. doi: 10.1093/nar/13.2.645.
2. On the statistical significance of nucleic acid similarities.
   Nucleic Acids Res. 1984 Jan 11;12(1 Pt 1):215-26. doi: 10.1093/nar/12.1part1.215.
3. The probabilities of similarities in DNA sequence comparisons.
   Genomics. 1988 Oct;3(3):207-16. doi: 10.1016/0888-7543(88)90081-x.
4. Statistical significance of symmetrical and repetitive segments in DNA.
   Nucleic Acids Res. 1982 Dec 20;10(24):8323-39. doi: 10.1093/nar/10.24.8323.
5. On the statistical assessment of similarities in DNA sequences.
   Nucleic Acids Res. 1984 Jul 11;12(13):5529-43. doi: 10.1093/nar/12.13.5529.
6. An accurate approximation to the distribution of the length of the longest matching word between two random DNA sequences.
   Bull Math Biol. 1990;52(6):773-84. doi: 10.1007/BF02460808.
7. DNA sequence comparisons of the human, mouse, and rabbit immunoglobulin kappa gene.
   Mol Biol Evol. 1985 Jan;2(1):35-52. doi: 10.1093/oxfordjournals.molbev.a040336.
8. Statistical analysis of nucleotide runs in coding and noncoding DNA sequences.
   J Biomol Struct Dyn. 1988 Oct;6(2):345-58. doi: 10.1080/07391102.1988.10507717.
9. Use of statistical criteria for screening potential homologies in nucleic acid sequences.
   Nucleic Acids Res. 1984 Jan 11;12(1 Pt 1):203-13. doi: 10.1093/nar/12.1part1.203.
10. Comparative statistics for DNA and protein sequences: single sequence analysis.
    Proc Natl Acad Sci U S A. 1985 Sep;82(17):5800-4. doi: 10.1073/pnas.82.17.5800.

Cited by

1. Label Transfer for Drug Disease Association in Three Meta-Paths.
   Evol Bioinform Online. 2024 Sep 13;20:11769343241272414. doi: 10.1177/11769343241272414. eCollection 2024.
2. Competitive mapping allows for the identification and exclusion of human DNA contamination in ancient faunal genomic datasets.
   BMC Genomics. 2020 Nov 30;21(1):844. doi: 10.1186/s12864-020-07229-y.
3. LncRNA-miRNA interaction prediction through sequence-derived linear neighborhood propagation method with information combination.
   BMC Genomics. 2019 Dec 20;20(Suppl 11):946. doi: 10.1186/s12864-019-6284-y.
4. Drug repositioning of herbal compounds via a machine-learning approach.
   BMC Bioinformatics. 2019 May 29;20(Suppl 10):247. doi: 10.1186/s12859-019-2811-8.
5. Drug repurposing in oncology: Compounds, pathways, phenotypes and computational approaches for colorectal cancer.
   Biochim Biophys Acta Rev Cancer. 2019 Apr;1871(2):434-454. doi: 10.1016/j.bbcan.2019.04.005. Epub 2019 Apr 26.
6. Prediction of drug-disease associations based on ensemble meta paths and singular value decomposition.
   BMC Bioinformatics. 2019 Mar 29;20(Suppl 3):134. doi: 10.1186/s12859-019-2644-5.
7. SFPEL-LPI: Sequence-based feature projection ensemble learning for predicting LncRNA-protein interactions.
   PLoS Comput Biol. 2018 Dec 11;14(12):e1006616. doi: 10.1371/journal.pcbi.1006616. eCollection 2018 Dec.
8. Quantifying and reducing spurious alignments for the analysis of ultra-short ancient DNA sequences.
   BMC Biol. 2018 Oct 25;16(1):121. doi: 10.1186/s12915-018-0581-9.
9. Predicting drug-disease associations by using similarity constrained matrix factorization.
   BMC Bioinformatics. 2018 Jun 19;19(1):233. doi: 10.1186/s12859-018-2220-4.
10. Towards drug repositioning: a unified computational framework for integrating multiple aspects of drug similarity and disease similarity.
    AMIA Annu Symp Proc. 2014 Nov 14;2014:1258-67. eCollection 2014.

References

1. Optimal sequence alignments.
   Proc Natl Acad Sci U S A. 1983 Mar;80(5):1382-6. doi: 10.1073/pnas.80.5.1382.
2. The nucleotide sequence of the ubiquitous repetitive DNA sequence B1 complementary to the most abundant class of mouse fold-back RNA.
   Nucleic Acids Res. 1980 Mar 25;8(6):1201-15. doi: 10.1093/nar/8.6.1201.
3. The nucleotide sequence of the major beta-globin mRNA from Xenopus laevis.
   Nucleic Acids Res. 1980 Sep 25;8(18):4247-58. doi: 10.1093/nar/8.18.4247.
4. The ovalbumin gene family: structure of the X gene and evolution of duplicated split genes.
   Cell. 1980 Jul;20(3):625-37. doi: 10.1016/0092-8674(80)90309-8.
5. The structure of a human alpha-globin pseudogene and its relationship to alpha-globin gene duplication.
   Cell. 1980 Sep;21(2):537-44. doi: 10.1016/0092-8674(80)90491-2.
6. Structural analysis of interspersed repetitive polymerase III transcription units in human DNA.
   Nucleic Acids Res. 1981 Mar 11;9(5):1151-70.
7. Molecular cloning and characterization of cDNA sequences coding for rat relaxin.
   Nature. 1981 May 14;291(5811):127-31. doi: 10.1038/291127a0.
8. Primary structure of the human Met- and Leu-enkephalin precursor and its mRNA.
   Nature. 1982 Feb 25;295(5851):663-6. doi: 10.1038/295663a0.
9. Nucleotide sequence of Xenopus laevis 18S ribosomal RNA inferred from gene sequence.
   Nature. 1981 May 21;291(5812):205-8. doi: 10.1038/291205a0.
10. Isolation and sequence of the gene for actin in Saccharomyces cerevisiae.
    Proc Natl Acad Sci U S A. 1980 Jul;77(7):3912-6. doi: 10.1073/pnas.77.7.3912.