Pairwise statistical significance of local sequence alignment using sequence-specific and position-specific substitution matrices.

Affiliation

Department of Computer Science, Iowa State University, 226 Atanasoff Hall, Ames, IA 50011-1041, USA.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2011 Jan-Mar;8(1):194-205. doi: 10.1109/TCBB.2009.69.

Abstract

Pairwise sequence alignment is a central problem in bioinformatics, which forms the basis of various other applications. Two related sequences are expected to have a high alignment score, but relatedness is usually judged by statistical significance rather than by alignment score. Recently, it was shown that pairwise statistical significance gives promising results as an alternative to database statistical significance for getting individual significance estimates of pairwise alignment scores. The improvement was mainly attributed to making the statistical significance estimation process more sequence-specific and database-independent. In this paper, we use sequence-specific and position-specific substitution matrices to derive the estimates of pairwise statistical significance, which is expected to use more sequence-specific information in estimating pairwise statistical significance. Experiments on a benchmark database with sequence-specific substitution matrices at different levels of sequence-specific contribution were conducted, and results confirm that using sequence-specific substitution matrices for estimating pairwise statistical significance is significantly better than using a standard matrix like BLOSUM62, and than database statistical significance estimates reported by popular database search programs like BLAST, PSI-BLAST (without pretrained PSSMs), and SSEARCH on a benchmark database, but with pretrained PSSMs, PSI-BLAST results are significantly better. Further, using position-specific substitution matrices for estimating pairwise statistical significance gives significantly better results even than PSI-BLAST using pretrained PSSMs.
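The idea of judging a single alignment by a pair-specific significance estimate, rather than by the raw score, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the paper: the function names `local_score` and `pairwise_p_value` are hypothetical, the scoring is a simple match/mismatch scheme with a linear gap penalty (not BLOSUM62, and not the sequence-specific or position-specific matrices the abstract describes), the null distribution comes from shuffling one of the two sequences, and the extreme value (Gumbel) distribution is fitted by the method of moments as a simple stand-in for whatever fitting procedure the authors use.

```python
import math
import random


def local_score(a, b, match=2, mismatch=-1, gap=-2):
    """Smith-Waterman local alignment score with a linear gap penalty.

    The match/mismatch/gap values are illustrative assumptions,
    not the scoring scheme used in the paper.
    """
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        cur = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            cell = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            cur.append(cell)
            best = max(best, cell)
        prev = cur
    return best


def pairwise_p_value(a, b, shuffles=200, seed=0):
    """Estimate P(score >= observed) for this specific pair of sequences.

    Null scores come from aligning `a` against shuffled copies of `b`;
    a Gumbel distribution is fitted to them by the method of moments
    (an assumption chosen for brevity).
    """
    rng = random.Random(seed)
    observed = local_score(a, b)
    letters = list(b)
    null_scores = []
    for _ in range(shuffles):
        rng.shuffle(letters)
        null_scores.append(local_score(a, "".join(letters)))

    mean = sum(null_scores) / shuffles
    var = sum((s - mean) ** 2 for s in null_scores) / (shuffles - 1)
    if var == 0:
        # Degenerate null distribution: no spread to fit.
        return observed, (1.0 if observed <= mean else 0.0)

    beta = math.sqrt(6.0 * var) / math.pi      # Gumbel scale
    mu = mean - 0.5772156649 * beta            # Gumbel location
    z = max((observed - mu) / beta, -700.0)    # clamp to avoid overflow
    # P(S >= x) = 1 - exp(-exp(-z)), written with expm1 for precision.
    p = -math.expm1(-math.exp(-z))
    return observed, p


if __name__ == "__main__":
    seq = "MKTAYIAKQRQISFVKSHFSRQ"
    score, p = pairwise_p_value(seq, seq)
    print(score, p)
```

Because the null distribution is built from the two sequences at hand, the estimate does not depend on a database, which is the property the abstract credits for the improvement. Swapping the fixed match/mismatch scores for a matrix lookup is where a sequence-specific or position-specific matrix would enter such a sketch.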

Similar articles

Pairwise statistical significance of local sequence alignment using sequence-specific and position-specific substitution matrices.

IEEE/ACM Trans Comput Biol Bioinform. 2011 Jan-Mar;8(1):194-205. doi: 10.1109/TCBB.2009.69.

PSIBLAST_PairwiseStatSig: reordering PSI-BLAST hits using pairwise statistical significance.

Bioinformatics. 2009 Apr 15;25(8):1082-3. doi: 10.1093/bioinformatics/btp089. Epub 2009 Feb 27.

Sequence-specific sequence comparison using pairwise statistical significance.

Adv Exp Med Biol. 2011;696:297-306. doi: 10.1007/978-1-4419-7046-6_30.

Pairwise statistical significance and empirical determination of effective gap opening penalties for protein local sequence alignment.

Int J Comput Biol Drug Des. 2008;1(4):347-67. doi: 10.1504/ijcbdd.2008.022207.

Pairwise statistical significance of local sequence alignment using multiple parameter sets and empirical justification of parameter set change penalty.

BMC Bioinformatics. 2009 Mar 19;10 Suppl 3(Suppl 3):S1. doi: 10.1186/1471-2105-10-S3-S1.

IMPALA: matching a protein sequence against a collection of PSI-BLAST-constructed position-specific score matrices.

Bioinformatics. 1999 Dec;15(12):1000-11. doi: 10.1093/bioinformatics/15.12.1000.

A comparison of position-specific score matrices based on sequence and structure alignments.

Protein Sci. 2002 Feb;11(2):361-70. doi: 10.1110/ps.19902.

Provably sensitive indexing strategies for biosequence similarity search.

J Comput Biol. 2003;10(3-4):399-417. doi: 10.1089/10665270360688093.

CTX-BLAST: context sensitive version of protein BLAST.

Bioinformatics. 2007 Jul 1;23(13):1686-8. doi: 10.1093/bioinformatics/btm136. Epub 2007 Apr 25.

Accelerating pairwise statistical significance estimation for local alignment by harvesting GPU's power.

BMC Bioinformatics. 2012 Apr 12;13 Suppl 5(Suppl 5):S3. doi: 10.1186/1471-2105-13-S5-S3.

Cited by

Selection of Optimal Bioinformatic Tools and Proper Reference for Reducing the Alignment Error in Targeted Sequencing Data.

J Med Signals Sens. 2021 Jan 30;11(1):37-44. doi: 10.4103/jmss.JMSS_7_20. eCollection 2021 Jan-Mar.

PR2ALIGN: a stand-alone software program and a web-server for protein sequence alignment using weighted biochemical properties of amino acids.

BMC Res Notes. 2015 May 7;8:187. doi: 10.1186/s13104-015-1152-6.

An AT-hook domain in MeCP2 determines the clinical course of Rett syndrome and related disorders.

Cell. 2013 Feb 28;152(5):984-96. doi: 10.1016/j.cell.2013.01.038.

New finite-size correction for local alignment score distributions.

BMC Res Notes. 2012 Jun 12;5:286. doi: 10.1186/1756-0500-5-286.

Accelerating pairwise statistical significance estimation for local alignment by harvesting GPU's power.

BMC Bioinformatics. 2012 Apr 12;13 Suppl 5(Suppl 5):S3. doi: 10.1186/1471-2105-13-S5-S3.

Protein sequence alignment with family-specific amino acid similarity matrices.

BMC Res Notes. 2011 Aug 16;4:296. doi: 10.1186/1756-0500-4-296.

Prediction of antimicrobial peptides based on sequence alignment and feature selection methods.

PLoS One. 2011 Apr 13;6(4):e18476. doi: 10.1371/journal.pone.0018476.
