Retrieval accuracy, statistical significance and compositional similarity in protein sequence database searches.

Authors

Yu Yi-Kuo, Gertz E Michael, Agarwala Richa, Schäffer Alejandro A, Altschul Stephen F

Affiliation

National Center for Biotechnology Information, National Library of Medicine, NIH, DHHS, Bethesda, MD 20894, USA.

Publication information

Nucleic Acids Res. 2006;34(20):5966-73. doi: 10.1093/nar/gkl731. Epub 2006 Oct 26.

Abstract

Protein sequence database search programs may be evaluated both for their retrieval accuracy--the ability to separate meaningful from chance similarities--and for the accuracy of their statistical assessments of reported alignments. However, methods for improving statistical accuracy can degrade retrieval accuracy by discarding compositional evidence of sequence relatedness. This evidence may be preserved by combining essentially independent measures of alignment and compositional similarity into a unified measure of sequence similarity. A version of the BLAST protein database search program, modified to employ this new measure, outperforms the baseline program in both retrieval and statistical accuracy on ASTRAL, a SCOP-based test set.
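
The abstract does not say how the alignment and compositional measures are combined. As a minimal sketch, assume each measure is expressed as a p-value and that the two are independent; the p-value of their product is then p1*p2*(1 - ln(p1*p2)), a standard result for the product of two independent uniform variables. The function name and the example p-values below are hypothetical and are not taken from the paper or from BLAST.

```python
import math


def combine_independent_pvalues(p_align: float, p_comp: float) -> float:
    """Combine two independent p-values into one.

    Assumption: under the null hypothesis both inputs are independent and
    uniform on (0, 1]. The probability that their product is <= x is
    x * (1 - ln x), which is returned as the unified p-value.
    """
    for p in (p_align, p_comp):
        if not 0.0 < p <= 1.0:
            raise ValueError("p-values must lie in (0, 1]")
    product = p_align * p_comp
    return product * (1.0 - math.log(product))


# Hypothetical inputs: the same alignment p-value paired with weak and
# with strong compositional evidence.
weak = combine_independent_pvalues(0.01, 0.5)
strong = combine_independent_pvalues(0.01, 0.01)
print(f"weak compositional evidence:   {weak:.4g}")
print(f"strong compositional evidence: {strong:.4g}")
```

Under this assumption, strong compositional evidence makes the combined p-value smaller than the alignment p-value alone, while weak compositional evidence makes it larger, so compositional similarity contributes to the ranking instead of being discarded.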
