Simple is beautiful: a straightforward approach to improve the delineation of true and false positives in PSI-BLAST searches.

Authors

Lee Marianne M, Chan Michael K, Bundschuh Ralf

Affiliation

The Ohio State Biophysics Program, Ohio State University, 484 W 12th Av., Columbus OH 43210-1117, USA.

Publication information

Bioinformatics. 2008 Jun 1;24(11):1339-43. doi: 10.1093/bioinformatics/btn130. Epub 2008 Apr 10.

DOI: 10.1093/bioinformatics/btn130
PMID: 18403442
Abstract

MOTIVATION

The deluge of biological information from different genomic initiatives and the rapid advancement in biotechnologies have made bioinformatics tools an integral part of modern biology. Among the widely used sequence alignment tools, BLAST and PSI-BLAST are arguably the most popular. PSI-BLAST, which uses an iterative profile position specific score matrix (PSSM)-based search strategy, is more sensitive than BLAST in detecting weak homologies, thus making it suitable for remote homolog detection. Many refinements have been made to improve PSI-BLAST, and its computational efficiency and high specificity have been much touted. Nevertheless, corruption of its profile via the incorporation of false positive sequences remains a major challenge.

RESULTS

We have developed a simple and elegant approach to resolve the problem of model corruption in PSI-BLAST searches. We hypothesized that combining results from the first (least-corrupted) profile with results from later (most sensitive) iterations of PSI-BLAST provides a better discriminator for true and false hits. Accordingly, we have derived a formula that utilizes the E-values from these two PSI-BLAST iterations to obtain a figure of merit for rank-ordering the hits. Our verification results based on a 'gold-standard' test set indicate that this figure of merit does indeed delineate true positives from false positives better than PSI-BLAST E-values. Perhaps what is most notable about this strategy is that it is simple and straightforward to implement.
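The abstract does not state the formula itself, so the following is only a minimal sketch of the general idea: each hit is scored from its E-value in the first iteration and its E-value in a later iteration, and hits are then rank-ordered by that score. The weighted sum of log E-values, the `weight` parameter, the `missing_e` fallback for hits absent from the first iteration, and all function names are assumptions made for illustration, not the authors' published figure of merit.

```python
import math

def figure_of_merit(e_first, e_last, weight=0.5):
    """Hypothetical score: weighted sum of log10 E-values (lower is better).

    This is a placeholder combination, NOT the formula from the paper.
    """
    return weight * math.log10(e_first) + (1.0 - weight) * math.log10(e_last)

def rank_hits(first_iter, last_iter, missing_e=10.0, floor=1e-300):
    """Rank hits of a late iteration using E-values from both iterations.

    first_iter, last_iter: dicts mapping hit ID -> E-value.
    missing_e: assumed E-value for hits not reported in the first iteration.
    floor: lower bound that keeps log10 defined when an E-value is 0.
    """
    ranked = []
    for hit, e_last in last_iter.items():
        e_first = first_iter.get(hit, missing_e)
        score = figure_of_merit(max(e_first, floor), max(e_last, floor))
        ranked.append((hit, score))
    ranked.sort(key=lambda pair: pair[1])  # best (most negative) score first
    return ranked

# Made-up E-values for three hypothetical hits.
first = {"seqA": 1e-30, "seqB": 1e-8}
last = {"seqA": 1e-40, "seqB": 1e-20, "seqC": 1e-25}

ranking = rank_hits(first, last)
for hit, score in ranking:
    print(f"{hit}\t{score:.1f}")
```

In this toy input, seqC has a better late-iteration E-value than seqB but was never reported by the first, least-corrupted profile, so the combined score places it below seqB. That is the kind of reordering the abstract describes, though the real method may weight the two iterations quite differently.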

Similar articles

1
Simple is beautiful: a straightforward approach to improve the delineation of true and false positives in PSI-BLAST searches.
Bioinformatics. 2008 Jun 1;24(11):1339-43. doi: 10.1093/bioinformatics/btn130. Epub 2008 Apr 10.
2
SIB-BLAST: a web server for improved delineation of true and false positives in PSI-BLAST searches.
Nucleic Acids Res. 2009 Jul;37(Web Server issue):W53-6. doi: 10.1093/nar/gkp301. Epub 2009 May 8.
3
Efficient recognition of protein fold at low sequence identity by conservative application of Psi-BLAST: validation.
J Mol Recognit. 2005 Mar-Apr;18(2):139-49. doi: 10.1002/jmr.721.
4
Strategies for the effective identification of remotely related sequences in multiple PSSM search approach.
Proteins. 2007 Jun 1;67(4):789-94. doi: 10.1002/prot.21356.
5
Detection of homologous proteins by an intermediate sequence search.
Protein Sci. 2004 Jan;13(1):54-62. doi: 10.1110/ps.03335004.
6
Large-scale comparison of protein sequence alignment algorithms with structure alignments.
Proteins. 2000 Jul 1;40(1):6-22. doi: 10.1002/(sici)1097-0134(20000701)40:1<6::aid-prot30>3.0.co;2-7.
7
PSIBLAST_PairwiseStatSig: reordering PSI-BLAST hits using pairwise statistical significance.
Bioinformatics. 2009 Apr 15;25(8):1082-3. doi: 10.1093/bioinformatics/btp089. Epub 2009 Feb 27.
8
Identification of new claudin family members by a novel PSI-BLAST based approach with enhanced specificity.
Proteins. 2006 Dec 1;65(4):808-15. doi: 10.1002/prot.21218.
9
Benchmarking PSI-BLAST in genome annotation.
J Mol Biol. 1999 Nov 12;293(5):1257-71. doi: 10.1006/jmbi.1999.3233.
10
IMPALA: matching a protein sequence against a collection of PSI-BLAST-constructed position-specific score matrices.
Bioinformatics. 1999 Dec;15(12):1000-11. doi: 10.1093/bioinformatics/15.12.1000.

Cited by

1
Sequence-Based Prediction of Plant Allergenic Proteins: Machine Learning Classification Approach.
ACS Omega. 2023 Jan 20;8(4):3698-3704. doi: 10.1021/acsomega.2c02842. eCollection 2023 Jan 31.
2
A Comparative Analysis of Novel Deep Learning and Ensemble Learning Models to Predict the Allergenicity of Food Proteins.
Foods. 2021 Apr 9;10(4):809. doi: 10.3390/foods10040809.
3
Revisiting amino acid substitution matrices for identifying distantly related proteins.
Bioinformatics. 2014 Feb 1;30(3):317-25. doi: 10.1093/bioinformatics/btt694. Epub 2013 Nov 26.
4
Homologous over-extension: a challenge for iterative similarity searches.
Nucleic Acids Res. 2010 Apr;38(7):2177-89. doi: 10.1093/nar/gkp1219. Epub 2010 Jan 11.
5
Computational biology methods and their application to the comparative genomics of endocellular symbiotic bacteria of insects.
Biol Proced Online. 2009 Mar 11;11:52-78. doi: 10.1007/s12575-009-9004-1.
6
SIB-BLAST: a web server for improved delineation of true and false positives in PSI-BLAST searches.
Nucleic Acids Res. 2009 Jul;37(Web Server issue):W53-6. doi: 10.1093/nar/gkp301. Epub 2009 May 8.