Distant homology detection using a LEngth and STructure-based sequence Alignment Tool (LESTAT).

Authors

Lee Marianne M, Bundschuh Ralf, Chan Michael K

Affiliation

The Ohio State Biophysics Program, The Ohio State University, Columbus, Ohio 43210, USA.

Publication information

Proteins. 2008 May 15;71(3):1409-19. doi: 10.1002/prot.21830.

DOI:10.1002/prot.21830
PMID:18076050
Abstract

A new machine learning algorithm, LESTAT (LEngth and STructure-based sequence Alignment Tool) has been developed for detecting protein homologs having low-sequence identity. LESTAT is an iterative profile-based method that runs without reliance on a predefined library and incorporates several novel features that enhance its ability to identify remote sequences. To overcome the inherent bias associated with a single starting model, LESTAT utilizes three structural homologs to create a profile consisting of structurally conserved positions and block separation distances. Subsequent profiles are refined iteratively using sequence information obtained from previous cycles. Additionally, the refinement process incorporates a "lock-in" feature to retain the high-scoring sequences involved in previous alignments for subsequent model building and an enhancement factor to complement the weighting scheme used to build the position specific scoring matrix. A comparison of the performance of LESTAT against PSI-BLAST for seven systems reveals that LESTAT exhibits increased sensitivity and specificity over PSI-BLAST in six of these systems, based on the number of true homologs detected and the number of families these homologs covered. Notably, many of the hits identified are unique to each method, presumably resulting from the distinct differences in the two approaches. Taken together, these findings suggest that LESTAT is a useful complementary method to PSI-BLAST in the detection of distant homologs.
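The iterative refinement with "lock-in" described in the abstract can be illustrated with a minimal sketch. This is not the LESTAT implementation: the function names (`build_pssm`, `best_window`, `iterative_search`), the uniform background frequencies, the pseudocount, the score threshold, and the single ungapped block are all assumptions made for illustration. The three-homolog starting model, block separation distances, and the enhancement factor are not modeled here.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def build_pssm(aligned_block, weights=None, pseudocount=1.0):
    """Build a log-odds PSSM from an ungapped block of equal-length segments.

    Assumes a uniform background distribution over the 20 standard residues.
    """
    if weights is None:
        weights = [1.0] * len(aligned_block)
    background = 1.0 / len(AMINO_ACIDS)
    pssm = []
    for pos in range(len(aligned_block[0])):
        counts = {aa: pseudocount for aa in AMINO_ACIDS}
        for segment, weight in zip(aligned_block, weights):
            if segment[pos] in counts:  # ignore non-standard letters
                counts[segment[pos]] += weight
        total = sum(counts.values())
        pssm.append(
            {aa: math.log2((counts[aa] / total) / background) for aa in AMINO_ACIDS}
        )
    return pssm


def best_window(pssm, sequence):
    """Return (score, segment) for the highest-scoring ungapped window."""
    width = len(pssm)
    best = (float("-inf"), None)
    for start in range(len(sequence) - width + 1):
        segment = sequence[start:start + width]
        # Non-standard letters contribute a neutral score of 0.
        score = sum(col.get(aa, 0.0) for col, aa in zip(pssm, segment))
        if score > best[0]:
            best = (score, segment)
    return best


def iterative_search(seed_block, database, threshold, cycles=3):
    """Refine a profile over several cycles, keeping earlier hits locked in."""
    locked = {}  # sequence id -> matched segment, never dropped once added
    for _ in range(cycles):
        # Each cycle rebuilds the profile from the seed plus all locked-in hits.
        pssm = build_pssm(list(seed_block) + list(locked.values()))
        for seq_id, sequence in database.items():
            score, segment = best_window(pssm, sequence)
            if score >= threshold:
                locked[seq_id] = segment
    return locked


if __name__ == "__main__":
    seed = ["ACDE", "ACDE", "ACDF"]  # toy seed block, not real data
    db = {"hit": "GGACDEGG", "miss": "WWWWWWWW"}
    print(iterative_search(seed, db, threshold=4.0))
```

In this sketch a sequence that passes the threshold in any cycle stays in the model-building set for all later cycles, which is one simple reading of the "lock-in" idea. The paper's actual retention criterion and weighting scheme may differ.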

Similar articles

1
Distant homology detection using a LEngth and STructure-based sequence Alignment Tool (LESTAT).
Proteins. 2008 May 15;71(3):1409-19. doi: 10.1002/prot.21830.
2
SVM-HUSTLE--an iterative semi-supervised machine learning approach for pairwise protein remote homology detection.
Bioinformatics. 2008 Mar 15;24(6):783-90. doi: 10.1093/bioinformatics/btn028. Epub 2008 Feb 1.
3
Fast model-based protein homology detection without alignment.
Bioinformatics. 2007 Jul 15;23(14):1728-36. doi: 10.1093/bioinformatics/btm247. Epub 2007 May 8.
4
Within the twilight zone: a sensitive profile-profile comparison tool based on information theory.
J Mol Biol. 2002 Feb 1;315(5):1257-75. doi: 10.1006/jmbi.2001.5293.
5
Incremental window-based protein sequence alignment algorithms.
Bioinformatics. 2007 Jan 15;23(2):e17-23. doi: 10.1093/bioinformatics/btl297.
6
Efficient recognition of protein fold at low sequence identity by conservative application of Psi-BLAST: validation.
J Mol Recognit. 2005 Mar-Apr;18(2):139-49. doi: 10.1002/jmr.721.
7
A comparison of scoring functions for protein sequence profile alignment.
Bioinformatics. 2004 May 22;20(8):1301-8. doi: 10.1093/bioinformatics/bth090. Epub 2004 Feb 12.
8
SPEM: improving multiple sequence alignment with sequence profiles and predicted secondary structures.
Bioinformatics. 2005 Sep 15;21(18):3615-21. doi: 10.1093/bioinformatics/bti582. Epub 2005 Jul 14.
9
STRUCTFAST: protein sequence remote homology detection and alignment using novel dynamic programming and profile-profile scoring.
Proteins. 2006 Sep 1;64(4):960-7. doi: 10.1002/prot.21049.
10
Sequence comparison and protein structure prediction.
Curr Opin Struct Biol. 2006 Jun;16(3):374-84. doi: 10.1016/j.sbi.2006.05.006. Epub 2006 May 19.

Cited by

1
Using amino acid physicochemical distance transformation for fast protein remote homology detection.
PLoS One. 2012;7(9):e46633. doi: 10.1371/journal.pone.0046633. Epub 2012 Sep 28.