Scan2S: increasing the precision of PROSITE pattern motifs using secondary structure constraints.

Author information

Skrabanek Lucy, Niv Masha Y

Affiliation

Department of Physiology and Biophysics, Weill Medical College of Cornell University, New York, New York 10021, USA.

Publication information

Proteins. 2008 Sep;72(4):1138-47. doi: 10.1002/prot.22008.

Abstract

Sequence signature databases such as PROSITE, which include protein pattern motifs indicative of a protein's function, are widely used for function prediction studies, cellular localization annotation, and sequence classification. Correct annotation relies on high precision of the motifs. We present a new and general approach for increasing the precision of established protein pattern motifs by including secondary structure constraints (SSCs). We use Scan2S, the first sequence motif-scanning program to optionally include SSCs, to augment PROSITE pattern motifs. The constraints were derived from either the DSSP secondary structure assignment or the PSIPRED predictions for PROSITE-documented true positive hits. The secondary structure-augmented motifs were scanned against all SwissProt sequences, for which secondary structure predictions were precalculated. Against this dataset, motifs with PSIPRED-derived SSCs exhibited improved performance over motifs with DSSP-derived constraints. The precision of 763 of the 782 PSIPRED-augmented motifs remained unchanged or increased compared to the original motifs; 26 motifs showed an absolute precision increase of 10-30%. We provide the complete set of augmented motifs and the Scan2S program at http://physiology.med.cornell.edu/go/scan2s. Our results suggest a general protocol for increasing the precision of protein pattern detection via the inclusion of SSCs.
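The idea of augmenting a pattern motif with a secondary structure constraint can be illustrated with a minimal sketch. This is not the Scan2S implementation, and it does not use Scan2S's constraint syntax, which the abstract does not describe. It assumes a simplified subset of the PROSITE pattern syntax, a per-residue secondary structure string (H, E, C) of the same length as the sequence, and a hypothetical constraint format: a regular expression that the structure segment under each hit must match in full. The function names `prosite_to_regex`, `scan`, and `precision` are illustrative only.

```python
import re


def prosite_to_regex(pattern):
    """Convert a simplified PROSITE pattern, e.g. 'N-{P}-[ST]-{P}', to a regex.

    Supported: single residues, x (any residue), [..] allowed sets,
    {..} forbidden sets, (n) and (n,m) repeats, and < / > terminal anchors.
    """
    regex = []
    for element in pattern.rstrip(".").split("-"):
        prefix = "^" if element.startswith("<") else ""
        suffix = "$" if element.endswith(">") else ""
        element = element.strip("<>")
        match = re.fullmatch(r"([^()]+)(?:\((\d+)(?:,(\d+))?\))?", element)
        if match is None:
            raise ValueError("unsupported pattern element: " + element)
        core, low, high = match.groups()
        if core == "x":
            core = "."
        elif core.startswith("{") and core.endswith("}"):
            core = "[^" + core[1:-1] + "]"
        # [..] sets and single letters are already valid regex fragments.
        repeat = ""
        if low is not None:
            repeat = "{" + low + ("," + high if high else "") + "}"
        regex.append(prefix + core + repeat + suffix)
    return "".join(regex)


def scan(sequence, structure, pattern, ssc=None):
    """Return (start, matched_text) hits of a pattern in a sequence.

    If ssc is given (hypothetical format: a regex over H/E/C), a hit is kept
    only when the structure segment aligned with it fully matches ssc.
    """
    if len(sequence) != len(structure):
        raise ValueError("sequence and structure must have equal length")
    # A lookahead lets overlapping hits be reported.
    finder = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    hits = []
    for m in finder.finditer(sequence):
        start, end = m.start(1), m.end(1)
        if ssc is None or re.fullmatch(ssc, structure[start:end]):
            hits.append((start, m.group(1)))
    return hits


def precision(true_positives, false_positives):
    """Precision = TP / (TP + FP); defined as 0.0 when there are no hits."""
    total = true_positives + false_positives
    return true_positives / total if total else 0.0


# Toy data (made up for illustration, not from the paper).
sequence = "AANGSAKKNHTLL"
structure = "HHHHHHHCCCCCC"
print(scan(sequence, structure, "N-{P}-[ST]-{P}"))             # [(2, 'NGSA'), (8, 'NHTL')]
print(scan(sequence, structure, "N-{P}-[ST]-{P}", ssc="C+"))   # [(8, 'NHTL')]
```

In this toy case the constraint discards the hit that falls in a helix. If that hit were a false positive, removing it would raise precision, which is the effect the abstract reports for most PSIPRED-augmented motifs.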
