Prediction of protein structural class for the twilight zone sequences.

Author information

Kurgan Lukasz, Chen Ke

Affiliation

Department of Electrical and Computer Engineering, University of Alberta, Edmonton, Alberta, Canada.

Publication information

Biochem Biophys Res Commun. 2007 Jun 1;357(2):453-60. doi: 10.1016/j.bbrc.2007.03.164. Epub 2007 Apr 5.

Abstract

Structural class characterizes the overall folding type of a protein or its domain. This paper develops an accurate method for in silico prediction of structural classes from low-homology (twilight zone) protein sequences. The proposed LLSC-PRED method applies a linear logistic regression classifier and a custom-designed, feature-based sequence representation to provide predictions. The main advantages of LLSC-PRED are its comprehensive representation, which includes 58 features describing the composition and physicochemical properties of the sequences, and the transparency of its prediction model. The representation also includes predicted secondary structure content, thus exploring for the first time the synergy between these two related predictions. In tests on a large set of 1673 twilight zone domains, LLSC-PRED achieves a prediction accuracy of over 62%, which is better than the accuracy of over a dozen recently published competing in silico methods and similar to the accuracy of other, non-transparent classifiers that use the proposed representation.
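To make the general idea concrete, the following is a minimal sketch of a linear logistic (softmax) classifier applied to a composition-based sequence representation. It is not the LLSC-PRED implementation: it uses only the 20 amino-acid frequencies rather than the 58 features described in the abstract, it omits the physicochemical and predicted secondary structure features, and the function names, the toy sequences, the two placeholder class labels, and the training settings are all assumptions made for illustration.

```python
import math
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def composition(seq):
    """Return the fraction of each standard amino acid in seq (20 values)."""
    counts = Counter(c for c in seq.upper() if c in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence has no standard amino acids")
    return [counts[a] / total for a in AMINO_ACIDS]


def class_probabilities(features, weights, biases):
    """Linear logistic (softmax) model: one weight vector and bias per class."""
    scores = [b + sum(w * x for w, x in zip(ws, features))
              for ws, b in zip(weights, biases)]
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [e / norm for e in exps]


def train(samples, labels, n_classes, epochs=200, lr=0.5):
    """Fit the model with plain stochastic gradient descent on cross-entropy."""
    weights = [[0.0] * len(samples[0]) for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            probs = class_probabilities(x, weights, biases)
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == y else 0.0)
                biases[k] -= lr * err
                for j, xj in enumerate(x):
                    weights[k][j] -= lr * err * xj
    return weights, biases


def predict(seq, weights, biases):
    """Return the index of the most probable class for a sequence."""
    probs = class_probabilities(composition(seq), weights, biases)
    return max(range(len(probs)), key=probs.__getitem__)


# Invented toy sequences, NOT data from the paper.
TOY_SEQUENCES = ["AELLKAEELLKA", "EELAKLAEELLK", "VTVIVTYVTVIV", "TVIYVTVTIVVT"]
TOY_LABELS = [0, 0, 1, 1]  # 0 and 1 are placeholder class indices

WEIGHTS, BIASES = train([composition(s) for s in TOY_SEQUENCES], TOY_LABELS, 2)

if __name__ == "__main__":
    for s in TOY_SEQUENCES:
        print(s, predict(s, WEIGHTS, BIASES))
```

Because the model is linear, each entry of `WEIGHTS` can be read directly as the contribution of one feature to one class score, which illustrates the kind of model transparency the abstract mentions.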
