Prediction of protein secondary structure using the 3D-1D compatibility algorithm.

Authors

Ito M, Matsuo Y, Nishikawa K

Affiliation

School of Information Science, Japan Advanced Institute of Science and Technology, Ishikawa, Japan.

Publication

Comput Appl Biosci. 1997 Aug;13(4):415-24. doi: 10.1093/bioinformatics/13.4.415.

Abstract

A new method for the prediction of protein secondary structure is proposed, which relies entirely on the global aspect of a protein. The prediction scheme is as follows. A structural library is first scanned with a query sequence using the previously developed 3D-1D compatibility method. All the structures examined are sorted by compatibility score and the top 50 in the list are selected. Then, the known secondary structures of these 50 proteins are globally aligned against the query sequence, according to the 3D-1D alignments. At each residue site, alpha helix, beta strand or coil is predicted by taking the majority among the aligned observations. In addition to the 325 proteins in the structural library, 77 proteins were selected from the latest release of the Brookhaven Protein Data Bank and divided into three data sets. Data set 1 was used as a training set on which several adjustable parameters in the method were optimized. The final form of the method was then applied to a testing set (data set 2), which contained proteins of chain length ≤ 400 residues. The average prediction accuracy was 69% in the three-state assessment of alpha, beta and coil. Data set 3, in contrast, contains only proteins longer than 400 residues, for which the present method would not work properly because of the size effect inherent in the 3D-1D compatibility method. The proteins in data set 3 were therefore subdivided into their constituent domains (data set 4) before being fed into the prediction program. The prediction accuracy for data set 4 was 66% on average, a few percent lower than that for data set 2. Possible causes for this discrepancy are discussed.
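
The per-residue majority vote and the three-state assessment described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' program: the 3D-1D library scan and alignment step is not reproduced, and the input is assumed to be already projected onto the query, one string per template, with any character other than H, E or C marking an unaligned position. The function names, the one-letter state codes, the tie-breaking order and the coil default for uncovered positions are all assumptions made for illustration.

```python
from collections import Counter

# Assumed one-letter codes: H = alpha helix, E = beta strand, C = coil.
# The tuple order is also the (arbitrary, assumed) tie-breaking order.
STATES = ("H", "E", "C")


def consensus_prediction(aligned_structures, default="C"):
    """Predict one state per query residue by majority vote.

    aligned_structures: list of equal-length strings, one per template,
    giving the template's known state at each query position. Any
    character outside STATES (e.g. "-") is treated as unaligned.
    """
    if not aligned_structures:
        return ""
    length = len(aligned_structures[0])
    if any(len(s) != length for s in aligned_structures):
        raise ValueError("all aligned strings must match the query length")

    prediction = []
    for column in zip(*aligned_structures):
        counts = Counter(state for state in column if state in STATES)
        if not counts:
            # No template covers this residue; fall back to the default.
            prediction.append(default)
        else:
            # max() keeps the first maximum, so ties follow STATES order.
            prediction.append(max(STATES, key=lambda s: counts[s]))
    return "".join(prediction)


def q3_accuracy(predicted, observed):
    """Fraction of residues whose predicted state equals the observed one."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have equal length")
    if not observed:
        raise ValueError("cannot score an empty sequence")
    correct = sum(p == o for p, o in zip(predicted, observed))
    return correct / len(observed)


if __name__ == "__main__":
    # Four hypothetical templates aligned to a six-residue query.
    templates = ["HHHEEC", "HHCEEC", "-HHEE-", "CHHHEC"]
    predicted = consensus_prediction(templates)
    print(predicted)  # HHHEEC
    print(q3_accuracy(predicted, "CHHEEC"))  # 5 of 6 residues correct
```

In the method as described, the vote would run over the top 50 structures from the compatibility ranking; the toy input above uses four made-up strings only to keep the example readable.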
