Accurate and automated classification of protein secondary structure with PsiCSI.

Authors

Hung Ling-Hong, Samudrala Ram

Affiliation

Computational Genomics, Department of Microbiology, University of Washington, Seattle 98109, USA.

Publication

Protein Sci. 2003 Feb;12(2):288-95. doi: 10.1110/ps.0222303.

Abstract

PsiCSI is a highly accurate, automated method for assigning secondary structure from NMR data, a useful intermediate step in the determination of tertiary structures. The method combines information from chemical shifts and protein sequence using three layers of neural networks. Training and testing were performed on a suite of 92 proteins (9437 residues) with known secondary and tertiary structure. Using a stringent cross-validation procedure in which the target and homologous proteins were removed from the databases used for training the neural networks, an average Q3 accuracy of 89% (per residue) was observed. This is an increase of 6.2 and 5.5 percentage points (representing 36% and 33% fewer errors) over methods that use chemical shifts (CSI) or sequence information (Psipred) alone, respectively. In addition, PsiCSI improves upon the translation of chemical shift information to secondary structure (Q3 = 87.4%) and is able to use sequence information as an effective substitute for sparse NMR data (Q3 = 86.9% without ¹³C shifts and Q3 = 86.8% with only Hα shifts available). Finally, errors made by PsiCSI almost exclusively involve the interchange of helix or strand with coil, and not of helix with strand (<2.5 occurrences per 10,000 residues). The automation, increased accuracy, absence of gross errors, and robustness with regard to sparse data make PsiCSI ideal for high-throughput applications and should improve the effectiveness of hybrid NMR/de novo structure determination methods. A Web server is available for users to submit data and have the assignment returned.
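The accuracy figures in the abstract can be related to one another with a short calculation. The following is a minimal sketch, not the authors' code: it assumes the usual per-residue definition of Q3 over three states (H for helix, E for strand, C for coil), and the function names are hypothetical. The baseline accuracies of 82.8% (CSI) and 83.5% (Psipred) are not stated in the abstract; they are inferred here by subtracting the reported gains from 89%.

```python
def q3_accuracy(predicted: str, observed: str) -> float:
    """Percentage of residues whose three-state label (H, E, C) matches."""
    if len(predicted) != len(observed):
        raise ValueError("sequences must have the same length")
    if not observed:
        raise ValueError("sequences must not be empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return 100.0 * matches / len(observed)


def relative_error_reduction(q3_new: float, q3_old: float) -> float:
    """Percentage of the old method's errors that the new method removes."""
    return 100.0 * (q3_new - q3_old) / (100.0 - q3_old)


def gross_errors(predicted: str, observed: str) -> int:
    """Count helix/strand interchanges (H predicted as E, or E as H)."""
    return sum({p, o} == {"H", "E"} for p, o in zip(predicted, observed))


# Toy example (made-up labels, not data from the paper).
observed = "CCHHHHHCCEEEEECC"
predicted = "CCHHHHCCCEEEECCC"
print(q3_accuracy(predicted, observed))   # 14 of 16 residues match
print(gross_errors(predicted, observed))  # both errors involve coil

# Inferred baselines: 89 - 6.2 = 82.8 (CSI), 89 - 5.5 = 83.5 (Psipred).
print(round(relative_error_reduction(89.0, 82.8)))  # about 36% fewer errors
print(round(relative_error_reduction(89.0, 83.5)))  # about 33% fewer errors
```

Under these assumptions, a gain of 6.2 percentage points over a method with 17.2% errors removes roughly 36% of those errors, which agrees with the figure quoted in the abstract.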

Similar articles

Protein secondary structure prediction with SPARROW.
J Chem Inf Model. 2012 Feb 27;52(2):545-56. doi: 10.1021/ci200321u. Epub 2012 Jan 23.

Improving protein secondary structure prediction using a multi-modal BP method.
Comput Biol Med. 2011 Oct;41(10):946-59. doi: 10.1016/j.compbiomed.2011.08.005. Epub 2011 Aug 30.

Cited by

CSI 2.0: a significantly improved version of the Chemical Shift Index.
J Biomol NMR. 2014 Nov;60(2-3):131-46. doi: 10.1007/s10858-014-9863-x. Epub 2014 Oct 2.

Type I and II β-turns prediction using NMR chemical shifts.
J Biomol NMR. 2014 Jul;59(3):175-84. doi: 10.1007/s10858-014-9837-z. Epub 2014 May 17.
