A dynamic Bayesian network approach to protein secondary structure prediction.

Author information

Yao Xin-Qiu, Zhu Huaiqiu, She Zhen-Su

Affiliation

State Key Laboratory for Turbulence and Complex Systems and Department of Biomedical Engineering, Peking University, Beijing 100871, China.

Publication information

BMC Bioinformatics. 2008 Jan 25;9:49. doi: 10.1186/1471-2105-9-49.

Abstract

BACKGROUND

Protein secondary structure prediction methods based on probabilistic models such as hidden Markov models (HMMs) appeal to many because they provide meaningful information about the sequence-structure relationship. However, at present, the prediction accuracy of pure HMM-type methods is much lower than that of machine learning-based methods such as neural networks (NN) or support vector machines (SVM).

RESULTS

In this paper, we report a new probabilistic method for protein secondary structure prediction based on dynamic Bayesian networks (DBN). The new method models the PSI-BLAST profile of a protein sequence using a multivariate Gaussian distribution, and simultaneously takes into account the dependency between the profile and secondary structure and the dependency between profiles of neighboring residues. In addition, a segment length distribution is introduced for each secondary structure state. Tests show that the DBN method significantly improves accuracy compared with other pure HMM-type methods. Further improvement is achieved by combining the DBN with an NN, a method called DBNN, which shows better Q3 accuracy than many popular methods and is competitive with the current state of the art. The most interesting feature of DBN/DBNN is that prediction accuracy improves significantly when it is combined with other methods by a simple consensus.
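To make the Q3 measure and the idea of a simple consensus concrete, here is a minimal sketch. It assumes three-state predictions written as strings over H (helix), E (strand) and C (coil), and it assumes the consensus is a per-residue majority vote with ties going to the earliest-listed method; the abstract does not specify the consensus rule, so that choice is illustrative only. The function names and the toy strings are hypothetical and are not data from the paper.

```python
from collections import Counter
from typing import Sequence


def q3_accuracy(predicted: str, observed: str) -> float:
    """Fraction of residues whose predicted state (H, E or C) matches the observed state."""
    if len(predicted) != len(observed):
        raise ValueError("predicted and observed must have the same length")
    if not observed:
        raise ValueError("sequences must be non-empty")
    matches = sum(p == o for p, o in zip(predicted, observed))
    return matches / len(observed)


def majority_consensus(predictions: Sequence[str]) -> str:
    """Per-residue majority vote over several predictions.

    Assumption: ties are resolved in favour of the earliest-listed method.
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    length = len(predictions[0])
    if any(len(p) != length for p in predictions):
        raise ValueError("all predictions must have the same length")
    consensus = []
    for column in zip(*predictions):
        counts = Counter(column)
        best = max(counts.values())
        # First state in method order that reaches the top vote count.
        consensus.append(next(s for s in column if counts[s] == best))
    return "".join(consensus)


if __name__ == "__main__":
    # Toy strings, invented for illustration only.
    observed = "CHHHHCCEEEC"
    methods = ["CHHHCCCEEEC", "CHHHHCCEECC", "CCHHHCCEEEC"]
    for m in methods:
        print(m, round(q3_accuracy(m, observed), 3))
    combined = majority_consensus(methods)
    print(combined, round(q3_accuracy(combined, observed), 3))
```

In this toy case each method makes one error at a different residue, so the vote recovers the observed string. This is the kind of situation in which combining methods of different nature can help, although real predictors rarely make fully independent errors.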

CONCLUSION

The DBN method, which uses a Gaussian distribution for the PSI-BLAST profile and a high-order dependency between profiles of neighboring residues, produces significantly better prediction accuracy than other HMM-type probabilistic methods. Owing to their different natures, the DBN and NN combine to form a more accurate method, DBNN. Further improvement may be achieved by combining DBNN with an SVM-type method.
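One way to read "a Gaussian distribution for the profile with a high-order dependency between neighboring profiles" is as a conditional linear Gaussian. The abstract does not give the parameterization, so the form below is an assumption for illustration. Here $\mathbf{x}_i$ is the profile vector at residue $i$, $s_i$ is its secondary structure state, $k$ is the order of the dependency, and $\boldsymbol{\mu}_s$, $W_{s,j}$ and $\Sigma_s$ are hypothetical state-specific parameters.

```latex
p(\mathbf{x}_i \mid s_i = s,\ \mathbf{x}_{i-1}, \dots, \mathbf{x}_{i-k})
  = \mathcal{N}\!\Big(\mathbf{x}_i ;\ \boldsymbol{\mu}_s + \sum_{j=1}^{k} W_{s,j}\,\mathbf{x}_{i-j},\ \Sigma_s\Big),
\qquad s \in \{\mathrm{H}, \mathrm{E}, \mathrm{C}\}
```

Under this reading, setting $k = 0$ removes the dependency on neighboring profiles and leaves a plain state-conditional Gaussian emission, as in an HMM with continuous observations.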

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1c4/2266706/105b6d4419f6/1471-2105-9-49-1.jpg
