Kinjo Akira R
Institute for Protein Research, Osaka University, Suita, Osaka, 565-0871, Japan.
Biophysics (Nagoya-shi). 2009 May 30;5:37-44. doi: 10.2142/biophysics.5.37. eCollection 2009.
A statistical model of protein families, called profile conditional random fields (CRFs), is proposed. This model may be regarded as an integration of the profile hidden Markov model (HMM) and the Finkelstein-Reva (FR) theory of protein folding. While the model structure of the profile CRF is almost identical to that of the profile HMM, the profile CRF can incorporate arbitrary correlations in the sequences to be aligned to the model. In addition, as in the FR theory, the profile CRF can incorporate long-range pairwise interactions between model states via mean-field-like approximations. We give the detailed formulation of the model, self-consistent approximations for treating long-range interactions, and algorithms for computing partition functions and marginal probabilities. We also outline methods for the global optimization of model parameters, as well as a Bayesian framework for parameter learning and selection of optimal alignments.
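The abstract mentions algorithms for computing partition functions and marginal probabilities but does not reproduce them. As a rough illustration only, the sketch below runs the standard forward-backward recursion on a generic linear-chain CRF in log space. It is not the profile CRF of the paper: it has no match, insert, or delete states and no mean-field treatment of long-range interactions. The names `forward_backward`, `node_scores`, and `trans_scores` are hypothetical and chosen for this example.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    if m == float("-inf"):
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def forward_backward(node_scores, trans_scores):
    """Log partition function and per-position state marginals.

    node_scores[t][s]  : log-potential of state s at position t (assumed input)
    trans_scores[r][s] : log-potential of the transition r -> s (assumed input)
    """
    T = len(node_scores)
    S = len(node_scores[0])
    alpha = [[0.0] * S for _ in range(T)]  # forward log-messages
    beta = [[0.0] * S for _ in range(T)]   # backward log-messages

    # Forward pass: alpha[t][s] sums over all prefixes ending in state s.
    alpha[0] = list(node_scores[0])
    for t in range(1, T):
        for s in range(S):
            alpha[t][s] = node_scores[t][s] + logsumexp(
                [alpha[t - 1][r] + trans_scores[r][s] for r in range(S)]
            )

    # Backward pass: beta[t][r] sums over all suffixes following state r.
    for t in range(T - 2, -1, -1):
        for r in range(S):
            beta[t][r] = logsumexp(
                [trans_scores[r][s] + node_scores[t + 1][s] + beta[t + 1][s]
                 for s in range(S)]
            )

    log_z = logsumexp(alpha[T - 1])
    marginals = [
        [math.exp(alpha[t][s] + beta[t][s] - log_z) for s in range(S)]
        for t in range(T)
    ]
    return log_z, marginals
```

Working in log space avoids underflow on long chains. In a profile model the same recursion would presumably run over an alignment lattice of model states and sequence positions rather than over a simple chain.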