Devane Russell, Shinoda Wataru, Moore Preston B, Klein Michael L
Center for Molecular Modeling and Department of Chemistry, University of Pennsylvania, 231 South 34th Street, Philadelphia, PA 19104-6323.
J Chem Theory Comput. 2009 Aug 11;5(8):2115-2124. doi: 10.1021/ct800441u.
The large quantity of protein sequences being generated from genomic data has greatly outpaced the throughput of experimental protein structure determination methods, bringing urgency to the need for accurate protein structure prediction tools. Reduced-resolution, or coarse-grained (CG), models have become a mainstay in computational protein structure prediction, performing among the best tools available. The quest for high-quality generalized CG models is an extremely challenging yet popular endeavor. To this end, a CG-based interaction potential is presented here for the naturally occurring amino acids. In the present approach, three to four heavy atoms and their associated hydrogens are condensed into a single CG site. The parameterization of the site-site interaction potential relies on experimental data, thus providing a novel approach that is based neither on all-atom (AA) simulations nor on experimental protein structural data. Specifically, intermolecular potentials based on Lennard-Jones (LJ) style functional forms are parameterized using thermodynamic data including surface tension and density. Using this approach, an amino acid potential dataset has been developed for use in modeling peptides and proteins. The potential is evaluated here by comparing the solvent accessible surface area (SASA) to that of AA representations and by ranking the protein decoy data sets provided by Decoys 'R' Us. The model is shown to perform very well compared to other existing prediction models for these properties.
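As an illustration of what an "LJ style" site-site potential can look like, the following is a minimal sketch of the generalized (Mie) n-m form, normalized so that the well depth equals epsilon. The abstract does not state the exponents or parameter values, so the exponent pairs, the function names, and the numbers used here are assumptions chosen for illustration, not the published parameterization.

```python
import math


def mie_prefactor(n, m):
    """Normalization constant so that the minimum of the potential is -epsilon.

    For n=12, m=6 this gives the familiar factor 4 of the standard LJ form.
    """
    return (n / (n - m)) * (n / m) ** (m / (n - m))


def mie_potential(r, epsilon, sigma, n=12, m=6):
    """Generalized LJ (Mie) pair energy between two CG sites at distance r.

    epsilon: well depth (energy units), sigma: distance where U = 0.
    The exponents n > m are illustrative assumptions.
    """
    x = sigma / r
    return mie_prefactor(n, m) * epsilon * (x ** n - x ** m)


def mie_r_min(sigma, n=12, m=6):
    """Distance at which the potential reaches its minimum of -epsilon."""
    return sigma * (n / m) ** (1.0 / (n - m))


if __name__ == "__main__":
    # Hypothetical parameters, for illustration only.
    epsilon, sigma = 0.5, 4.0
    for n, m in [(12, 6), (9, 6), (12, 4)]:
        r_min = mie_r_min(sigma, n, m)
        u_min = mie_potential(r_min, epsilon, sigma, n, m)
        print(f"{n}-{m}: prefactor={mie_prefactor(n, m):.4f}, "
              f"r_min={r_min:.4f}, U(r_min)={u_min:.4f}")
```

In a parameterization of the kind described above, epsilon and sigma for each pair of site types would be adjusted until simulated bulk properties such as density and surface tension match experiment; that fitting loop is not shown here.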