Department of Cardiothoracic Surgery, Southwest Hospital, Third Military Medical University, Chongqing 400038, People's Republic of China.
Biopolymers. 2011;96(3):288-301. doi: 10.1002/bip.21531.
Although intensive work has addressed the multivariate extraction of informative components from the numerous physicochemical parameters of isolated amino acids, the varied conformational behavior of amino acids in complex biological contexts has long been underappreciated in the field of quantitative structure-activity relationships (QSAR). In this work, amino acid rotamers derived from a statistical survey of protein crystal structures were used to reproduce the conformational variety of amino acid side chains under real conditions. The rotamers were superposed onto an nx × ny × nz lattice, and an artificial probe was used to detect four kinds of nonbonding field potentials (electrostatic, steric, hydrophobic, and hydrogen bonding) at each lattice point using a Gaussian-type potential function; the resulting large data set was then subjected to principal component analysis (PCA) to obtain a small set of informative amino acid descriptors. We used this set of descriptors, which we named principal property descriptors derived from amino acid rotamers (PDAR), to characterize over 13,000 peptides with known binding affinities to 10 types of SH3 domains. Genetic algorithm/partial least squares regression (GA/PLS) modeling and Monte Carlo cross-validation (MCCV) demonstrated that the correlation between the PDAR descriptors and the peptide binding affinities is comparable with or even better than that of previously published models. Furthermore, from the PDAR-based QSAR models we concluded that the core motif of the peptides, particularly the electrostatic property, hydrophobicity, and hydrogen bonding at residue positions P3, P2, and/or P0, contributes significantly to hAmph SH3 domain-peptide binding, whereas the two ends of the peptides, such as P6, P4, P-4, and P5, play only a secondary role in the binding.
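The descriptor-generation step (lattice, Gaussian-type probe field, then PCA) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the functional form exp(-alpha * r^2), the value alpha = 0.3, the lattice size and spacing, the averaging of fields over rotamers, and the random placeholder coordinates and atomic properties are all assumptions made for illustration, and the names `make_lattice`, `gaussian_field`, and `pca_scores` are hypothetical. Only one field type is shown; the abstract describes four.

```python
import numpy as np

def make_lattice(n=(4, 4, 4), spacing=2.0):
    """Return the (nx*ny*nz, 3) coordinates of a lattice centered on the origin."""
    axes = [(np.arange(k) - (k - 1) / 2.0) * spacing for k in n]
    gx, gy, gz = np.meshgrid(*axes, indexing="ij")
    return np.stack([gx.ravel(), gy.ravel(), gz.ravel()], axis=1)

def gaussian_field(coords, weights, grid, alpha=0.3):
    """Field at each lattice point as a sum of w_i * exp(-alpha * r_ij^2).

    The Gaussian form and alpha are assumptions; the abstract only states
    that a Gaussian-type potential function was used.
    """
    # Squared distance between every lattice point and every atom.
    d2 = ((grid[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
    return (weights[None, :] * np.exp(-alpha * d2)).sum(axis=1)

def pca_scores(X, n_components=3):
    """PCA via SVD of the column-centered matrix; returns scores and variance ratios."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    explained = (S ** 2) / (S ** 2).sum()
    return scores, explained[:n_components]

# Placeholder data: 20 residue types, 3 rotamers each, 5 side-chain atoms per rotamer.
rng = np.random.default_rng(0)
grid = make_lattice()
n_residues, n_rotamers, n_atoms = 20, 3, 5

rows = []
for _ in range(n_residues):
    weights = rng.normal(size=n_atoms)  # hypothetical per-atom property (e.g. partial charge)
    rotamer_fields = [
        gaussian_field(rng.normal(scale=2.0, size=(n_atoms, 3)), weights, grid)
        for _ in range(n_rotamers)
    ]
    # Assumption: fields of the superposed rotamers are averaged per residue.
    rows.append(np.mean(rotamer_fields, axis=0))

X = np.vstack(rows)                                      # residues x lattice points
descriptors, explained = pca_scores(X, n_components=3)   # residues x components
```

Each row of `descriptors` would then play the role of a compact per-residue descriptor vector; a peptide could be encoded by concatenating the vectors of its residues before regression modeling such as PLS.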