School of Chemistry, Manchester Institute of Biotechnology, University of Manchester, 131 Princess Street, Manchester, M1 7DN, U.K.
J Chem Inf Model. 2018 Feb 26;58(2):234-243. doi: 10.1021/acs.jcim.7b00488. Epub 2018 Feb 5.
Modeling the activity of a protein with quantitative structure-activity relationships (QSAR) requires descriptors for the 20 naturally coded amino acids. In this work we show that modifying some established descriptors allowed us to model the activity data of 140 mutants of the enzyme epoxide hydrolase with improved accuracy. These new descriptors (referred to as physical descriptors) also gave very good results when tested against a series of four dipeptide data sets. The physical descriptors encode the amino acids using only two orthogonal scales: the first is strongly linked to hydrophilicity/hydrophobicity, and the second to the volume of the amino acid residue. The use of these new amino acid descriptors should result in simpler and more readily interpretable models for the enzyme activity (and potentially other functions of interest, e.g., secondary and tertiary structure) of peptides and proteins.
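As a rough illustration of what a two-scale encoding looks like in practice, the sketch below turns a peptide sequence into a numeric feature vector with two values per residue. The table is an assumption made for illustration only: it uses Kyte-Doolittle hydropathy and Zamyatnin residue volumes as stand-ins for a hydrophobicity-like and a volume-like scale, not the physical descriptors from the paper, and it lists only a handful of residues. The names `STAND_IN_SCALES` and `encode_peptide` are hypothetical.

```python
# Stand-in two-scale table: (Kyte-Doolittle hydropathy, Zamyatnin residue volume in cubic angstroms).
# These are NOT the paper's physical descriptors; only a few residues are listed.
STAND_IN_SCALES = {
    "A": (1.8, 88.6),
    "G": (-0.4, 60.1),
    "L": (3.8, 166.7),
    "F": (2.8, 189.9),
    "D": (-3.5, 111.1),
    "K": (-3.9, 168.6),
}


def encode_peptide(sequence, scales=STAND_IN_SCALES):
    """Concatenate the two scale values of each residue, position by position."""
    features = []
    for residue in sequence:
        if residue not in scales:
            raise KeyError(f"no descriptor values for residue {residue!r}")
        features.extend(scales[residue])
    return features


# A dipeptide becomes a 4-element vector: two values for each of its two positions.
dipeptide_features = encode_peptide("AG")
print(dipeptide_features)
```

Vectors of this kind would typically be passed to a regression method to relate sequence to measured activity. With only two values per residue, each fitted coefficient can be read as a hydrophobicity or size effect at a given position.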