Spasov Borislav, Hall Lowell H
Department of Chemistry, Eastern Nazarene College, 23 East Elm Avenue, Quincy, MA 02170, USA.
Chem Biodivers. 2007 Nov;4(11):2528-39. doi: 10.1002/cbdv.200790206.
Topological Structure-Information Representation (SIR) serves as the basis for QSAR model development on two data sets of dipeptides. Data sets for bitter taste (48 compounds) and angiotensin-converting-enzyme (ACE) inhibition (58 compounds) were analyzed by multiple linear regression to produce QSAR models that relate structure to property. For the bitter-taste data set, two whole-molecule descriptors describe the data well: $^{1}\chi^{v}$ (the first-order valence molecular connectivity index) and SHBa (the sum of E-State indices for H-bond acceptors), which together yield $r^{2} = 0.88$, $s = 0.22$. External validation and cross-validation indicate that the model may be predictive. For the ACE-inhibition data set, five variables produced a satisfactory model. Four of the descriptors relate to amino acid side chains: the E-State polarity/non-polarity index Q(v) (for position A, adjacent to the N-terminus; Fig. 1) and the E-State index s(2) (for the backbone position of substitution), along with the square of the molecular connectivity path-four valence index ($^{4}\chi_{PC}$; for side chain B, adjacent to the C-terminus) and the E-State index s(5) (for the attachment point of side chain B; Fig. 1). Together with the E-State whole-molecule descriptor for internal H-bonding (five skeletal bonds; SHBint5), the five variables form a predictive model ($r^{2} = 0.88$, $s = 0.36$). Both external-test and cross-validation-test statistics indicate that the model may be predictive. This study is the first investigation in which E-State descriptors are developed for amino acid side chains.
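The abstract reports each multiple linear regression model by its $r^{2}$ and $s$ and supports it with cross-validation. As a rough illustration of how such statistics are commonly computed, the sketch below fits an ordinary least-squares model and runs a leave-one-out check. It rests on several assumptions: the data are synthetic placeholders (not the dipeptide data or descriptors from the study), $s$ is taken to be the standard error of estimate, and the function names `fit_mlr` and `loo_q2` are hypothetical. The abstract does not specify the cross-validation scheme the authors used.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least squares with an intercept; returns (coef, r2, s)."""
    A = np.column_stack([np.ones(len(y)), X])      # prepend intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    resid = y - A @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    n, p = A.shape                                  # p counts the intercept
    r2 = 1.0 - ss_res / ss_tot
    s = float(np.sqrt(ss_res / (n - p)))            # standard error of estimate (assumed meaning of s)
    return coef, r2, s

def loo_q2(X, y):
    """Leave-one-out cross-validated q2 = 1 - PRESS / SS_tot."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i                    # drop compound i, refit, predict it
        coef, _, _ = fit_mlr(X[keep], y[keep])
        pred = coef[0] + X[i] @ coef[1:]
        press += float((y[i] - pred) ** 2)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    return 1.0 - press / ss_tot

# Synthetic placeholder data: 48 "compounds", two stand-in descriptors.
# These values are invented for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(48, 2))
y = 1.5 + 0.8 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.2, size=48)

coef, r2, s = fit_mlr(X, y)
q2 = loo_q2(X, y)
print(f"r2 = {r2:.2f}, s = {s:.2f}, LOO q2 = {q2:.2f}")
```

A cross-validated $q^{2}$ close to the fitted $r^{2}$ is the usual sign that a model is not simply overfitting; an external test set, as mentioned in the abstract, is a separate and stronger check.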