Castillo-Garit Juan A, Marrero-Ponce Yovani, Escobar Jeanette, Torrens Francisco, Rotondo Richard
Applied Chemistry Research Center, Central University of Las Villas, Santa Clara, 54830, Villa Clara, Cuba.
Chemosphere. 2008 Sep;73(3):415-27. doi: 10.1016/j.chemosphere.2008.05.024. Epub 2008 Jul 1.
The main aim of the study was to develop quantitative structure-activity relationship (QSAR) models for predicting aquatic toxicity using atom-based non-stochastic and stochastic linear indices. The dataset consists of 392 benzene derivatives, split into training and test sets, for which toxicity data against the ciliate Tetrahymena pyriformis were available. Using multiple linear regression, two statistically significant QSAR models were obtained, one with non-stochastic (R2=0.791, s=0.344) and one with stochastic (R2=0.799, s=0.343) linear indices. Leave-one-out (LOO) cross-validation gave q2=0.781 (scv=0.348) and q2=0.786 (scv=0.350), respectively. Validation on an external test set yielded Rpred2 values of 0.762 and 0.797, respectively. The influence of statistical outliers on QSAR model development was also briefly studied. Finally, our method was compared with other approaches implemented in the Dragon software and achieved better results. The non-stochastic and stochastic linear indices appear to provide an interesting alternative to costly and time-consuming experiments for determining toxicity.
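The validation statistics quoted in the abstract (R2, s, LOO q2, and Rpred2) can be illustrated with a minimal sketch. This is not the authors' code: the data below are synthetic, the descriptor matrix stands in for the linear indices, all function names are hypothetical, and the Rpred2 formula (test-set residuals scaled by deviations from the training-set mean) is one common convention that the abstract does not specify.

```python
import numpy as np

def fit_mlr(X, y):
    """Ordinary least-squares fit; returns coefficients, intercept first."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef

def predict(coef, X):
    """Apply a fitted linear model to a descriptor matrix."""
    return coef[0] + X @ coef[1:]

def r_squared(y, y_hat):
    """Coefficient of determination on the fitted data."""
    return 1.0 - np.sum((y - y_hat) ** 2) / np.sum((y - y.mean()) ** 2)

def std_error(y, y_hat, n_descriptors):
    """Standard error of the estimate, s, with n - p - 1 degrees of freedom."""
    return np.sqrt(np.sum((y - y_hat) ** 2) / (len(y) - n_descriptors - 1))

def loo_q2(X, y):
    """Leave-one-out q2: refit without each compound, then predict it."""
    n = len(y)
    press = 0.0
    for i in range(n):
        keep = np.arange(n) != i
        coef = fit_mlr(X[keep], y[keep])
        press += (y[i] - predict(coef, X[i:i + 1])[0]) ** 2
    return 1.0 - press / np.sum((y - y.mean()) ** 2)

def r2_pred(y_test, y_test_hat, y_train_mean):
    """External-set R2 (assumed convention: scaled by the training mean)."""
    ss_res = np.sum((y_test - y_test_hat) ** 2)
    return 1.0 - ss_res / np.sum((y_test - y_train_mean) ** 2)

# Synthetic stand-in data: 60 "compounds", 3 made-up descriptors.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 3))
y = 1.0 + X @ np.array([0.8, -0.5, 0.3]) + rng.normal(scale=0.3, size=60)

# Hypothetical split into a training set and an external test set.
X_train, y_train = X[:45], y[:45]
X_test, y_test = X[45:], y[45:]

coef = fit_mlr(X_train, y_train)
r2 = r_squared(y_train, predict(coef, X_train))
s = std_error(y_train, predict(coef, X_train), X_train.shape[1])
q2 = loo_q2(X_train, y_train)
r2p = r2_pred(y_test, predict(coef, X_test), y_train.mean())

print(f"R2={r2:.3f} s={s:.3f} q2={q2:.3f} R2pred={r2p:.3f}")
```

For an ordinary least-squares model, q2 computed this way never exceeds R2, because each left-out residual is at least as large in magnitude as the corresponding fitted residual. A small gap between the two, as in the values reported above, suggests the model is not overfitted to individual compounds.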