Istituto di Ricerche Farmacologiche Mario Negri, Laboratory of Environmental Chemistry and Toxicology, Milano, Italy.
J Comput Chem. 2011 Sep;32(12):2727-33. doi: 10.1002/jcc.21848. Epub 2011 Jun 8.
One-variable models of rat toxicity (the negative decimal logarithm of the 50% lethal dose [pLD50], oral exposure) were calculated with the CORAL software (http://www.insilico.eu/coral/) for six random splits of a set of 689 compounds. New global attributes of the simplified molecular input line entry system (SMILES) were examined as a way to improve the optimal SMILES-based descriptors. These global SMILES attributes represent the presence of certain chemical elements and of different kinds of chemical bonds (double, triple, and stereochemical). The "classic" scheme of building quantitative structure-property/activity relationships was compared with the balance of correlations (BC) with ideal slopes. For all six random splits, the best predictions were obtained when both the BC and the global SMILES attributes were included in the modeling process. The average statistical characteristics for the external test set are: n = 119 ± 6.4, R² = 0.7371 ± 0.013, and root mean square error = 0.360 ± 0.037.
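The idea of a global SMILES attribute, a flag recording whether a given element or bond type occurs anywhere in the molecule, can be illustrated with a minimal Python sketch. This is not CORAL's actual encoding: the function name, the element list, and the flag names are hypothetical, and the parser is deliberately naive (it recognizes only Cl and Br as two-letter elements).

```python
import re

# Hypothetical element list for illustration; not CORAL's attribute set.
ELEMENTS = ("N", "O", "S", "P", "F", "Cl", "Br", "I")


def global_smiles_attributes(smiles: str) -> dict:
    """Return presence/absence flags for elements and bond types in a SMILES string."""
    # Match two-letter halogens first so "Cl" is not read as C followed by l.
    # Limitation of this sketch: other two-letter elements (e.g. Si, Na) are misread.
    tokens = re.findall(r"Cl|Br|[A-Za-z]", smiles)
    # Upper-case single letters so that aromatic atoms (n, o, s) count as N, O, S.
    present = {t if len(t) == 2 else t.upper() for t in tokens}

    flags = {f"has_{el}": el in present for el in ELEMENTS}
    flags["has_double_bond"] = "=" in smiles
    flags["has_triple_bond"] = "#" in smiles
    # '@' marks tetrahedral centres; '/' and '\' mark double-bond geometry.
    flags["has_stereo"] = any(ch in smiles for ch in "@/\\")
    return flags


if __name__ == "__main__":
    for s in ("C/C=C/Cl", "CC#N", "c1ccncc1"):
        print(s, global_smiles_attributes(s))
```

In a descriptor-based model, flags of this kind would be added alongside the local SMILES fragments so that molecule-wide features can also contribute to the descriptor.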