Drug Theoretics and Cheminformatics Laboratory, Division of Medicinal and Pharmaceutical Chemistry, Department of Pharmaceutical Technology, Jadavpur University, Kolkata, 700 032, India.
Environ Sci Pollut Res Int. 2014 Feb;21(4):2955-65. doi: 10.1007/s11356-013-2247-z. Epub 2013 Oct 30.
Predictive regression-based models for the bioconcentration factor (BCF) were developed using mechanistically interpretable descriptors computed with the open-source tool PaDEL-Descriptor ( http://padel.nus.edu.sg/software/padeldescriptor/ ). A data set of 522 diverse chemicals was used for the modeling study, and extended topochemical atom (ETA) indices developed by the present authors' group were chosen as the descriptors. Because lipophilicity is important in modeling BCF, XLogP (a computed partition coefficient) was also tried as an additional descriptor. Descriptors were selected by genetic function approximation followed by multiple linear regression, and partial least squares analyses were then performed to establish mathematical equations for BCF prediction. The model built from ETA indices alone identifies seven descriptors as important, while the model built from ETA descriptors together with XLogP identifies four. In general, BCF depends on lipophilicity, the presence of heteroatoms and halogens, fused ring systems, hydrogen-bonding groups, etc. The developed models show excellent statistical quality and predictive ability. They were also used to predict an external data set available from the literature, and good-quality predictions (R²pred = 0.812 and 0.826) were demonstrated. Thus, BCF can be predicted using ETA and XLogP descriptors calculated with the open-source PaDEL-Descriptor software in the context of aquatic chemical toxicity management.
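The abstract reports external predictivity as R²pred but does not state the formula. As a minimal sketch, assuming the definition commonly used in QSAR work (one minus the ratio of the squared prediction errors on the external set to the squared deviations of the external observations from the training-set mean), the metric could be computed as below. The function name `r2_pred` and the numbers in the usage lines are hypothetical illustrations, not values from the study.

```python
from statistics import mean


def r2_pred(y_train, y_test_obs, y_test_pred):
    """External predictive R^2 (assumed definition, not confirmed by the abstract).

    R2pred = 1 - sum((obs - pred)^2) / sum((obs - mean(y_train))^2),
    where both sums run over the external (test) compounds.
    """
    if len(y_test_obs) != len(y_test_pred):
        raise ValueError("observed and predicted test values must have equal length")
    if not y_train or not y_test_obs:
        raise ValueError("training and test sets must be non-empty")

    y_train_mean = mean(y_train)
    # Sum of squared prediction errors on the external set.
    press = sum((obs - pred) ** 2 for obs, pred in zip(y_test_obs, y_test_pred))
    # Squared deviations of external observations from the TRAINING-set mean.
    ss_ref = sum((obs - y_train_mean) ** 2 for obs in y_test_obs)
    if ss_ref == 0:
        raise ValueError("external observations all equal the training mean")
    return 1.0 - press / ss_ref


# Hypothetical log BCF values, for illustration only.
train_obs = [1.0, 2.0, 3.0]
test_obs = [1.0, 3.0, 4.0]
test_pred = [1.5, 2.5, 4.0]
score = r2_pred(train_obs, test_obs, test_pred)
```

Using the training-set mean as the reference makes the score a comparison against a model that simply predicts the average training response for every external compound, so a value near 1 indicates that the fitted equation adds real predictive information.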