Key Laboratory of Industrial Ecology and Environmental Engineering (MOE), School of Environmental Science and Technology, Dalian University of Technology, Linggong Road 2, Dalian 116024, China.
Sci Total Environ. 2020 Jul 10;725:138455. doi: 10.1016/j.scitotenv.2020.138455. Epub 2020 Apr 3.
Predicting plant cuticle-water partition coefficients (K) and understanding the partition mechanisms are crucial for assessing the environmental fate and risk of organic pollutants. To date, experimental K values have been determined for only hundreds of compounds because of the high experimental cost. Computational models that predict K values from chemical structures are therefore a promising way to evaluate new compounds. In this study, a large dataset of 279 log K values for 125 unique compounds was collected and curated. A poly-parameter linear free energy relationship (pp-LFER) model was developed from this dataset by stepwise multiple linear regression. The resulting pp-LFER model shows good predictability and robustness, as indicated by a determination coefficient (R²) of 0.93, a bootstrapping coefficient (Q²) of 0.92, an external validation coefficient (Q²) of 0.94, and a root mean square error of 0.52 log units. Contribution analysis of the different interactions indicated that dispersion and hydrophobic interactions make the largest positive contribution (56%) to the partitioning of pollutants onto plant cuticles. In addition, for organic pollutants containing a benzene ring (13-31%), a double bond (9-17%), or a nitrogen-containing heterocycle (9-17%), π/n-electron pair interactions make clear positive contributions to log K. In conclusion, the proposed pp-LFER model is useful for predicting log K of potential organic pollutants directly from their molecular structures.
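The workflow summarized in the abstract (fit a linear model on solute descriptors, then report R², an external Q², the RMSE, and per-term contributions) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the descriptor names (Abraham-type E, S, A, B, V), the synthetic data and coefficients, the 100/25 split, and the definition of "contribution" as the share of the mean absolute term magnitude. The stepwise descriptor selection used in the study is omitted, and a plain least-squares fit on all descriptors is used instead.

```python
import numpy as np

# Hypothetical descriptor set; the abstract does not list the descriptors used.
DESCRIPTORS = ["E", "S", "A", "B", "V"]

def fit_pp_lfer(X, y):
    """Least-squares fit of log K = c + sum(coef_i * descriptor_i)."""
    design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coef  # coef[0] is the intercept c

def predict(coef, X):
    return coef[0] + X @ coef[1:]

def r_squared(y_true, y_pred, reference_mean):
    """1 - SS_res / SS_tot, with SS_tot taken around reference_mean."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - reference_mean) ** 2)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))

# Synthetic data only (NOT the paper's dataset): 125 compounds, 5 descriptors.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 2.0, size=(125, len(DESCRIPTORS)))
true_coef = np.array([0.1, 0.5, -0.3, -0.8, -3.0, 3.5])  # made-up values
y = true_coef[0] + X @ true_coef[1:] + rng.normal(0.0, 0.3, size=125)

# Assumed split: 100 training compounds, 25 held out for external validation.
X_train, y_train = X[:100], y[:100]
X_ext, y_ext = X[100:], y[100:]

coef = fit_pp_lfer(X_train, y_train)
r2 = r_squared(y_train, predict(coef, X_train), y_train.mean())
rmse_train = rmse(y_train, predict(coef, X_train))
# External Q²: prediction errors on held-out data, relative to the training mean.
q2_ext = r_squared(y_ext, predict(coef, X_ext), y_train.mean())

# One possible contribution measure (an assumption, not the paper's definition):
# share of each term's mean absolute magnitude |coef_i * mean(x_i)|.
magnitudes = np.abs(coef[1:] * X_train.mean(axis=0))
contrib = {name: 100.0 * m / magnitudes.sum()
           for name, m in zip(DESCRIPTORS, magnitudes)}

print(f"R2={r2:.2f}  Q2_ext={q2_ext:.2f}  RMSE={rmse_train:.2f}")
print({name: round(value, 1) for name, value in contrib.items()})
```

With real data, `X` would hold the descriptor values of the curated compounds and `y` the measured log K values. A bootstrapping Q² could be obtained in the same spirit by refitting on resampled training sets and scoring each refit on the compounds left out of that resample.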