QSAR Lab, Trzy Lipy 3, 80-172, Gdańsk, Poland.
Chemosphere. 2023 Nov;340:139965. doi: 10.1016/j.chemosphere.2023.139965. Epub 2023 Aug 24.
This work aimed to verify whether the applicability domain (AD) of existing QSPR (Quantitative Structure-Property Relationship) models can be extended by a strategy that involves additional quantum-chemical calculations. As case studies, we selected two published QSPR models for PFAS: one for water solubility (logS) and one for vapor pressure (logVP). To enlarge the set of compounds used to build each model, we applied factorial planning to select additional compounds based on their structural features (descriptors). Next, we used the COSMO-RS model to calculate logS and logVP for the selected chemicals, which filled gaps in the experimental data and provided additional training data for the QSPR models. We improved the published models by significantly extending the number of compounds for which theoretical predictions are reliable (i.e., extending the AD). Additionally, we performed external validation, which had not been carried out for the original models. To test the effectiveness of the AD extension, we screened 4519 PFAS from the NORMAN database. For both properties, the number of compounds outside the domain was reduced compared with the original model. Our work shows that combining physics-based methods with data-driven models can significantly improve the prediction of physicochemical properties relevant to chemical risk assessment.
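The idea of extending an AD by adding compounds chosen from a factorial plan can be illustrated with a small, purely synthetic sketch. It assumes a leverage-based AD with the common warning threshold h* = 3(p + 1)/n, which is a widespread choice in QSAR/QSPR work but is not stated in the abstract as the authors' method. The descriptor values, the 3-level factorial plan, and all function names below are hypothetical and do not come from the paper.

```python
import itertools
import numpy as np


def leverages(X_train, X_query):
    """Leverage of each query compound relative to the training descriptors."""
    # Prepend an intercept column, as in an ordinary least-squares model.
    A = np.column_stack([np.ones(len(X_train)), X_train])
    Q = np.column_stack([np.ones(len(X_query)), X_query])
    core = np.linalg.pinv(A.T @ A)
    # h_i = q_i^T (A^T A)^-1 q_i for every query row q_i
    return np.einsum("ij,jk,ik->i", Q, core, Q)


def warning_leverage(X_train):
    """Common warning threshold h* = 3(p + 1)/n (assumed, not from the paper)."""
    n, p = X_train.shape
    return 3.0 * (p + 1) / n


def count_outside(X_train, X_query):
    """Number of query compounds whose leverage exceeds the threshold."""
    h = leverages(X_train, X_query)
    return int(np.sum(h > warning_leverage(X_train)))


def grid(levels):
    """Full factorial combination of the given levels for two descriptors."""
    return np.array(list(itertools.product(levels, repeat=2)), dtype=float)


# Hypothetical original training set: descriptors cover only a narrow range.
X_original = grid([0.0, 0.5, 1.0])

# Hypothetical screening set spanning a much wider descriptor range.
X_screen = grid([0.0, 1.0, 2.0, 3.0, 4.0])

# Hypothetical 3-level full factorial plan over the wider range. In the
# workflow described above, property values for such compounds would be
# computed (e.g. with COSMO-RS) before retraining the model.
X_augmented = np.vstack([X_original, grid([0.0, 2.0, 4.0])])

outside_before = count_outside(X_original, X_screen)
outside_after = count_outside(X_augmented, X_screen)
print(f"outside AD before: {outside_before} of {len(X_screen)}")
print(f"outside AD after:  {outside_after} of {len(X_screen)}")
```

Note that the threshold shrinks as the training set grows, so adding compounds only helps when they spread the training descriptors over the region being screened, which is what a factorial plan is meant to achieve.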