Predicting Aqueous Adsorption of Organic Compounds onto Biochars, Carbon Nanotubes, Granular Activated Carbons, and Resins with Machine Learning.

Affiliation

Department of Civil and Environmental Engineering, Case Western Reserve University, Cleveland, Ohio 44106, United States.

Publication information

Environ Sci Technol. 2020 Jun 2;54(11):7008-7018. doi: 10.1021/acs.est.0c02526. Epub 2020 May 20.

Abstract

Predictive models are useful tools for aqueous adsorption research. Existing models such as multilinear regression (MLR), however, can predict adsorption only at specific equilibrium concentrations or for certain adsorption isotherm models. In addition, few studies have looked beyond the choice of modeling algorithm to data processing as a way to improve prediction accuracy. In this research, we employed a cosine similarity approach that mines the available data before model development: it selects the data most relevant to the prediction target for building models, and it was found to considerably improve prediction accuracy. We then built a machine-learning modeling process that combines neural networks (NN), a group-selection data-splitting strategy for adsorption data grouped by adsorbent-adsorbate pair across different equilibrium concentrations, and polyparameter linear free energy relationships (pp-LFERs). The process was applied to the aqueous adsorption of 165 organic compounds onto 50 biochars, 34 carbon nanotubes, 35 granular activated carbons (GACs), and 30 polymeric resins. The final NN-LFER models were successfully applied across various equilibrium concentrations regardless of the adsorption isotherm model and showed smaller prediction deviations than published models, with root-mean-square errors of 0.23-0.31 versus 0.23-0.97 log unit. Predictions were improved by adding two key adsorbent descriptors (BET surface area and pore volume). Finally, interpreting the NN-LFER models with Shapley values suggested that the omission of equilibrium concentration and adsorbent properties from existing MLR models is a possible reason for their higher prediction deviations.
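The abstract gives no implementation details, so the following is only a minimal sketch of three ideas it names: ranking training records by cosine similarity to a prediction target, splitting data so that all records of one adsorbent-adsorbate pair stay on the same side of the train/test split, and scoring with root-mean-square error. The function names (`select_relevant`, `group_split`, `rmse`), the descriptor vectors, and the group labels are hypothetical and chosen for illustration; none of the numbers come from the paper.

```python
import numpy as np


def cosine_similarity(rows, target):
    """Cosine similarity between each row of `rows` (n, d) and `target` (d,)."""
    rows = np.asarray(rows, dtype=float)
    target = np.asarray(target, dtype=float)
    return (rows @ target) / (np.linalg.norm(rows, axis=1) * np.linalg.norm(target))


def select_relevant(rows, target, k):
    """Indices of the k rows most similar to the target, most similar first."""
    sims = cosine_similarity(rows, target)
    return np.argsort(sims)[::-1][:k]


def group_split(groups, test_fraction=0.2, seed=0):
    """Split row indices so that each group falls entirely in train or in test."""
    groups = np.asarray(groups)
    rng = np.random.default_rng(seed)
    unique = rng.permutation(np.unique(groups))
    n_test = max(1, int(round(test_fraction * len(unique))))
    test_mask = np.isin(groups, unique[:n_test])
    return np.flatnonzero(~test_mask), np.flatnonzero(test_mask)


def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    diff = np.asarray(y_true, dtype=float) - np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Toy descriptor vectors (hypothetical, not the paper's pp-LFER descriptors).
descriptors = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 0.1]])
target = np.array([1.0, 0.0])
top = select_relevant(descriptors, target, k=2)

# Each label stands for one adsorbent-adsorbate pair measured at two
# equilibrium concentrations (hypothetical labels).
groups = np.array(["A", "A", "B", "B", "C", "C", "D", "D", "E", "E"])
train_idx, test_idx = group_split(groups, test_fraction=0.2, seed=0)

print("most similar rows:", top)
print("test groups:", sorted(set(groups[test_idx])))
```

Keeping every record of a pair on one side of the split means the test error reflects pairs the model has not seen, rather than new concentrations of pairs it was trained on.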
