Gomez-Flores Allan, Bradford Scott A, Cai Li, Urík Martin, Kim Hyunjung
Department of Earth Resources and Environmental Engineering, Hanyang University, 222 Wangsimni-ro, Seongdong-gu, Seoul 04763, Republic of Korea.
USDA, ARS, Sustainable Agricultural Water Systems Unit, 239 Hopkins Road, Davis, CA 95616, United States.
Water Res. 2023 Feb 1;229:119429. doi: 10.1016/j.watres.2022.119429. Epub 2022 Nov 25.
Colloidal particles can attach to surfaces during transport, but attachment depends on particle size, hydrodynamics, solid and water chemistry, and particulate matter. Filtration theory quantifies attachment through the attachment, or sticking, efficiency (Alpha). A comprehensive Alpha database (2538 records) was built from experiments in the literature and used to develop a machine learning (ML) model to predict Alpha. Training (r-squared: 0.86) was performed using two random forests capable of handling missing data. A holdout dataset was used to validate the trained model (r-squared: 0.98), and variable importance was explored for both training and validation. Finally, an additional validation dataset was built from quartz crystal microbalance experiments using surface-modified polystyrene, poly(methyl methacrylate), and polyethylene, performed in the absence or presence of humic acid. A regression on the full database (r-squared: 0.90) predicted Alpha for the additional validation dataset with an r-squared of only 0.23. However, when the original database and the additional validation dataset were combined into a new database, the r-squared values for both training (0.95) and validation (0.70) increased. The developed ML model provides a data-driven prediction of Alpha over a large database and evaluates the significance of 22 input variables.
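The abstract reports every result as an r-squared value, so a small sketch of that metric may help when reading the numbers. It assumes the standard coefficient of determination, 1 - SS_res / SS_tot; the abstract does not state which variant the authors computed. The function name `r_squared` and the Alpha values below are hypothetical and are not taken from the paper's database.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("observed and predicted must be non-empty and equal in length")
    mean_observed = sum(observed) / len(observed)
    # Residual sum of squares: squared error of the predictions.
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    # Total sum of squares: spread of the observations around their mean.
    ss_tot = sum((o - mean_observed) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("r-squared is undefined when all observed values are equal")
    return 1.0 - ss_res / ss_tot


# Hypothetical attachment efficiencies, for illustration only.
observed_alpha = [0.10, 0.40, 0.70, 1.00]
predicted_alpha = [0.15, 0.35, 0.75, 0.95]
score = r_squared(observed_alpha, predicted_alpha)
print(f"r-squared = {score:.3f}")
```

Under this definition, a value of 1 means the predictions match the observations exactly, and a value of 0 means the model does no better than always predicting the observed mean. A low value such as the 0.23 reported for the additional validation dataset therefore indicates that most of the variance in Alpha was left unexplained.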