Department of Civil and Environmental Engineering, Seoul National University, Seoul, 08826, South Korea; Institute of Construction and Environmental Engineering, Seoul National University, Seoul, 08826, South Korea.
Division of Urban Planning and Transportation, Seoul Institute, Seoul, 06756, South Korea.
Chemosphere. 2023 Dec;344:140350. doi: 10.1016/j.chemosphere.2023.140350. Epub 2023 Oct 2.
Assessing inorganic arsenate (As(V)) is critical for ensuring a sustainable environment because of its adverse effects on humans and ecosystems. This study is the first attempt to predict the toxicity of As(V) to the bioluminescent bacterium Aliivibrio fischeri under varying As(V) dosages and environmental factors (pH and phosphate concentration) using six machine learning (ML)-guided models. The predicted toxicity values were compared with those from the extended biotic ligand model (BLM) that we previously developed to evaluate the toxic effect of an oxyanion (i.e., As(V)). The relationship between the variables (input features) and toxicity (output) was found to play an important role in the prediction accuracy of each ML-guided model. The extended BLM had the highest prediction accuracy, with a root mean square error (RMSE) of 12.997. However, the multilayer perceptron (MLP) model came close, with an RMSE of 14.361, despite having been trained on a relatively small dataset (n = 256). In terms of simplicity, the MLP model is compatible with the extended BLM and does not require expert knowledge to derive specific parameters, such as binding fraction and binding constant values. Furthermore, as reliable in-situ sensing techniques are developed and deployed, monitoring data are expected to accumulate faster and provide sufficient training data to improve prediction accuracy, which may allow the MLP model to outperform the extended BLM.
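The model comparison above rests on RMSE, the square root of the mean squared difference between observed and predicted toxicity. The following is a minimal sketch of how such a comparison could be computed. It is not the authors' code: the function name `rmse`, the labels `model_a` and `model_b`, and all numeric values are hypothetical and chosen only for illustration.

```python
import math


def rmse(observed, predicted):
    """Root mean square error between paired observations and predictions."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical toxicity values for illustration only; not data from the study.
observed = [10.0, 35.0, 60.0, 85.0]
model_a = [12.0, 33.0, 63.0, 80.0]  # predictions from one hypothetical model
model_b = [15.0, 30.0, 66.0, 78.0]  # predictions from another hypothetical model

# Score each model and pick the one with the lowest RMSE.
scores = {
    "model_a": rmse(observed, model_a),
    "model_b": rmse(observed, model_b),
}
best = min(scores, key=scores.get)
```

A lower RMSE means predictions lie closer to the observations on average, which is the sense in which an RMSE of 12.997 is reported as more accurate than 14.361.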