Zhou Yunchi, Wang Ying, Peijnenburg Willie, Vijver Martina G, Balraadjsing Surendra, Dong Zhaomin, Zhao Xiaoli, Leung Kenneth M Y, Mortensen Holly M, Wang Zhenyu, Lynch Iseult, Afantitis Antreas, Mu Yunsong, Wu Fengchang, Fan Wenhong
School of Materials Science and Engineering, Beihang University, Beijing 100191, China.
Ecole Centrale de Pékin/School of General Engineering, Beihang University, Beijing 100191, China.
Environ Sci Technol. 2024 Aug 7. doi: 10.1021/acs.est.4c03328.
The massive production and application of nanomaterials (NMs) have raised concerns about their potential adverse effects on human health and the environment. Evaluating these effects by laboratory methods is expensive and time-consuming, and often fails to keep pace with the invention of new materials. Therefore, methods that use machine learning techniques to predict the toxicity potential of NMs are a promising alternative, provided that regulatory confidence in them can be enhanced. Previous reviews and OECD regulatory guidance documents have discussed in detail how to build a predictive model for NMs. Nevertheless, there is still room for improvement in enhancing model representativeness and performance from different angles, such as data set curation, descriptor selection, task type (classification/regression), algorithm choice, and model evaluation (internal and external validation, applicability domain, and mechanistic interpretation, which is key to ensuring stakeholder confidence). This review explores how to build better predictive models: the current state of the art is analyzed via a statistical evaluation of the literature, and the challenges faced and future perspectives are summarized. Moreover, a recommended workflow and best practices are provided to help in developing more predictive, reliable, and interpretable models that can assist risk assessment as well as safe-by-design development of NMs.