Department of Civil and Environmental Engineering, Case Western Reserve University, Cleveland, Ohio 44106, United States.
Department of Civil, Architectural, and Environmental Engineering, Missouri University of Science and Technology, Rolla, Missouri 65409, United States.
Environ Sci Technol. 2021 Oct 5;55(19):12741-12754. doi: 10.1021/acs.est.1c01339. Epub 2021 Aug 17.
The rapid increase in both the quantity and complexity of data generated daily in the field of environmental science and engineering (ESE) demands a corresponding advance in data analytics. Advanced data analysis approaches, such as machine learning (ML), have become indispensable tools for revealing hidden patterns or deducing correlations that conventional analytical methods struggle to capture. However, ML concepts and practices have not been widely adopted by researchers in ESE. This feature explores the potential of ML to revolutionize data analysis and modeling in the ESE field and covers the essential knowledge needed for such applications. First, we use five examples to illustrate how ML addresses complex ESE problems. We then summarize four major types of ML applications in ESE: making predictions, extracting feature importance, detecting anomalies, and discovering new materials or chemicals. Next, we introduce the essential knowledge required for, and the current shortcomings of, ML applications in ESE, focusing on three important but often overlooked components of applying ML: correct model development, proper model interpretation, and sound applicability analysis. Finally, we discuss challenges and future opportunities in applying ML tools in ESE to highlight the potential of ML in this field.