KWR Water Research Institute, Groningenhaven 7, 3433 PE Nieuwegein, the Netherlands.
Environ Res. 2022 Sep;212(Pt D):113569. doi: 10.1016/j.envres.2022.113569. Epub 2022 May 27.
Monitoring of microplastics in environmental samples is relevant to the scientific community as well as to environmental agencies and water authorities, particularly given the increasing efforts to reduce emissions and the growing concern of governments and the public. Rapid and accurate detection and identification of microplastics, including their polymers, is therefore crucial despite their degradation in the environment. This degradation has a significant impact on the infrared spectra of the microplastics and can impede the identification process. This work presents a novel approach to the problem of identifying weathered microplastics. A quantum cascade laser-based laser direct infrared (LDIR) system was used to record the infrared spectra of various polymeric particles (81,291 individual particles). Using a combination of pristine and weathered particles, two supervised machine learning (ML) models, namely Subspace k-Nearest Neighbor (Sub-kNN) and Boosted Decision Tree (BDT), were trained to recognize the spectral characteristics of labeled particles and then used to identify unlabeled samples, with identification accuracies of 89.7% and 77.1%, respectively, under 10-fold cross-validation. About 90% of the samples could be identified via the Sub-kNN or BDT models. Subsequently, an unsupervised ML model, namely Density-Based Spatial Clustering of Applications with Noise (DBSCAN), was used to cluster the samples that could not be labeled by the supervised ML models. This enabled the detection of additional subgroups of microplastics. Manual labeling can then be carried out on a selection of spectra per group (e.g., the centroid of each cluster), which accelerates the identification process and allows new labeled samples to be added to the initial supervised ML models. Although expert effort is still needed, the proposed method greatly reduces the labeling effort by combining supervised and unsupervised learning models. In the future, the use of deep neural networks could further boost the implementation of these kinds of approaches for polymer and microplastic identification in environmental settings.
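
The workflow described above (supervised identification, clustering of the spectra that could not be identified, then manual labeling of one representative per cluster) can be illustrated with a minimal pure-Python sketch. It is not the authors' implementation. A plain kNN with a distance-based rejection rule stands in for the Sub-kNN and BDT models, and a medoid (an actual member spectrum) is used as the cluster representative in place of a centroid. The rejection threshold `max_dist`, the DBSCAN parameters `eps` and `min_pts`, and the two-value toy "spectra" are all assumptions made for illustration. Requires Python 3.8 or later for `math.dist`.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3, max_dist=1.0):
    """Majority label of the k nearest training spectra.

    Returns None when the mean distance to those neighbours exceeds
    max_dist (an assumed rule for "could not be identified").
    """
    nearest = sorted((math.dist(x, query), label) for x, label in train)[:k]
    if sum(d for d, _ in nearest) / len(nearest) > max_dist:
        return None
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def dbscan(points, eps, min_pts):
    """Minimal DBSCAN: one cluster id per point, -1 marks noise."""
    labels = [None] * len(points)

    def region(i):
        # Indices of all points within eps of point i (including i itself).
        return [j for j, p in enumerate(points) if math.dist(points[i], p) <= eps]

    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbours = region(i)
        if len(neighbours) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in neighbours if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # noise reached from a core point: border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            reachable = region(j)
            if len(reachable) >= min_pts:  # j is a core point, keep expanding
                queue.extend(reachable)
    return labels


def cluster_medoids(points, labels):
    """For each cluster, the member with the smallest total distance to the others."""
    result = {}
    for c in sorted(set(labels) - {-1}):
        members = [i for i, lab in enumerate(labels) if lab == c]
        best = min(members, key=lambda i: sum(math.dist(points[i], points[j]) for j in members))
        result[c] = points[best]
    return result


# Hypothetical toy data: two-value feature vectors standing in for IR spectra.
train = [((0.0, 0.0), "PE"), ((0.2, 0.1), "PE"), ((0.1, 0.3), "PE"),
         ((5.0, 5.0), "PP"), ((5.2, 4.9), "PP"), ((4.9, 5.1), "PP")]
unlabeled = [(0.1, 0.1), (5.1, 5.0), (10.0, 0.0), (10.4, 0.0), (10.1, 0.0), (20.0, 20.0)]

# Step 1: supervised identification with rejection.
predictions = [knn_predict(train, x) for x in unlabeled]

# Step 2: cluster only the spectra the supervised step could not identify.
rejected = [x for x, p in zip(unlabeled, predictions) if p is None]
cluster_ids = dbscan(rejected, eps=0.5, min_pts=3)

# Step 3: one representative per cluster for an expert to label manually.
representatives = cluster_medoids(rejected, cluster_ids)
```

Once an expert has labeled each representative, the members of that cluster can be appended to `train` with the new label, which mirrors the feedback step into the supervised models described in the abstract.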