Chung Jihoon, Zhang Junru, Saimon Amirul Islam, Liu Yang, Johnson Blake N, Kong Zhenyu
Department of Industrial Engineering, Pusan National University, Busan, South Korea.
Grado Department of Industrial and Systems Engineering, Virginia Tech, Blacksburg, VA, USA.
Sci Rep. 2024 Jun 9;14(1):13230. doi: 10.1038/s41598-024-63285-4.
Spectroscopic techniques generate one-dimensional spectra with distinct peaks and specific widths in the frequency domain. These features act as unique signatures of material characteristics. Deep neural networks (DNNs) have recently been considered a powerful tool for automatically categorizing experimental spectral data by supervised classification to evaluate material characteristics. However, most existing work assumes that the training data are balanced across classes, whereas spectral data from actual experiments are usually imbalanced. Imbalanced training data degrade supervised classification performance, hindering understanding of phase behavior, specifically the sol-gel transition (gelation) of soft materials and glycomaterials. To address this issue, this paper applies a novel data augmentation method based on a generative adversarial network (GAN) that the authors proposed in prior work. To demonstrate the effectiveness of the method, actual imbalanced spectral data from Pluronic F-127 hydrogel and alpha-cyclodextrin hydrogel are used for phase classification. On average, our approach improves on existing data augmentation methods by 8.8%, 6.4%, and 6.2% in the classifier's F-score, precision, and recall, respectively. The method consists of three DNNs: a generator, a discriminator, and a classifier. It generates samples that are not only authentic but also emphasize the differences between material characteristics, providing balanced training data and improving the classification results. Based on these validated results, we expect the method to find broader application in addressing imbalanced measurement data across diverse domains in materials science and chemical engineering.
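To make the imbalance problem and the reported metrics concrete, the following is a minimal Python sketch, not the authors' implementation. The class labels `"sol"` and `"gel"`, the sample counts, and the helper names `augmentation_counts` and `precision_recall_f1` are all hypothetical, chosen only for illustration. It shows how many synthetic spectra a generator would need to supply per class to balance a training set, and why precision, recall, and F-score expose a failure on the minority class that plain accuracy hides.

```python
from collections import Counter


def augmentation_counts(labels):
    """Synthetic samples each class needs to match the largest class.

    Hypothetical helper: the abstract does not state the balancing target.
    """
    counts = Counter(labels)
    target = max(counts.values())
    return {cls: target - n for cls, n in counts.items()}


def precision_recall_f1(y_true, y_pred, positive):
    """Precision, recall, and F-score for one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Assumed toy data: 8 "sol" spectra and 2 "gel" spectra (imbalanced).
y_true = ["sol"] * 8 + ["gel"] * 2
# A degenerate classifier that always predicts the majority class.
y_pred = ["sol"] * 10

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
gel_metrics = precision_recall_f1(y_true, y_pred, positive="gel")
needed = augmentation_counts(y_true)

print(accuracy)     # looks acceptable despite never detecting "gel"
print(gel_metrics)  # minority-class precision, recall, F-score
print(needed)       # synthetic samples required per class
```

In this toy case the always-majority classifier reaches high accuracy while its precision, recall, and F-score on the minority class are all zero, which is why the abstract reports those three metrics rather than accuracy alone.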