Younes Khaled, Antar Mayssara, Chaouk Hamdi, Kharboutly Yahya, Mouhtady Omar, Obeid Emil, Gazo Hanna Eddie, Halwani Jalal, Murshid Nimer
College of Engineering and Technology, American University of the Middle East, Egaila 54200, Kuwait.
Water and Environment Sciences Laboratory, Lebanese University, Tripoli P.O. Box 6573/14, Lebanon.
Gels. 2023 Jun 6;9(6):465. doi: 10.3390/gels9060465.
This study estimates the adsorption potential of three families of aerogels: nanocellulose (NC)-based, chitosan (CS)-based, and graphene oxide (G)-based aerogels. The focus is on their efficiency in removing oil and organic contaminants. To this end, principal component analysis (PCA) was used as a data mining tool. PCA revealed hidden patterns that could not be detected from the conventional two-dimensional perspective. The total variance explained in this study was also higher than in previous findings (an increase of nearly 15%). Different approaches and data pre-treatments led to different PCA results. When the whole dataset was considered, PCA revealed the discrepancy between nanocellulose-based aerogels on the one hand and chitosan-based and graphene-based aerogels on the other. To overcome the bias introduced by outliers and potentially improve representativeness, the individuals were separated into subsets. This approach increased the total variance explained by PCA from 64.02% (whole dataset) to 69.42% (outliers excluded) and 79.82% (outliers only), which indicates both the effectiveness of the approach and the strong bias introduced by the outliers.
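The comparison of total variance across the whole dataset, the outlier-excluded subset, and the outlier-only subset can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are synthetic, the function names `explained_variance_pct` and `outlier_mask` are hypothetical, and the median/MAD outlier rule is an assumption, since the abstract does not state how outliers were identified. "Total variance" is taken here to mean the share of variance captured by the first two principal components of the standardized data.

```python
import numpy as np


def explained_variance_pct(X, n_components=2):
    """Percent of total variance captured by the first n_components PCs."""
    X = np.asarray(X, dtype=float)
    # Standardize each variable so PCA runs on the correlation structure.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Squared singular values are proportional to the component variances.
    variances = np.linalg.svd(Z, compute_uv=False) ** 2
    return 100.0 * variances[:n_components].sum() / variances.sum()


def outlier_mask(X, threshold=3.5):
    """Flag rows where any variable has a robust z-score above threshold.

    Assumed rule (median/MAD); the study's own outlier criterion may differ.
    """
    X = np.asarray(X, dtype=float)
    med = np.median(X, axis=0)
    mad = np.median(np.abs(X - med), axis=0)
    robust_z = 0.6745 * np.abs(X - med) / mad
    return (robust_z > threshold).any(axis=1)


# Synthetic stand-in data: 60 typical samples plus 8 extreme ones, 5 variables.
rng = np.random.default_rng(0)
typical = rng.normal(size=(60, 5))
extreme = rng.normal(loc=8.0, scale=2.0, size=(8, 5))
data = np.vstack([typical, extreme])

mask = outlier_mask(data)
for label, subset in [("whole dataset", data),
                      ("outliers excluded", data[~mask]),
                      ("outliers only", data[mask])]:
    print(f"{label}: {explained_variance_pct(subset):.2f}% on PC1+PC2")
```

Running PCA separately on each subset, as in the loop above, is what allows the explained variance of the three cases to be compared side by side. The percentages printed for the synthetic data have no relation to the values reported in the study.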