Gunawan Agus Yodi, Kresnowati Made Tri Ari Penia
Telkom University, School of Electrical Engineering, Department of Telecommunication Engineering, Jl. Telekomunikasi No.1 Dayeuh Kolot, 40257 Kabupaten Bandung, Jawa Barat, Indonesia.
Institut Teknologi Bandung, Faculty of Mathematics and Natural Sciences, Industrial and Financial Mathematics Research Group, Jl. Ganesha 10 Bandung 40132, Indonesia.
Heliyon. 2022 Jun 9;8(6):e09715. doi: 10.1016/j.heliyon.2022.e09715. eCollection 2022 Jun.
In metabolomics studies, independent analyses, or replicate measurements of metabolite concentrations, are often performed to guard against errors. At the same time, dataset sizes keep increasing. For clustering purposes, chemically representative information must be obtained from these independent analyses. The objective of this study is to develop a data reduction method that yields a dataset representing the chemical information. A proper data reduction method would, overall, simplify the clustering of metabolite data. We propose the modified Weiszfeld algorithm (MWA) to reduce independent analyses. For a comprehensive assessment, we compare MWA with other well-known reduction methods, including PCA, CMDS, LE, and LLE. The reduced datasets are then clustered using the fuzzy c-means (FCM) algorithm, with the Tang Sun Sun (TSS) index and the silhouette index as cluster validity indices. The results show that MWA, like PCA, yields the optimal number of clusters, namely four. This agrees with the optimal number of clusters before dimensionality reduction. The results indicate that MWA is robust for dimensionality reduction of independent analyses while preserving chemical information in the reduced dataset. We therefore recommend MWA as a reliable chemometric technique; the present finding enriches the chemometric techniques available for metabolomics studies.
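To illustrate the kind of reduction the abstract describes, the following is a minimal sketch of the classic Weiszfeld iteration, which computes the geometric median of a set of points. It is not the authors' modified algorithm, whose details the abstract does not give. The function name `geometric_median`, its parameters, and the replicate values are hypothetical and chosen only for illustration.

```python
import numpy as np

def geometric_median(points, tol=1e-8, max_iter=1000):
    """Classic Weiszfeld iteration for the geometric median of row vectors."""
    points = np.asarray(points, dtype=float)
    estimate = points.mean(axis=0)  # start from the centroid
    for _ in range(max_iter):
        distances = np.linalg.norm(points - estimate, axis=1)
        # Guard against division by zero when the estimate lands on a data point.
        weights = 1.0 / np.maximum(distances, 1e-12)
        new_estimate = (weights[:, None] * points).sum(axis=0) / weights.sum()
        if np.linalg.norm(new_estimate - estimate) < tol:
            return new_estimate
        estimate = new_estimate
    return estimate

# Hypothetical data: three independent analyses of one sample, four metabolites.
replicates = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [1.1, 2.1, 2.9, 4.2],
    [5.0, 2.0, 3.1, 3.9],  # first value is an outlier
])

# Collapse the three replicate rows into one representative row.
representative = geometric_median(replicates)
print(representative)
```

Because each point is weighted by the inverse of its distance to the current estimate, an outlying replicate pulls the result far less than it would pull an arithmetic mean, which is one plausible reason to prefer a median-type reduction for replicate measurements.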