Meilisa Mira, Otok Bambang Widjanarko, Purnomo Jerry Dwi Trijoyo
Department of Statistics, Faculty of Science and Data Analytics, Institut Teknologi Sepuluh Nopember, Surabaya 60111, Indonesia.
Muhammadiyah University of Sumatera Barat, West Sumatera, Indonesia.
MethodsX. 2024 May 21;12:102775. doi: 10.1016/j.mex.2024.102775. eCollection 2024 Jun.
This article proposes a new method for clustering data. The method combines the multivariate adaptive regression spline biresponse continuous (MARSBC) model with the fuzzy clustering means (FCM) approach; the result is called the multivariate adaptive biresponse fuzzy clustering means regression splines (MABFCMRS) model. It uses patterns obtained from the MARSBC model to separate the data into groups, thereby capturing unobserved heterogeneity that previous models did not account for. Unlike classic fuzzy clustering methods, which use Euclidean distances to determine the weight of each object, this method uses the total squared residual distance generated by the MARSBC model. A theoretical study was conducted to derive estimates of the MABFCMRS model parameters. The method was then applied to stunting and wasting cases in Southeast Sulawesi province. For these data, the best clusters were selected using the partition coefficient (PC) and modified partition coefficient (MPC) criteria. The results show that clustering with the MABFCMRS model improves the generalized cross-validation (GCV) values and the coefficients of determination.
• This paper presents a new, effective method for clustering data based on unobserved heterogeneity.
• The method is applicable to sizable samples with 3 to 20 predictors.
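The idea of replacing the Euclidean distance in fuzzy clustering with a squared residual from a regression model can be illustrated with a minimal sketch. The code below is not the authors' MABFCMRS implementation: an ordinary weighted linear fit stands in for the MARSBC model, the membership update is the standard fuzzy c-means rule applied to squared residuals, and all function names (`update_memberships`, `partition_coefficient`, `modified_partition_coefficient`, `fit_fuzzy_clusterwise`) and the fuzzifier `m = 2` are assumptions made for illustration. The PC and MPC functions follow their usual textbook definitions.

```python
import numpy as np

def update_memberships(sq_resid, m=2.0, eps=1e-12):
    """Standard fuzzy c-means membership update, with squared residuals
    (n observations x c clusters) used in place of squared Euclidean distances."""
    d = np.maximum(sq_resid, eps)          # guard against division by zero
    inv = d ** (-1.0 / (m - 1.0))
    return inv / inv.sum(axis=1, keepdims=True)

def partition_coefficient(u):
    """PC = (1/n) * sum_i sum_k u_ik^2; ranges from 1/c (fuzzy) to 1 (crisp)."""
    return float(np.mean(np.sum(u ** 2, axis=1)))

def modified_partition_coefficient(u):
    """MPC rescales PC to the range [0, 1]: MPC = 1 - c/(c-1) * (1 - PC)."""
    c = u.shape[1]
    return 1.0 - c / (c - 1.0) * (1.0 - partition_coefficient(u))

def fit_fuzzy_clusterwise(x, y, c=2, m=2.0, n_iter=50, seed=0):
    """Alternate between per-cluster weighted fits and membership updates.

    x: (n, p) predictors; y: (n, 2) biresponse matrix.
    ASSUMPTION: a weighted linear least-squares fit is a stand-in for MARSBC.
    """
    rng = np.random.default_rng(seed)
    n = x.shape[0]
    design = np.column_stack([np.ones(n), x])
    u = rng.dirichlet(np.ones(c), size=n)  # random initial memberships
    for _ in range(n_iter):
        sq_resid = np.empty((n, c))
        for k in range(c):
            w = np.sqrt(u[:, k] ** m)[:, None]   # sqrt of the fuzzy weights
            beta, *_ = np.linalg.lstsq(w * design, w * y, rcond=None)
            # total squared residual over both responses
            sq_resid[:, k] = np.sum((y - design @ beta) ** 2, axis=1)
        u = update_memberships(sq_resid, m)
    return u
```

Under these assumptions, one would run `fit_fuzzy_clusterwise` for several candidate values of `c` and compare the resulting `partition_coefficient` and `modified_partition_coefficient` values, preferring the number of clusters with the highest scores.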