Moth-Flame Optimization-Bat Optimization: Map-Reduce Framework for Big Data Clustering Using the Moth-Flame Bat Optimization and Sparse Fuzzy C-Means.

Author information

VNRVJIET, Hyderabad, India.

Department of CSE and NSS Coordinator, JNTUA University, Ananthapuramu, India.

Publication information

Big Data. 2020 Jun;8(3):203-217. doi: 10.1089/big.2019.0125. Epub 2020 May 19.

Abstract

Technical advances in big data have made it popular among users for storing, processing, and handling huge data sets. However, clustering such data sets has become a major challenge in big data analysis. Conventional clustering algorithms have relied on scalable solutions to manage huge data sets. This study therefore proposes a technique for big data clustering using the Spark architecture. The technique clusters big data in two steps: feature selection, performed in the initial cluster nodes of the Spark architecture, and clustering. First, the initial cluster nodes read the big data from various distributed systems, and the optimal features are selected and placed in the feature vector by the proposed moth-flame optimization-based bat (MFO-Bat) algorithm, which is designed by integrating the MFO and Bat algorithms. The selected features are then fed to the final cluster nodes of Spark, which use the sparse fuzzy C-means method to perform optimal clustering. The proposed MFO-Bat outperformed other existing methods, with a maximal classification accuracy of 95.806%, a Dice coefficient of 99.181%, and a Jaccard coefficient of 98.376%.
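As a rough illustration of the clustering step, the following is a minimal single-machine sketch of standard fuzzy C-means in Python with NumPy. It is not the sparse variant used in the study, it does not run on Spark, and it omits the MFO-Bat feature selection, none of which the abstract specifies in enough detail to reproduce. The function names, the fuzzifier `m = 2.0`, and the stopping tolerance are assumptions made for this example. The `jaccard` and `dice` helpers compute the usual set-overlap coefficients on binary masks; how the study applies them to cluster assignments is not stated in the abstract.

```python
import numpy as np


def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Standard fuzzy C-means (illustrative; not the paper's sparse variant).

    X has shape (n_samples, n_features). Returns (centers, memberships),
    where memberships has shape (n_samples, n_clusters) and rows sum to 1.
    """
    X = np.asarray(X, dtype=float)
    rng = np.random.default_rng(seed)
    # Random initial memberships, normalized so each row sums to 1.
    U = rng.random((X.shape[0], n_clusters))
    U /= U.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        # Centers are means of the data weighted by memberships ** m.
        W = U ** m
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        # Euclidean distance from every sample to every center.
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        dist = np.maximum(dist, 1e-12)  # avoid division by zero
        # Membership update: u_ij proportional to d_ij ** (-2 / (m - 1)).
        inv = dist ** (-2.0 / (m - 1.0))
        U_new = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(U_new - U).max() < tol
        U = U_new
        if converged:
            break
    return centers, U


def jaccard(a, b):
    """Jaccard coefficient |A and B| / |A or B| of two binary masks."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0


def dice(a, b):
    """Dice coefficient 2 |A and B| / (|A| + |B|) of two binary masks."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / total if total else 1.0


if __name__ == "__main__":
    # Two synthetic, well-separated blobs (made-up data, not from the paper).
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(5.0, 0.1, (20, 2))])
    centers, U = fuzzy_c_means(X, n_clusters=2)
    print(centers)
    print(U.argmax(axis=1))  # hard labels taken from the fuzzy memberships
```

For a single pair of sets, the two overlap measures are tied by Dice = 2J / (1 + J), where J is the Jaccard coefficient, so either one determines the other.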
