Banihashemi Sepideh, Veksler Keren, Abhari Abdolreza
Department of Computer Science, Toronto Metropolitan University, Toronto, ON, Canada.
Simulation. 2025 Jun;101(6):681-701. doi: 10.1177/00375497241298962. Epub 2024 Dec 6.
Analyzing social media networks is crucial for uncovering and understanding the common interests and characteristics of users within human societies. In this context, we simulated a simple form of human interaction in social networks, in which users follow others based on text similarity. We then investigated the effects of the machine learning (ML) algorithms that such applications use to generate recommendations for decision-making users. We developed a novel agent-based social network simulator, called distributed system and multinode processing, to assess the parallelization of these ML algorithms (K-means clustering, cosine similarity, support vector machine, and multilayer perceptron) with bag-of-words (BoW) term frequency-inverse document frequency (TF-IDF) vectorization, evaluating their performance when executed in parallel across distributed heterogeneous resources. In addition, the simulator compares the effects of BoW and the Doc2Vec model on network structure by observing differences in the detected communities and the resulting network graphs when a selected user follows the recommendations produced by a given algorithm. Three real datasets were used in the experiments: Twitter, Scientific Research Papers, and Retail. The contribution of this work is a unique in-house agent-based simulator for analyzing the impact of common ML algorithms, both supervised and unsupervised, on social networks.
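The follow-by-text-similarity interaction described in the abstract can be illustrated with a minimal sketch. It is not the authors' simulator: the function names (`tfidf_vectors`, `cosine`, `recommend`), the toy posts, the whitespace tokenizer, and the smoothed IDF formula are all assumptions made for illustration, and the code runs on a single machine with no parallel or distributed execution.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (dict: term -> weight) per document."""
    # Assumption: naive lowercase whitespace tokenization.
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1.
        vectors.append({
            term: (count / len(tokens))
            * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def recommend(user, posts, top_k=2):
    """Rank the other users by text similarity to `user`, best first."""
    users = list(posts)
    vectors = dict(zip(users, tfidf_vectors([posts[u] for u in users])))
    scores = [
        (other, cosine(vectors[user], vectors[other]))
        for other in users
        if other != user
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Hypothetical toy data: one short text per user.
posts = {
    "alice": "deep learning models for image classification",
    "bob": "image classification with deep neural networks",
    "carol": "best pasta recipes for a quick dinner",
    "dave": "training deep learning models on large datasets",
}

print(recommend("alice", posts))
```

In this sketch, a "follow" decision would add an edge from the selected user to one of the returned users. Swapping the vectorizer (for example, for a Doc2Vec embedding) while keeping `recommend` unchanged is one way to compare how different text representations alter the resulting graph.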