Zhao Kuo, Zhang Huajian, Li Jiaxin, Pan Qifu, Lai Li, Nie Yike, Zhang Zhongfei
School of Intelligent Systems Science and Engineering, Jinan University, Zhuhai 519070, China.
Guangdong International Cooperation Base of Science and Technology for GBA Smart Logistics, Jinan University, Zhuhai 519070, China.
Entropy (Basel). 2024 Jul 7;26(7):579. doi: 10.3390/e26070579.
The rapid evolution of computer technology and social networks has led to massive data generation through interpersonal communications, necessitating improved methods for information mining and relational analysis in areas such as criminal activity. This paper introduces a Social Network Forensic Analysis model that uses network representation learning to identify and analyze key figures within criminal networks, including their leadership structures. The model combines traditional web forensics and community algorithms, draws on concepts such as centrality and similarity measures, and integrates the DeepWalk, LINE, and node2vec algorithms to map criminal networks into vector spaces while preserving the node features and structural information that are crucial for relational analysis. The model refines node relationships through modified random walk sampling, using breadth-first search (BFS) and depth-first search (DFS), and employs a Continuous Bag-of-Words (CBOW) model with Hierarchical Softmax for node vectorization, optimizing the value distribution via a Huffman tree. Hierarchical clustering and distance measures (cosine and Euclidean) are used to identify the key nodes and establish a hierarchy of influence. The findings demonstrate that the model is effective at accurately vectorizing nodes, improving the precision of inter-node relationships, and optimizing clustering, thereby advancing the tools available for combating complex criminal networks.
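The abstract does not spell out how the random walk sampling was modified, so the following is only a minimal sketch of the standard node2vec-style second-order walk, which interpolates between BFS-like and DFS-like exploration. The toy graph, the function name `biased_walk`, and the parameter values are hypothetical and chosen for illustration; they are not taken from the paper.

```python
import random

def biased_walk(adj, start, length, p=1.0, q=1.0, rng=None):
    """Second-order random walk in the style of node2vec (illustrative only).

    adj: dict mapping each node to the set of its neighbours (undirected graph).
    p:   return parameter; a small p makes stepping back to the previous node likely.
    q:   in-out parameter; q > 1 keeps the walk local (BFS-like),
         q < 1 pushes it outward (DFS-like).
    """
    rng = rng or random.Random()
    walk = [start]
    while len(walk) < length:
        cur = walk[-1]
        nbrs = sorted(adj[cur])
        if not nbrs:
            break  # isolated node: nothing to sample
        if len(walk) == 1:
            # The first step has no previous node, so sample uniformly.
            walk.append(rng.choice(nbrs))
            continue
        prev = walk[-2]
        weights = []
        for nxt in nbrs:
            if nxt == prev:
                weights.append(1.0 / p)   # return to the previous node
            elif nxt in adj[prev]:
                weights.append(1.0)       # stay at distance 1 from prev
            else:
                weights.append(1.0 / q)   # move to distance 2 from prev
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk

# Hypothetical toy network; every node has at least two neighbours.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"),
         ("d", "e"), ("e", "f"), ("d", "f")]
adj = {}
for u, v in edges:
    adj.setdefault(u, set()).add(v)
    adj.setdefault(v, set()).add(u)

rng = random.Random(0)
bfs_like = biased_walk(adj, "a", 6, p=1.0, q=4.0, rng=rng)   # tends to stay local
dfs_like = biased_walk(adj, "a", 6, p=1.0, q=0.25, rng=rng)  # tends to move outward
print(bfs_like, dfs_like)
```

Walks of this kind would typically be fed as "sentences" to a word-embedding model such as CBOW to obtain node vectors; that step is omitted here.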
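To illustrate the clustering step named in the abstract, here is a small sketch of agglomerative (hierarchical) clustering over node vectors using cosine or Euclidean distance. The linkage criterion is not stated in the abstract, so average linkage is an assumption here, and the two-dimensional embeddings and node names are made up for the example.

```python
import math

def cosine_distance(u, v):
    """1 minus the cosine similarity of two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)

def euclidean_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def agglomerate(vectors, n_clusters, dist):
    """Average-linkage agglomerative clustering (assumed linkage, illustrative).

    vectors: dict mapping node name to its embedding (a tuple of floats).
    Repeatedly merges the two clusters with the smallest mean pairwise
    distance until only n_clusters remain.
    """
    clusters = [[name] for name in sorted(vectors)]
    while len(clusters) > n_clusters:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [dist(vectors[a], vectors[b])
                         for a in clusters[i] for b in clusters[j]]
                d = sum(pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters

# Hypothetical 2-D node embeddings, not data from the paper.
embeddings = {
    "A": (1.0, 0.1), "B": (0.9, 0.2), "E": (0.8, 0.1),
    "C": (0.1, 1.0), "D": (0.2, 0.9),
}
by_cosine = agglomerate(embeddings, 2, cosine_distance)
by_euclid = agglomerate(embeddings, 2, euclidean_distance)
print(by_cosine, by_euclid)
```

On this toy input both distance measures separate the nodes into the same two groups; on real embeddings the two measures can disagree, since cosine distance ignores vector length while Euclidean distance does not.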