Ianni Michele, Masciari Elio, Sperlí Giancarlo
DIMES - Department of Informatics, Modeling, Electronics and Systems, University of Calabria, 87036 Arcavacata, CS Italy.
Department of Electrical and Information Technology (DIETI), University of Naples Federico II, via Claudio 21, 80125 Naples, Italy.
J Intell Inf Syst. 2021;57(1):73-100. doi: 10.1007/s10844-020-00629-2. Epub 2020 Nov 9.
The pervasive diffusion of Social Networks (SN) has produced an unprecedented amount of heterogeneous data. As a result, traditional approaches quickly became impractical for real-life applications because of the intrinsic properties of these data: the large amount of user-generated content (text, video, image and audio), data heterogeneity and a high generation rate. In more detail, the analysis of user-generated data from popular social networks (e.g., Facebook (https://www.facebook.com/), Twitter (https://www.twitter.com/), Instagram (https://www.instagram.com/), LinkedIn (https://www.linkedin.com/)) poses intriguing challenges for both the research and industry communities in the task of analyzing user behavior, user interactions, link evolution, opinion spreading and several other important aspects. This survey focuses on the analyses performed in the last two decades on this kind of data with respect to the dimensions defined for the Big Data paradigm (the so-called Big Data 6 V's).