Dilmaghani Saharnaz, Brust Matthias R, Piyatumrong Apivadee, Danoy Grégoire, Bouvry Pascal
Interdisciplinary Centre for Security, Reliability, and Trust (SnT), University of Luxembourg, Esch-sur-Alzette, Luxembourg.
National Electronics and Computer Technology Center, A Member of NSTDA, Bangkok, Thailand.
Front Big Data. 2019 Jun 26;2:22. doi: 10.3389/fdata.2019.00022. eCollection 2019.
A collaboration network is defined as a set of individuals who come together to collaborate on a particular task, such as publishing a paper. Analyzing such networks makes it possible to extract knowledge about the structure and patterns of communities. The link definition and the network extraction step have a strong impact on the analysis of collaboration networks. Previous studies model connectivity as a binary problem: a collaboration between two individuals either exists or it does not. However, such data contain a wide variety of features that describe the quality of the interaction, such as the amount each individual contributed. In this paper, we propose a method to extract collaboration networks using the corresponding features in a dataset. We define a collaboration score to quantify the collaboration between collaborators. To validate the proposed method, we use a dataset from a scientific research institute in which researchers are co-authors involved in the production of papers, prototypes, and intellectual properties (IP). We evaluated the networks generated with different thresholds of this score using a set of network analysis metrics such as clustering coefficient, network density, and centrality measures. We further examined the resulting networks with a community detection algorithm to discuss the impact of our model on community detection. The results show that the quality of the communities detected on the extracted collaboration networks can differ significantly depending on the choice of linkage threshold.
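The extraction step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the records in `works`, the function names, and in particular the pairwise score (the sum, over shared works, of the smaller of the two contribution shares) are assumptions made only for illustration. The sketch shows how raising the linkage threshold removes weak links and changes network density and the average clustering coefficient.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical records: each work maps a contributor to a contribution share.
works = [
    {"A": 0.5, "B": 0.3, "C": 0.2},
    {"A": 0.6, "B": 0.4},
    {"C": 0.7, "D": 0.3},
]


def pair_scores(works):
    """Assumed score: sum over shared works of the smaller contribution share."""
    scores = defaultdict(float)
    for shares in works:
        for u, v in combinations(sorted(shares), 2):
            scores[(u, v)] += min(shares[u], shares[v])
    return dict(scores)


def extract_network(scores, threshold):
    """Keep a link only if the pair's score reaches the linkage threshold."""
    return {pair for pair, score in scores.items() if score >= threshold}


def density(nodes, edges):
    """Fraction of possible undirected links that are present."""
    n = len(nodes)
    return 0.0 if n < 2 else 2 * len(edges) / (n * (n - 1))


def average_clustering(nodes, edges):
    """Mean local clustering coefficient; nodes with degree < 2 count as 0."""
    adj = {node: set() for node in nodes}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    total = 0.0
    for node in nodes:
        k = len(adj[node])
        if k < 2:
            continue
        # Count links that exist between pairs of this node's neighbours.
        links = sum(1 for a, b in combinations(sorted(adj[node]), 2) if b in adj[a])
        total += 2 * links / (k * (k - 1))
    return total / len(nodes) if nodes else 0.0


nodes = sorted({name for shares in works for name in shares})
scores = pair_scores(works)

for threshold in (0.1, 0.25, 0.5):
    edges = extract_network(scores, threshold)
    print(threshold, len(edges), density(nodes, edges), average_clustering(nodes, edges))
```

All individuals are kept as nodes at every threshold, so the metrics remain comparable across the extracted networks; dropping isolated nodes instead would be an equally defensible choice and would change the reported values.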