Interdisciplinary Center for Scientific Computing (IWR), University of Heidelberg, Heidelberg, Germany.
PLoS One. 2012;7(4):e34740. doi: 10.1371/journal.pone.0034740. Epub 2012 Apr 6.
Members of social network platforms often choose to reveal private information, and thus sacrifice some of their privacy, in exchange for the manifold opportunities and amenities offered by such platforms. In this article, we show that the seemingly innocuous combination of two kinds of knowledge, confirmed contacts between members on the one hand and members' email contacts with non-members on the other, provides enough information to deduce a substantial proportion of relationships between non-members. Using machine learning, we achieve an area under the receiver operating characteristic curve (AUC) of at least 0.85 for predicting whether two non-members known by the same member are connected, even for conservative estimates of the overall proportion of members and of the proportion of members disclosing their contacts.
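To make the evaluation setting in the abstract concrete, the following is a minimal sketch of how an AUC could be computed for predictions about pairs of non-members known by the same member. It is not the authors' method: the abstract does not name the features or the learning algorithm, so the single score used here (the number of members who list both non-members as contacts) is an assumption chosen for illustration. The contact lists, the ground-truth links, and all function names are hypothetical.

```python
from itertools import combinations


def candidate_pairs(contacts):
    """Return all pairs of non-members that appear together in at least
    one member's contact list (i.e. are known by the same member)."""
    pairs = set()
    for known in contacts.values():
        for a, b in combinations(sorted(known), 2):
            pairs.add((a, b))
    return sorted(pairs)


def common_member_score(contacts, a, b):
    """Illustrative score (an assumption, not the paper's feature set):
    the number of members whose contact lists contain both a and b."""
    return sum(1 for known in contacts.values() if a in known and b in known)


def auc(scores, labels):
    """AUC as the probability that a randomly chosen connected pair gets a
    higher score than a randomly chosen unconnected pair; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: members m1..m4 disclose email contacts
# to non-members w, x, y, z.
contacts = {
    "m1": {"x", "y", "z"},
    "m2": {"x", "y"},
    "m3": {"y", "z", "w"},
    "m4": {"x", "w"},
}

# Hypothetical ground truth: which non-member pairs are really connected.
true_links = {("x", "y"), ("w", "z")}

pairs = candidate_pairs(contacts)
scores = [common_member_score(contacts, a, b) for a, b in pairs]
labels = [pair in true_links for pair in pairs]
toy_auc = auc(scores, labels)
print(f"AUC on toy data: {toy_auc}")
```

An AUC of 0.5 corresponds to random guessing and 1.0 to a perfect ranking, so the reported value of at least 0.85 means that a connected pair of non-members is ranked above an unconnected pair in at least 85% of such comparisons.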