Olteanu Alexandra, Castillo Carlos, Diaz Fernando, Kıcıman Emre
Microsoft Research, New York, NY, United States.
Microsoft Research, Montreal, QC, Canada.
Front Big Data. 2019 Jul 11;2:13. doi: 10.3389/fdata.2019.00013. eCollection 2019.
Social data in digital form (including user-generated content, expressed or implicit relations between people, and behavioral traces) are at the core of popular applications and platforms, driving the research agenda of many researchers. The promises of social data are many, including understanding "what the world thinks" about a social issue, brand, celebrity, or other entity, as well as enabling better decision-making in a variety of fields including public policy, healthcare, and economics. Many academics and practitioners have warned against the naïve usage of social data. There are biases and inaccuracies occurring at the source of the data, but also introduced during processing. There are methodological limitations and pitfalls, as well as ethical boundaries and unexpected consequences that are often overlooked. This paper recognizes that the rigor with which these issues are addressed by different researchers varies across a wide range. We identify a variety of menaces in the practices around social data use, and organize them in a framework that helps to identify them.