Kostygina Ganna, Kim Yoonsang, Seeskin Zachary, LeClere Felicia, Emery Sherry
NORC at the University of Chicago, USA.
Soc Media Soc. 2023 Oct-Dec;9(4). doi: 10.1177/20563051231216947. Epub 2023 Dec 18.
Social media dominate today's information ecosystem and provide valuable information for social research. Market researchers, social scientists, policymakers, government entities, public health researchers, and practitioners recognize the potential of social data to inspire innovation, support products and services, characterize public opinion, and guide decisions. The appeal of mining these rich datasets is clear. However, the risk of data misuse is real, and it points to an equally large and fundamental flaw in this research: there are no procedural standards and little transparency. Transparency across the processes of collecting and analyzing social media data is often limited by proprietary algorithms. Spurious findings and biases introduced by artificial intelligence (AI) demonstrate the challenges this lack of transparency poses for research. Social media research remains a virtual "wild west," with no clear standards for reporting data retrieval, preprocessing steps, analytic methods, or interpretation. Using emerging generative AI technologies to augment social media analytics can undermine the validity and replicability of findings, potentially turning this research into a "black box" enterprise. Clear guidance for social media analyses and reporting is needed to assure the quality of the resulting research. In this article, we propose criteria, grounded in established scientific practice, for evaluating the quality of studies that use social media data. We offer clear documentation guidelines to ensure that social data are used properly and transparently in research and applications, and we propose a checklist of disclosure elements that meet minimal reporting standards. These criteria will allow scholars and practitioners to assess the quality, credibility, and comparability of research findings based on digital data.