Increasing Rigor in Online Health Surveys Through the Reduction of Fraudulent Data.

Author information

Ng Wen Zhi, Erdembileg Sundarimaa, Liu Jean C J, Tucker Joseph D, Tan Rayner Kay Jin

Affiliations

Saw Swee Hock School of Public Health, National University of Singapore, National University Health System, 12 Science Drive 2, #10-01, Singapore, 117549, Singapore, 65 91878576.

Yale-NUS College, National University of Singapore, Singapore, Singapore.

Publication information

J Med Internet Res. 2025 Aug 21;27:e68092. doi: 10.2196/68092.

Abstract

Online surveys have become a key tool of modern health research, offering a fast, cost-effective, and convenient means of data collection. They enable researchers to access diverse populations, such as those underrepresented in traditional studies, and facilitate the collection of data on stigmatized or sensitive behaviors through greater anonymity. However, the ease of participation also introduces significant challenges, particularly around data integrity and rigor. As fraudulent responses (whether from bots, repeat responders, or individuals misrepresenting themselves) become more sophisticated and pervasive, ensuring the rigor of online surveys has never been more crucial. This article provides a comprehensive synthesis of practical strategies that help to increase the rigor of online surveys through the detection and removal of fraudulent data. Drawing on recent literature and case studies, we outline several options that address the full research cycle, from strategies applied before data collection to validation after data collection. We emphasize the integration of automated screening techniques (eg, CAPTCHAs and honeypot questions) and attention checks (eg, trap questions) for purposeful survey design. Robust recruitment procedures (eg, concealed eligibility criteria and 2-stage screening) and a proper incentive or compensation structure can also help to deter fraudulent participation. We examine the merits and limitations of different sampling methodologies, including river sampling, online panels, and crowdsourcing platforms, offering guidance on how to select samples based on specific research objectives. For the stage after data collection, we discuss metadata-based techniques to detect fraudulent data (eg, duplicate email or IP addresses, response time analysis), alongside methods to better screen for low-quality responses (eg, inconsistent response patterns and improbable qualitative responses).
The escalating sophistication of fraud tactics, particularly with the growth of artificial intelligence (AI), demands that researchers continuously adapt and stay vigilant. We propose the use of dynamic protocols, combining multiple strategies into a multipronged approach that can better filter for fraudulent data and evolve depending on the type of responses received across the data collection process. However, there is still significant room for these strategies to develop, and their development should be a key focus for upcoming research. As online surveys become increasingly integral to health research, investing in robust strategies to screen for fraudulent data and increase the rigor of studies is key to upholding scientific integrity.
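
As an illustration only, the metadata-based checks named in the abstract (duplicate email or IP addresses, response time analysis) could be sketched as follows. This is not the authors' procedure: the record fields (`id`, `email`, `ip`, `seconds`), the function name `flag_suspect_responses`, and the one-third-of-median speed cutoff are all assumptions made for the example. Shared IP addresses can be legitimate (households, campuses), so the output is a list of reasons for manual review rather than an exclusion rule.

```python
from collections import Counter
from statistics import median


def flag_suspect_responses(responses, min_fraction_of_median=0.33):
    """Return {response_id: [reasons]} for responses that warrant manual review.

    Hypothetical helper: the field names and the default cutoff are
    illustrative assumptions, not values taken from the article.
    """
    # Normalize emails so trivial case or whitespace variants count as duplicates.
    email_counts = Counter(r["email"].strip().lower() for r in responses)
    ip_counts = Counter(r["ip"] for r in responses)

    # Response time analysis: flag completions far faster than the typical one.
    cutoff = median(r["seconds"] for r in responses) * min_fraction_of_median

    flags = {}
    for r in responses:
        reasons = []
        if email_counts[r["email"].strip().lower()] > 1:
            reasons.append("duplicate_email")
        if ip_counts[r["ip"]] > 1:
            reasons.append("duplicate_ip")
        if r["seconds"] < cutoff:
            reasons.append("too_fast")
        if reasons:
            flags[r["id"]] = reasons
    return flags


# Made-up sample records (documentation-range IPs, example.org addresses).
sample = [
    {"id": 1, "email": "a@example.org", "ip": "203.0.113.5", "seconds": 600},
    {"id": 2, "email": "A@example.org ", "ip": "203.0.113.9", "seconds": 580},
    {"id": 3, "email": "b@example.org", "ip": "203.0.113.7", "seconds": 45},
    {"id": 4, "email": "c@example.org", "ip": "203.0.113.8", "seconds": 620},
]
flags = flag_suspect_responses(sample)
```

In a multipronged protocol of the kind the abstract proposes, flags like these would be combined with attention checks and screening of qualitative responses before any response is excluded.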

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9377/12370263/c9691998ab43/jmir-v27-e68092-g001.jpg
