Madola Abson, DeWitt Michael, Wenner Jennifer, McNeil Candice Joy
Section on Infectious Diseases, Wake Forest University School of Medicine (https://ror.org/04v8djg66), Winston-Salem, NC, USA.
Department of Biology, Wake Forest University (https://ror.org/0207ad724), Winston-Salem, NC, USA.
Epidemiol Infect. 2025 Aug 15;153:e97. doi: 10.1017/S095026882510040X.
Anonymous online surveys that offer financial incentives are an essential tool for understanding sexual networks and risk factors, including attitudes, sexual behaviors, and practices. However, these surveys are vulnerable to bots that attempt to exploit the incentive. We deployed an in-person, limited-audience survey via QR code at select locations in North Carolina to assess geolocation application use among men who have sex with men and to characterize the role of app use in infection risk and behavior. The survey was unexpectedly posted on a social media platform and went viral. We computed descriptive statistics on repeat responses, free-text length, and demographic consistency. Between August 2022 and March 2023, we received 4,709 responses. Only 13 responses were recorded over a 6-month period before a sharp spike occurred: more than 500 responses were recorded in a single hour and more than 2,000 in a single day. Although free-text responses were often remarkably sophisticated, many multiple-choice responses were internally inconsistent. To protect data quality, all online surveys must incorporate defensive techniques such as response time validation, logic checks, and IP screening. With the rise of large language models, bot attacks that give sophisticated responses to open-ended questions pose a growing threat to the integrity of research studies.
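The defensive techniques named above (response time validation, logic checks, and IP screening), together with a check for bursts in hourly response volume, can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' screening procedure: the field names (`id`, `ip`, `duration_seconds`, `age`, `birth_year`, `submitted_at`), the function `flag_responses`, and every threshold are assumptions chosen for the example.

```python
from collections import Counter
from datetime import datetime

# Hypothetical thresholds; none of these values come from the study.
MIN_SECONDS = 120   # assumed minimum plausible completion time
MAX_PER_IP = 1      # assumed number of responses allowed per IP address
MAX_PER_HOUR = 50   # assumed hourly volume that signals a burst


def flag_responses(responses):
    """Map each suspicious response id to the list of checks it failed.

    Each response is a dict with the hypothetical keys: id, ip,
    duration_seconds, age, birth_year, submitted_at (ISO 8601 string).
    """
    ip_counts = Counter(r["ip"] for r in responses)
    stamps = {r["id"]: datetime.fromisoformat(r["submitted_at"]) for r in responses}
    hour_counts = Counter(t.strftime("%Y-%m-%d %H") for t in stamps.values())

    flags = {}
    for r in responses:
        submitted = stamps[r["id"]]
        reasons = []
        # Response time validation: finished faster than a human plausibly could.
        if r["duration_seconds"] < MIN_SECONDS:
            reasons.append("too_fast")
        # IP screening: more submissions from one address than allowed.
        if ip_counts[r["ip"]] > MAX_PER_IP:
            reasons.append("repeat_ip")
        # Logic check: reported age should agree with birth year within one year.
        if abs((submitted.year - r["birth_year"]) - r["age"]) > 1:
            reasons.append("age_mismatch")
        # Burst check: submitted during an hour with unusually high volume.
        if hour_counts[submitted.strftime("%Y-%m-%d %H")] > MAX_PER_HOUR:
            reasons.append("burst_hour")
        if reasons:
            flags[r["id"]] = reasons
    return flags
```

A flagged response is a candidate for manual review rather than automatic exclusion, since any single check can also catch legitimate participants (for example, people who share a network address).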