Hanson Karla L, Marshall Grace A, Graham Meredith L, Villarreal Deyaun L, Volpe Leah C, Seguin-Fowler Rebecca A
Department of Public and Ecosystem Health, Cornell University, Ithaca, NY 14853, USA.
Institute for Advancing Health Through Agriculture, Texas A&M AgriLife Research, Dallas, TX 75252, USA.
Methods Protoc. 2024 Nov 9;7(6):93. doi: 10.3390/mps7060093.
Using the internet to recruit participants into research trials is effective but can attract high numbers of fraudulent attempts, particularly via social media. We drew upon the previous literature to rigorously identify and remove fraudulent attempts when recruiting rural residents into a community-based health improvement intervention trial. Our objectives herein were to describe our dynamic process for identifying fraudulent attempts, quantify the fraudulent attempts identified by each action, and make recommendations for minimizing fraudulent responses. The analysis was descriptive. Validation methods occurred in four phases: (1) recruitment and screening for eligibility and validation; (2) investigative periods requiring greater scrutiny; (3) baseline data cleaning; and (4) validation during the first annual follow-up survey. A total of 19,665 attempts to enroll were recorded, 74.4% of which were considered fraudulent. Automated checks for IP addresses outside study areas (22.1%) and reCAPTCHA screening (10.1%) efficiently identified many fraudulent attempts. Active investigative procedures identified the most fraudulent cases (33.7%) but required time-consuming interaction between researchers and individuals attempting to enroll. Some automated validation was overly zealous: 32.1% of all consented individuals who provided an invalid birthdate at follow-up were actively contacted by researchers and could verify or correct their birthdate. We anticipate fraudulent responses will grow increasingly nuanced and adaptive given recent advances in generative artificial intelligence. Researchers will need to balance automated and active validation techniques adapted to the topic of interest, population being recruited, and acceptable participant burden.
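The automated checks described above (flagging IP addresses outside the study areas and invalid or implausible birthdates) can be sketched as a simple screening function. This is a minimal illustration, not the authors' actual pipeline: the field names, the study-state set, and the age bounds are all hypothetical assumptions for the sake of the example.

```python
from datetime import date

# Hypothetical study-area states; the actual study areas are not named here.
STUDY_STATES = {"NY", "TX"}

def flag_attempt(attempt, today=date(2024, 1, 1)):
    """Return a list of fraud flags for one enrollment attempt.

    `attempt` is a dict with illustrative keys:
      'ip_state'  - US state inferred from the IP address via geolocation
      'birthdate' - self-reported birthdate as an ISO 8601 string
    These names are assumptions, not the authors' schema.
    """
    flags = []

    # Automated check 1: IP address geolocated outside the study areas.
    if attempt.get("ip_state") not in STUDY_STATES:
        flags.append("ip_outside_study_area")

    # Automated check 2: birthdate must parse and imply a plausible adult age.
    try:
        birthdate = date.fromisoformat(attempt.get("birthdate", ""))
        age_years = (today - birthdate).days // 365
        if not 18 <= age_years <= 110:
            flags.append("implausible_birthdate")
    except ValueError:
        flags.append("invalid_birthdate")

    return flags
```

As the abstract notes, purely automated rules like these can be overly zealous, so flagged attempts may still warrant active follow-up by researchers rather than outright rejection.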