Ayers John W, Caputi Theodore L, Nebeker Camille, Dredze Mark
1 Division of Infectious Disease and Global Public Health, University of California San Diego, School of Medicine, La Jolla, CA, USA.
2 School of Public Health, College of Medicine and Health, University College Cork, Cork, Ireland.
NPJ Digit Med. 2018 Aug 2;1:30. doi: 10.1038/s41746-018-0036-2. eCollection 2018.
To investigate whether participants in social media surveillance studies could be reverse identified, we reviewed all articles published on PubMed in 2015 or 2016 with the words "Twitter" and either "read," "coded," or "content" in the title or abstract. Seventy-two percent (95% CI: 63-80) of articles quoted at least one participant's tweet, and searching for the quoted content led to the participant 84% (95% CI: 74-91) of the time. Twenty-one percent (95% CI: 13-29) of articles disclosed a participant's Twitter username, making the participant immediately identifiable. Only one article reported obtaining consent to disclose identifying information. Institutional review board (IRB) involvement was mentioned in only 40% (95% CI: 31-50) of articles, comprising 17% (95% CI: 10-25) that received IRB approval and 23% (95% CI: 16-32) that were deemed exempt. Biomedical publications routinely include identifiable information by quoting tweets or revealing usernames, which violates ICMJE ethical standards even though such content is scientifically unnecessary. We propose that authors convey aggregate findings without revealing participants' identities, that editors refuse to publish reports that reveal a participant's identity, and that IRBs attend to these privacy issues when reviewing studies involving social media data. Together, these strategies will ensure participants are protected going forward.
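The percentages above are reported with 95% confidence intervals. The abstract does not state which interval method was used or give the underlying article counts, so the following is only a minimal sketch of how such an interval could be computed for a proportion. It assumes the Wilson score method, and the function name `wilson_interval` and the counts in the usage lines are hypothetical, not figures from the study.

```python
import math
from typing import Tuple


def wilson_interval(successes: int, total: int, z: float = 1.96) -> Tuple[float, float]:
    """Return the Wilson score interval for a binomial proportion.

    Assumption: the Wilson method is used here for illustration only;
    the abstract does not say how its intervals were derived.
    z = 1.96 corresponds to an approximate 95% confidence level.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    if not 0 <= successes <= total:
        raise ValueError("successes must lie between 0 and total")

    p = successes / total                      # observed proportion
    denom = 1 + z ** 2 / total                 # shared normalising term
    centre = (p + z ** 2 / (2 * total)) / denom
    half_width = (
        z * math.sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    )
    return centre - half_width, centre + half_width


# Hypothetical counts, chosen only to show the call: 50 of 100 articles.
low, high = wilson_interval(50, 100)
print(f"{50 / 100:.0%} (95% CI: {low:.0%}-{high:.0%})")
```

The Wilson interval is chosen in this sketch because it stays within 0 and 1 and behaves reasonably for small samples. Other methods, such as the exact Clopper-Pearson interval, would give slightly different bounds.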