Rudan Igor, Song Peige, Adeloye Davies, Campbell Harry
Centre for Global Health, Usher Institute, The University of Edinburgh, Edinburgh, UK.
Nuffield Department of Primary Care Health Sciences and Green Templeton College, Oxford University, Oxford, UK.
J Glob Health. 2025 Jul 1;15:01004. doi: 10.7189/jogh.15.01004.
In recent years, global access to 'big data' repositories that enable 'open research' - such as the UK Biobank, National Health and Nutrition Examination Survey (NHANES), and Global Burden of Disease (GBD) datasets - has created unprecedented opportunities for researchers worldwide to conduct secondary data analyses. This development is particularly beneficial for early-career researchers in low- and middle-income countries (LMICs), as it lets them access large and otherwise costly datasets without the need for local infrastructure, potentially curbing brain drain. However, through our work at the Journal of Global Health (JoGH), we have identified emerging concerns that must be addressed to help preserve the integrity and scientific value of this otherwise positive trend. These include: the risk of 'paper mills' mass-producing superficial papers with questionable authorship practices; duplicate publications, produced either by republishing already available results or by multiple groups testing the same hypothesis using identical datasets and methods without awareness of each other's work; proliferation of false-positive findings due to inadequate adjustment for multiple testing in large datasets; and the inappropriate or undisclosed use of artificial intelligence (AI) tools in generating manuscripts. To counter these issues while continuing to support legitimate and innovative secondary data analyses, JoGH is introducing guidelines for authors submitting such work for consideration and peer review. These guidelines require authors to declare transparently: their previous published work based on similar datasets or hypotheses; the originality of their research question and design in the context of other similar research; their awareness of related published studies using the same dataset; how they addressed multiple testing statistically; and the role of AI, if any, in manuscript preparation or data analysis.
A new, mandatory section in such submitted manuscripts - 'Adherence to JoGH's Guidelines for Reporting Analyses of Big Data Repositories Open to the Public (GRABDROP)' - will summarise these declarations, with full details provided in a supplemental file. This proactive editorial policy aims to safeguard scientific quality while empowering global researchers. By improving transparency and accountability, JoGH seeks to ensure that the benefits of open big data are not undermined by unethical or careless practices. We suggest that other publishers engage in an open discussion on how to address these challenges and consider adopting JoGH's GRABDROP guidelines or similar measures to maintain trust in scientific outputs derived from secondary analyses. Through these steps, JoGH remains committed to fostering reproducible and equitable global health research.
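To illustrate the multiple-testing concern raised above, the following is a minimal sketch of two widely used adjustments: the Bonferroni correction and the Benjamini-Hochberg false discovery rate procedure. It is not taken from the GRABDROP guidelines, which do not prescribe a specific method in this abstract; the function names, the significance level of 0.05, and the example p-values are all assumptions chosen for illustration.

```python
def familywise_error_rate(m, alpha=0.05):
    """Chance of at least one false positive across m independent tests
    when every null hypothesis is true and no adjustment is made."""
    return 1 - (1 - alpha) ** m


def bonferroni(p_values, alpha=0.05):
    """Reject a hypothesis only if its p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values, alpha=0.05):
    """Benjamini-Hochberg step-up procedure controlling the false discovery rate.

    Sort the p-values, find the largest rank k such that
    p_(k) <= (k / m) * alpha, and reject the k smallest p-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    max_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_k = rank

    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_k:
            rejected[idx] = True
    return rejected


# Hypothetical p-values from ten exposure-outcome tests on one dataset.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]

unadjusted_hits = sum(p <= 0.05 for p in example_p)
bonferroni_hits = sum(bonferroni(example_p))
bh_hits = sum(benjamini_hochberg(example_p))
```

With these illustrative values, five tests fall below 0.05 before adjustment, while Bonferroni retains one and Benjamini-Hochberg retains two. Which adjustment is appropriate depends on the study design; the point of the sketch is only that the choice materially changes the reported findings and therefore needs to be declared.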