Georgetown University and Harvard T.H. Chan School of Public Health, Boston, USA.
Georgetown University, Washington, D.C., USA.
Global Health. 2022 Jan 6;18(1):2. doi: 10.1186/s12992-021-00795-0.
The COVID-19 pandemic has led to an avalanche of scientific studies, drawing on many different types of data. However, studies addressing the effectiveness of government actions against COVID-19, especially non-pharmaceutical interventions, often exhibit data problems that threaten the validity of their results. This review is thus intended to help epidemiologists and other researchers identify a set of data issues that, in our view, must be addressed in order for their work to be credible. We further intend to help journal editors and peer reviewers when evaluating studies, to apprise policy-makers, journalists, and other research consumers about the strengths and weaknesses of published studies, and to inform the wider debate about the scientific quality of COVID-19 research.
To this end, we describe common challenges in the collection, reporting, and use of epidemiologic, policy, and other data, including the completeness and representativeness of outcomes data; their comparability over time and among jurisdictions; the adequacy of policy variables and of data on intermediate outcomes such as mobility and mask use; and mismatches between the level of the intervention and the level of the outcome variables. We urge researchers to think critically about potential problems with the COVID-19 data sources over the specific time periods and particular locations they have chosen to analyze, and not only to choose appropriate study designs but also to conduct appropriate checks and sensitivity analyses to investigate the impact of potential threats on study findings.
To encourage high-quality research, we provide recommendations on how to address the issues we identify. Our first recommendation is for researchers to choose an appropriate design (and the data it requires). This review describes the considerations and issues involved in identifying the strongest analytical designs and demonstrates how interrupted time-series and comparative longitudinal studies can be particularly useful. Furthermore, we recommend that researchers check how sensitive their results are to data source and design choices, which we illustrate. Regardless of the approaches taken, researchers should be explicit about the kinds of data problems or other biases that the design choice and sensitivity analyses are addressing.
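As a rough illustration of the interrupted time-series design and of a simple sensitivity check, the sketch below fits a segmented regression to a synthetic outcome series and then refits it under different assumed effective dates for the intervention. Everything in it is an assumption made for illustration: the data are generated, not real; the function names (`solve_normal_equations`, `fit_its`), the policy date, and the lag values are hypothetical; and it is not the analysis used in the review.

```python
# Illustrative sketch only: synthetic data and hypothetical names,
# not the authors' code or any real COVID-19 data source.

def solve_normal_equations(X, y):
    """Ordinary least squares via the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Build the augmented matrix [X'X | X'y].
    A = [[sum(row[i] * row[j] for row in X) for j in range(k)]
         + [sum(row[i] * yi for row, yi in zip(X, y))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]


def fit_its(y, t0):
    """Segmented regression: y_t = b0 + b1*t + b2*post_t + b3*(t - t0)*post_t."""
    X = []
    for t in range(len(y)):
        post = 1.0 if t >= t0 else 0.0
        X.append([1.0, float(t), post, (t - t0) * post])
    b0, b1, b2, b3 = solve_normal_equations(X, y)
    return {"baseline": b0, "pre_slope": b1,
            "level_change": b2, "slope_change": b3}


# Synthetic daily outcome: rising by 2 per day, then an assumed policy at
# day 30 lowers the level by 10 and the slope by 1.5. A small alternating
# wiggle stands in for reporting noise.
TRUE_T0 = 30
cases = []
for t in range(60):
    post = 1 if t >= TRUE_T0 else 0
    wiggle = 0.5 if t % 2 == 0 else -0.5
    cases.append(100 + 2.0 * t - 10.0 * post
                 - 1.5 * (t - TRUE_T0) * post + wiggle)

main_fit = fit_its(cases, TRUE_T0)

# Sensitivity check: shift the assumed effective date of the policy
# (for example, to allow for incubation or reporting lags) and compare.
sensitivity = {lag: fit_its(cases, TRUE_T0 + lag)["slope_change"]
               for lag in (0, 7, 14)}
```

Comparing the entries of `sensitivity` shows how much the estimated slope change depends on the assumed effective date. In a real study the same pattern would be repeated for alternative outcome measures and data sources, with the data problem each check addresses stated explicitly.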