Emerg Infect Dis. 2023 Feb;29(2):242-251. doi: 10.3201/eid2902.221482. Epub 2023 Jan 3.
Genomic data provides useful information for public health practice, particularly when combined with epidemiologic data. However, sampling bias is a concern because inferences drawn from nonrandomly sampled data can be misleading. In March 2021, the Washington State Department of Health, USA, partnered with submitting and sequencing laboratories to establish sentinel surveillance for SARS-CoV-2 genomic data. We analyzed available genomic and epidemiologic data from the presentinel and sentinel periods to assess the representativeness of the data and the timeliness of their availability. Genomic data from the presentinel period was largely unrepresentative of all COVID-19 cases. Data available during the sentinel period was more representative with respect to age, death from COVID-19, outbreak association, long-term care facility-affiliated status, and geographic coverage; timeliness of data availability and captured viral diversity also improved. Hospitalized cases were underrepresented, indicating a need to increase inpatient sampling. Our analysis emphasizes the need to understand and quantify sampling bias in phylogenetic studies and to continue evaluating and improving public health surveillance systems.
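One way to make the idea of representativeness concrete is to compare each group's share among sequenced cases with its share among all reported cases. The sketch below is illustrative only and is not the authors' method: the function names, the `hospitalized` field, and the toy counts are all hypothetical and are not taken from the study. A ratio near 1 suggests proportional sampling, and a ratio below 1 suggests underrepresentation.

```python
from collections import Counter


def stratum_shares(records, key):
    """Return each level's share of the records for the given field."""
    counts = Counter(r[key] for r in records)
    total = sum(counts.values())
    return {level: n / total for level, n in counts.items()}


def representation_ratios(all_cases, sequenced_cases, key):
    """Share among sequenced cases divided by share among all cases."""
    all_shares = stratum_shares(all_cases, key)
    seq_shares = stratum_shares(sequenced_cases, key)
    # Levels absent from the sequenced set get a ratio of 0.
    return {
        level: seq_shares.get(level, 0.0) / share
        for level, share in all_shares.items()
    }


# Toy data (made up for illustration): 10% of all cases are hospitalized,
# but only 5% of sequenced cases are.
all_cases = [{"hospitalized": True}] * 10 + [{"hospitalized": False}] * 90
sequenced_cases = [{"hospitalized": True}] * 1 + [{"hospitalized": False}] * 19

ratios = representation_ratios(all_cases, sequenced_cases, "hospitalized")
for level, ratio in ratios.items():
    print(f"hospitalized={level}: ratio={ratio:.2f}")
```

In this toy example the hospitalized group has a ratio below 1, the kind of pattern that would point toward increasing inpatient sampling. A real assessment would also need uncertainty estimates and the study's actual case definitions.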