Feng Yenan, Chen Songqi, Wang Anqi, Zhao Zhongfu, Chen Cao
National Key Laboratory of Intelligent Tracking and Forecasting for Infectious Diseases, NHC Key Laboratory of Medical Virology and Viral Diseases, National Institute for Viral Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing, China.
Front Public Health. 2024 Nov 20;12:1491623. doi: 10.3389/fpubh.2024.1491623. eCollection 2024.
The COVID-19 pandemic has significantly accelerated the global sharing of pathogen genome sequences. This study aims to describe the global landscape of SARS-CoV-2 genome sharing between 2020 and 2023, with a focus on quantity, timeliness, and quality. The characteristics of genome sharing from China are examined in particular.
SARS-CoV-2 genomes and their associated metadata were obtained from the GISAID database. The genomes were analyzed to evaluate the quantity, timeliness, and quality of sharing across countries/regions. The metadata characteristics of the genomes shared by China in 2023 were examined and compared with China's actual demographic data for 2023.
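The per-country comparison described above can be illustrated with a minimal sketch. The abstract does not define its metrics, so this sketch assumes that quantity is the number of shared genomes, that timeliness is the median lag in days between sample collection and submission, and that a "top X%" position is a country's rank divided by the number of countries. The records, country labels, and function names are hypothetical and are not taken from the study; the quality metric is omitted because the abstract gives no basis for it.

```python
from collections import defaultdict
from datetime import date
from statistics import median

# Hypothetical records: (country, collection_date, submission_date).
# These values are invented for illustration only.
records = [
    ("A", date(2023, 1, 2), date(2023, 1, 12)),
    ("A", date(2023, 1, 5), date(2023, 1, 25)),
    ("B", date(2023, 2, 1), date(2023, 4, 2)),
    ("C", date(2023, 3, 1), date(2023, 3, 6)),
    ("C", date(2023, 3, 3), date(2023, 3, 10)),
    ("C", date(2023, 3, 9), date(2023, 3, 18)),
]


def summarize(records):
    """Return genome count and median collection-to-submission lag per country."""
    lags = defaultdict(list)
    for country, collected, submitted in records:
        lags[country].append((submitted - collected).days)
    return {
        country: {"count": len(days), "median_lag_days": median(days)}
        for country, days in lags.items()
    }


def top_percent(summary, key, reverse):
    """Rank countries on `key`; return rank / number of countries as a percentage.

    Assumed definition of "top X%"; ties are broken by sort order.
    """
    ordered = sorted(summary, key=lambda c: summary[c][key], reverse=reverse)
    n = len(ordered)
    return {country: 100 * (i + 1) / n for i, country in enumerate(ordered)}


summary = summarize(records)
# More genomes ranks higher; a shorter lag ranks higher.
quantity_rank = top_percent(summary, "count", reverse=True)
timeliness_rank = top_percent(summary, "median_lag_days", reverse=False)
```

Under these assumptions, a lower percentage means a better position, which matches how the ranking figures in the results are phrased.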
From 2020 to 2023, European countries consistently maintained high levels of genomic data sharing in terms of quantity, timeliness, and quality. In 2023, China made remarkable improvements in sequence sharing, ranking in the top 3.89% globally for quantity, the top 22.78% for timeliness, and the top 17.78% for quality. Genome sharing from China in 2023 covered all provinces, with Shanghai Municipality contributing the most genomes. Human samples accounted for 99.73% of the shared genomes, and their collection dates showed three distinct peaks. Males accounted for 52.06% and females for 47.94%. Notably, individuals aged 65 and above made up a larger share of the genomes in the GISAID database than of China's overall population in 2023.
The global sharing of SARS-CoV-2 genomes in 2020-2023 showed disparities in quantity, timeliness, and quality. However, China has made significant advances since 2023, achieving comprehensive coverage across provinces, timely dissemination of data, and widespread population monitoring. The strengthening of data-sharing capabilities in countries such as China during the SARS-CoV-2 pandemic will play a crucial role in containing and responding to future pandemics caused by emerging pathogens.