Shelenkov Andrey, Slavokhotova Anna, Yunusova Mariyam, Kulikov Vladimir, Mikhaylova Yulia, Akimkin Vasiliy
Central Research Institute of Epidemiology, Novogireevskaya str., 3a, Moscow, 111123, Russia.
The National Medical Research Center of Otorhinolaryngology of the Federal Medico-Biological Agency of Russia, Volokolamskoe shosse 30 build. 2, Moscow, 123182, Russia.
BMC Genom Data. 2025 Sep 26;26(1):65. doi: 10.1186/s12863-025-01363-w.
Bacterial infections pose a global health threat across clinical and community settings. Over the past decade, the alarming expansion of antimicrobial resistance (AMR) has progressively narrowed therapeutic options, particularly for healthcare-associated infections. This critical situation has been formally recognized by the World Health Organization as a major public health concern. Epidemiological studies have demonstrated that the dissemination of AMR is frequently mediated by specific high-risk bacterial lineages, often designated as "global clones" or "clonal complexes." Consequently, surveillance of these epidemic clones and elucidation of their pathogenic mechanisms and AMR acquisition pathways have become essential research priorities. The advent of whole genome sequencing has revolutionized these investigations, enabling comprehensive epidemiological tracking and detailed analysis of mobile genetic elements responsible for resistance gene transfer. However, despite the exponential increase in available bacterial genome sequences, significant challenges persist. Current genomic datasets often suffer from uneven representation of clinically relevant strains and inconsistent availability of accompanying metadata. These limitations create substantial obstacles for large-scale comparative studies and hinder effective surveillance efforts.
This database represents a comprehensive genomic analysis of 98,950 Staphylococcus aureus isolates, a high-priority bacterial pathogen of global clinical significance. We provide detailed isolate characterization through several established typing schemes including multilocus sequence typing (MLST), clonal complex (CC) assignments, spa typing results, and core genome MLST (cgMLST) profiles. The dataset also documents the presence of CRISPR-Cas systems in these isolates. Beyond fundamental typing data, our resource incorporates the distribution of antimicrobial resistance determinants, virulence factors, and plasmid replicons. These systematically curated genomic features offer researchers valuable insights into isolate epidemiology, resistance mechanisms, and horizontal gene transfer patterns in this highly concerning pathogen.
This database is freely available under CC BY-NC-SA at https://doi.org/10.5281/zenodo.14833440. The data provided enable researchers to identify optimal reference isolates for various genomic studies, supporting critical investigations into S. aureus epidemiology and antimicrobial resistance evolution. This resource will ultimately inform the development of more effective prevention and control measures against this high-priority pathogen.
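As an illustration of how reference isolates might be shortlisted from such a resource, the sketch below filters a small tab-separated table by clonal complex and by the presence of a resistance gene. The file layout, the column names (isolate_id, mlst_st, clonal_complex, spa_type, amr_genes), the semicolon-separated gene list, and the sample rows are all assumptions made for this example; the actual structure of the Zenodo archive is not described here and may differ.

```python
import csv
import io
from collections import Counter

# Hypothetical excerpt: column names, separators, and rows are assumed for
# illustration and are not taken from the published database.
SAMPLE_TSV = (
    "isolate_id\tmlst_st\tclonal_complex\tspa_type\tamr_genes\n"
    "iso_001\t8\tCC8\tt008\tmecA;blaZ\n"
    "iso_002\t8\tCC8\tt008\tblaZ\n"
    "iso_003\t5\tCC5\tt002\tmecA\n"
    "iso_004\t398\tCC398\tt011\t\n"
)


def load_isolates(handle):
    """Read a tab-separated isolate table into a list of dicts."""
    return list(csv.DictReader(handle, delimiter="\t"))


def pick_reference_candidates(rows, clonal_complex, required_gene=None):
    """Return isolate IDs in a clonal complex, optionally requiring one AMR gene."""
    selected = []
    for row in rows:
        if row["clonal_complex"] != clonal_complex:
            continue
        # Split the assumed semicolon-separated gene list, ignoring empty entries.
        genes = {g for g in row["amr_genes"].split(";") if g}
        if required_gene is not None and required_gene not in genes:
            continue
        selected.append(row["isolate_id"])
    return selected


rows = load_isolates(io.StringIO(SAMPLE_TSV))
# Count isolates per clonal complex to see how lineages are represented.
cc_counts = Counter(row["clonal_complex"] for row in rows)
# Shortlist CC8 isolates that carry mecA in this toy table.
candidates = pick_reference_candidates(rows, "CC8", required_gene="mecA")
print(cc_counts)
print(candidates)
```

With a real export, the in-memory sample would be replaced by an opened file handle, and the column names adjusted to match the downloaded tables.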