Development and implementation of a core genome multilocus sequence typing scheme for .

Affiliations

Nuffield Department of Medicine, Centre for Tropical Medicine and Global Health, University of Oxford, Oxford, UK.

Department of Biology, University of Oxford, Oxford, UK.

Publication information

Microb Genom. 2024 Aug;10(8). doi: 10.1099/mgen.0.001281.

Abstract

This species is part of the human nasopharyngeal microbiota and a pathogen causing invasive disease. The extensive genetic diversity observed in the species necessitates discriminatory analytical approaches to evaluate its population structure. This study developed a core genome multilocus sequence typing (cgMLST) scheme for it using pangenome analysis tools and validated the scheme using datasets consisting of complete reference genomes (n = 14) and high-quality draft genomes (n = 2297). The draft genome dataset was divided into a development dataset (n = 921) and a validation dataset (n = 1376). The development dataset was used to identify potential core genes, and the validation dataset was used to refine the final core gene list to ensure the reliability of the proposed cgMLST scheme. Functional classifications were made for all the resulting core genes. Phylogenetic analyses were performed using both allelic profiles and nucleotide sequence alignments of the core genome to test congruence, as assessed by Spearman's correlation and ordinary least squares linear regression tests. Preliminary analyses using the development dataset identified 1067 core genes, which were refined to 1037 with the validation dataset. More than 70% of core genes were predicted to encode proteins essential for metabolism or genetic information processing. Phylogenetic and statistical analyses indicated that the core genome allelic profile accurately represented phylogenetic relatedness among the isolates (reported coefficient = 0.945). We used this cgMLST scheme to define a high-resolution population structure for the species, which enhances the genomic analysis of this clinically relevant human pathogen.
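The congruence test described in the abstract compares two kinds of pairwise distance between isolates: one from cgMLST allelic profiles and one from aligned core genome nucleotide sequences. The following is a minimal sketch of that idea, not the authors' pipeline. The isolate names, allele numbers, and sequences are hypothetical toy data, and skipping loci that are missing in either profile is an assumption, since the abstract does not say how missing loci were handled.

```python
from itertools import combinations


def allelic_distance(profile_a, profile_b):
    """Fraction of loci whose allele numbers differ.

    Loci recorded as None (missing) in either profile are skipped;
    this handling is an assumption, not taken from the source.
    """
    shared = [(a, b) for a, b in zip(profile_a, profile_b)
              if a is not None and b is not None]
    if not shared:
        raise ValueError("profiles have no loci in common")
    return sum(a != b for a, b in shared) / len(shared)


def nucleotide_distance(seq_a, seq_b):
    """Proportion of mismatched sites between two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(seq_a, seq_b)) / len(seq_a)


def average_ranks(values):
    """1-based ranks; tied values receive the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    sd_x = sum((a - mean_x) ** 2 for a in rx) ** 0.5
    sd_y = sum((b - mean_y) ** 2 for b in ry) ** 0.5
    return cov / (sd_x * sd_y)


# Hypothetical toy data: four core loci and a ten-site alignment.
profiles = {
    "iso1": [1, 1, 1, 1],
    "iso2": [1, 1, 2, 1],
    "iso3": [2, 3, 2, None],  # last locus missing in this draft genome
}
alignment = {
    "iso1": "ACGTACGTAC",
    "iso2": "ACGTACGTAT",
    "iso3": "ACTTAGGTTT",
}

pairs = list(combinations(sorted(profiles), 2))
allelic = [allelic_distance(profiles[a], profiles[b]) for a, b in pairs]
nucleotide = [nucleotide_distance(alignment[a], alignment[b]) for a, b in pairs]
rho = spearman(allelic, nucleotide)
print(pairs, allelic, nucleotide, rho)
```

In this toy example both distance measures rank the three isolate pairs in the same order, so rho equals 1 up to floating-point rounding. On real data, a library routine such as `scipy.stats.spearmanr` would normally replace the hand-written rank correlation.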

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a3f9/11315579/95aa95a72fb9/mgen-10-01281-g001.jpg
