Global genomic similarity and core genome sequence diversity of the genus Streptococcus as a toolkit to identify closely related bacterial species in complex environments.

Author information

Barajas Hugo R, Romero Miguel F, Martínez-Sánchez Shamayim, Alcaraz Luis D

Affiliations

Departamento de Biología Celular, Facultad de Ciencias, Universidad Nacional Autónoma de México, Mexico City, Mexico.

Laboratorio Nacional de Ciencias de la Sostenibilidad, Instituto de Ecología, Universidad Nacional Autónoma de México, Mexico City, Mexico.

Publication information

PeerJ. 2019 Jan 14;6:e6233. doi: 10.7717/peerj.6233. eCollection 2019.

Abstract

BACKGROUND

The genus Streptococcus is relevant to both public health and food safety because of its ability to cause infections. It is well represented (>100 genomes) in publicly available databases. Streptococci are ubiquitous, with isolation sources ranging from human pathogens to dairy products. The genus has traditionally been classified by morphology, serum types, the 16S ribosomal RNA (rRNA) gene, and multi-locus sequence types subject to in-depth comparative genomic analysis.

METHODS

Core and pan-genomes were used to describe the genomic diversity of 108 strains belonging to 16 species. The core genome nucleotide diversity was calculated and compared to phylogenomic distances within the genus Streptococcus. The core genome was also used as a resource to recruit metagenomic fragment reads from streptococci-dominated environments. A conventional 16S rRNA gene phylogeny reconstruction was used as a reference against which to compare the dendrograms obtained from average nucleotide identity (ANI) and the genome similarity score (GSS).
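As an illustration of the core and pan-genome idea, the following is a minimal sketch, assuming each strain has already been reduced to a set of gene-family identifiers. The function name, the strain names, and the family labels are hypothetical and are not taken from the study; ortholog clustering itself is outside the scope of the sketch.

```python
from typing import Dict, Set, Tuple


def core_and_pan_genome(
    families_by_strain: Dict[str, Set[str]]
) -> Tuple[Set[str], Set[str]]:
    """Return (core, pan) sets of gene-family identifiers.

    core: families present in every strain (set intersection)
    pan:  families present in at least one strain (set union)
    """
    if not families_by_strain:
        return set(), set()
    family_sets = list(families_by_strain.values())
    core = set.intersection(*family_sets)
    pan = set.union(*family_sets)
    return core, pan


# Hypothetical toy input: three strains, five gene families.
toy = {
    "strain_A": {"fam1", "fam2", "fam3"},
    "strain_B": {"fam1", "fam2", "fam4"},
    "strain_C": {"fam1", "fam2", "fam5"},
}
core, pan = core_and_pan_genome(toy)
print(sorted(core))  # families shared by all strains
print(len(pan))      # total number of distinct families
```

In this toy case the core genome holds the two families common to all three strains, while the pan-genome holds all five.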

RESULTS

The core genome, in this work, consists of 404 proteins that are shared by all 108 Streptococcus strains. Across species, the average identity of the pairwise compared core proteins decreases proportionally as GSS scores decrease. The GSS dendrogram recovers most of the clades in the 16S rRNA gene phylogeny while resolving 16S polytomies (unresolved nodes). The GSS is a distance metric that can reflect evolutionary history by comparing orthologous proteins. Additionally, GSS proved to be the most useful metric for genus and species comparisons, where ANI metrics failed due to false positives when comparing different species.
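The average identity of pairwise compared core proteins can be sketched as follows, assuming the protein pairs have already been aligned and are supplied as equal-length strings with "-" marking gaps. The function names and the toy sequences are hypothetical, and skipping gapped columns is one convention among several, not necessarily the one used in the study.

```python
from typing import Iterable, Tuple


def percent_identity(aln_a: str, aln_b: str) -> float:
    """Percent identity over the ungapped columns of a pairwise alignment."""
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for x, y in zip(aln_a, aln_b):
        if x == "-" or y == "-":
            continue  # assumption: columns with a gap are not counted
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def mean_core_identity(aligned_pairs: Iterable[Tuple[str, str]]) -> float:
    """Average percent identity across aligned core-protein pairs."""
    values = [percent_identity(a, b) for a, b in aligned_pairs]
    if not values:
        raise ValueError("no aligned pairs supplied")
    return sum(values) / len(values)


# Hypothetical toy alignments for two core proteins of one genome pair.
pairs = [("MKV-LA", "MKVALG"), ("MKT", "MKT")]
print(mean_core_identity(pairs))
```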

DISCUSSION

Understanding genomic variability and species relatedness is the goal of tools like GSS, which uses the maximum number of pairwise shared orthologous sequences for its calculation. Because it relies on amino acid alignment scores rather than nucleotide alignments, and normalizes by positive matches, it allows long evolutionary distances (above the species level) to be included. Newly sequenced species and strains could easily be placed into GSS dendrograms to infer overall genomic relatedness. The GSS is not restricted to gene features that are conserved in every genome; thus, it reflects the mosaic structure and dynamism of gene acquisition and loss in bacterial genomes.
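The abstract does not state the GSS formula, so the following is only a sketch of a GSS-like score under stated assumptions: each shared ortholog contributes the alignment bit score of the query protein against its ortholog in the other genome, and the sum is normalized by the bit scores of the same query proteins aligned against themselves. The function name, the dictionary layout, and the toy scores are hypothetical, and the published method may differ in detail.

```python
from typing import Dict


def gss_like_score(
    ortholog_bits: Dict[str, float], self_bits: Dict[str, float]
) -> float:
    """Similarity in [0, 1] for one genome pair (assumed normalization).

    ortholog_bits: query gene -> bit score against its ortholog in genome B
    self_bits:     query gene -> bit score of the gene against itself
    Only genes with a detected ortholog (the shared set) enter the sums.
    """
    shared = [g for g in ortholog_bits if g in self_bits]
    total_self = sum(self_bits[g] for g in shared)
    if total_self <= 0:
        raise ValueError("no shared orthologs with positive self scores")
    total_pair = sum(ortholog_bits[g] for g in shared)
    return total_pair / total_self


# Hypothetical toy scores: g3 has no ortholog in the other genome.
ortholog_bits = {"g1": 180.0, "g2": 90.0}
self_bits = {"g1": 200.0, "g2": 100.0, "g3": 150.0}

similarity = gss_like_score(ortholog_bits, self_bits)
distance = 1.0 - similarity  # one simple way to feed a dendrogram
print(similarity, distance)
```

Because the sums run over whichever orthologs a given pair of genomes shares, rather than over a fixed universal gene set, a score of this kind is defined for any genome pair, including distantly related ones.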


