
Defining reference sequences for Nocardia species by similarity and clustering analyses of 16S rRNA gene sequence data.

Affiliation

Sydney Medical School, The University of Sydney, Sydney, New South Wales, Australia.

Publication information

PLoS One. 2011;6(6):e19517. doi: 10.1371/journal.pone.0019517. Epub 2011 Jun 8.

DOI:10.1371/journal.pone.0019517
PMID:21687706
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3110597/
Abstract

BACKGROUND

The intra- and inter-species genetic diversity of bacteria and the absence of 'reference' sequences, that is, the most representative sequences of individual species, present a significant challenge for sequence-based identification. The aims of this study were to determine the utility and compare the performance of several clustering and classification algorithms in identifying the species of 364 16S rRNA gene sequences with a defined species in GenBank and 110 16S rRNA gene sequences with no defined species, all within the genus Nocardia.

METHODS

A total of 364 16S rRNA gene sequences of Nocardia species were studied. In addition, 110 16S rRNA gene sequences assigned only to the Nocardia genus level at the time of submission to GenBank were used for machine learning classification experiments. Different clustering algorithms were compared with a novel algorithm, linear mapping (LM) of the distance matrix. Principal components analysis was used for dimensionality reduction and visualization.
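The methods operate on a distance matrix between sequences. The abstract does not state which distance measure was used, so the following is only a minimal sketch that assumes pre-aligned sequences and a simple p-distance (the fraction of differing positions, ignoring gap columns). The function names and the toy alignment are hypothetical.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Fraction of differing positions between two aligned sequences.

    Columns where either sequence has a gap ('-') are skipped.
    This measure is an assumption; the abstract does not name one.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = differing = 0
    for x, y in zip(a, b):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x != y:
            differing += 1
    return differing / compared if compared else 0.0


def distance_matrix(seqs: list[str]) -> list[list[float]]:
    """Symmetric matrix of pairwise p-distances with a zero diagonal."""
    n = len(seqs)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        d[i][j] = d[j][i] = p_distance(seqs[i], seqs[j])
    return d


# Toy alignment, not real Nocardia data.
toy_alignment = ["ACGTACGT", "ACGTACGA", "TCGAACGT"]
D = distance_matrix(toy_alignment)
```

A matrix of this shape is the usual input to distance-based clustering and to nearest-neighbour classification.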

RESULTS

The LM algorithm achieved the highest performance and classified the set of 364 16S rRNA sequences into 80 clusters, the majority of which (83.52%) corresponded with the original species. The most representative 16S rRNA sequences for individual Nocardia species were identified as the 'centroids' of their respective clusters, that is, the sequences in each cluster from which the distances to all other sequences were minimized. The 110 16S rRNA gene sequences with identifications recorded only at the genus level were classified using machine learning methods. Simple kNN (k-nearest neighbours) machine learning demonstrated the highest performance and classified Nocardia species sequences with an accuracy of 92.7% and a mean frequency of 0.578.
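The two ideas in the results, a cluster 'centroid' chosen to minimize distances to the other sequences and kNN classification of unlabelled sequences, can be illustrated with a short sketch. This is not the authors' LM algorithm, which the abstract does not describe. It assumes clusters are already known and that a precomputed distance matrix is available; the matrix values, labels, and function names are made up for illustration.

```python
from collections import Counter


def medoid(members, dist):
    """Member index whose summed distance to the cluster's members is smallest.

    Ties resolve to the first member listed.
    """
    return min(members, key=lambda i: sum(dist[i][j] for j in members))


def knn_predict(query_dists, labels, k=3):
    """Majority label among the k labelled sequences closest to the query."""
    nearest = sorted(range(len(labels)), key=lambda i: query_dists[i])[:k]
    votes = Counter(labels[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical symmetric distance matrix for five labelled sequences.
dist = [
    [0.00, 0.01, 0.02, 0.10, 0.11],
    [0.01, 0.00, 0.01, 0.09, 0.10],
    [0.02, 0.01, 0.00, 0.10, 0.12],
    [0.10, 0.09, 0.10, 0.00, 0.01],
    [0.11, 0.10, 0.12, 0.01, 0.00],
]
labels = ["species_A", "species_A", "species_A", "species_B", "species_B"]
clusters = {"species_A": [0, 1, 2], "species_B": [3, 4]}

# Representative ("centroid") sequence index for each cluster.
centroids = {name: medoid(members, dist) for name, members in clusters.items()}

# Distances from one unlabelled query sequence to the five labelled ones.
query_dists = [0.09, 0.10, 0.11, 0.01, 0.02]
predicted = knn_predict(query_dists, labels, k=3)
```

Choosing an actual member sequence (a medoid) rather than an averaged point keeps the representative a real, retrievable sequence, which matches the abstract's description of centroids as existing sequences.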

CONCLUSION

The identification of centroids of 16S rRNA gene sequence clusters using novel distance matrix clustering enables the identification of the most representative sequences for each individual species of Nocardia and allows the quantitation of inter- and intra-species variability.
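One way to read "quantitation of inter- and intra-species variability" is in terms of distances to and between centroids. The sketch below assumes that reading: intra-species variability is taken as the mean distance from a cluster's centroid to its other members, and inter-species variability as the distance between two centroids. These definitions, the toy matrix, and the centroid indices are assumptions for illustration and are not taken from the paper.

```python
from itertools import combinations
from statistics import mean


def intra_species_spread(members, centroid, dist):
    """Mean distance from a cluster's centroid to its other members."""
    others = [j for j in members if j != centroid]
    return mean(dist[centroid][j] for j in others) if others else 0.0


def inter_species_distances(centroids, dist):
    """Distance between the centroids of every pair of clusters."""
    return {
        (a, b): dist[centroids[a]][centroids[b]]
        for a, b in combinations(sorted(centroids), 2)
    }


# Hypothetical distance matrix, cluster membership, and centroid indices.
dist = [
    [0.00, 0.01, 0.02, 0.10, 0.11],
    [0.01, 0.00, 0.01, 0.09, 0.10],
    [0.02, 0.01, 0.00, 0.10, 0.12],
    [0.10, 0.09, 0.10, 0.00, 0.01],
    [0.11, 0.10, 0.12, 0.01, 0.00],
]
clusters = {"species_A": [0, 1, 2], "species_B": [3, 4]}
centroids = {"species_A": 1, "species_B": 3}

intra = {
    name: intra_species_spread(members, centroids[name], dist)
    for name, members in clusters.items()
}
inter = inter_species_distances(centroids, dist)
```

In this toy case the between-centroid distance is several times larger than either within-cluster spread, which is the pattern expected when species are well separated.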


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7828/3110597/34afffdecdb9/pone.0019517.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7828/3110597/0ee70b6b9a36/pone.0019517.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7828/3110597/1b5d0be0f69d/pone.0019517.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7828/3110597/de3c62129212/pone.0019517.g003.jpg
