Kerkhof Lee J, Roth Pierce A, Deshpande Samir V, Bernhards R Cory, Liem Alvin T, Hill Jessica M, Häggblom Max M, Webster Nicole S, Ibironke Olufunmilola, Mirzoyan Seda, Polashock James J, Sullivan Raymond F
Department of Marine and Coastal Sciences, Rutgers, The State University of New Jersey, New Brunswick, NJ 08901-8521, USA.
DCS Corp, 4696 Millennium Drive, Suite 450, Belcamp, MD 21017, USA.
FEMS Microbes. 2022 Jan 27;3:xtac002. doi: 10.1093/femsmc/xtac002. eCollection 2022.
Current methods to characterize microbial communities generally employ sequencing of the 16S rRNA gene (<500 bp) with high accuracy (∼99%) but limited phylogenetic resolution. However, long-read sequencing now allows for the profiling of near-full-length ribosomal operons (16S-ITS-23S rRNA genes) on platforms such as the Oxford Nanopore MinION. Here, we describe an rRNA operon database with >300,000 entries, representing >10,000 prokaryotic species and ∼150,000 strains. Additionally, BLAST parameters were identified for strain-level resolution using mutated, mock rRNA operon sequences (70-95% identity) from four bacterial phyla and two members of the Euryarchaeota, mimicking MinION reads. MegaBLAST settings were determined that required <3 s per read on a Mac Mini with strain-level resolution for sequences with >84% identity. These settings were tested on rRNA operon libraries from the human respiratory tract, farm/forest soils and marine sponges (n = 1,322,818 reads for all sample sets). Most rRNA operon reads in this data set yielded best BLAST hits (95 ± 8%). However, only 38-82% of library reads were compatible with strain-level resolution, reflecting the dominance of human/biomedical-associated prokaryotic entries in the database. Since the MinION and the Mac Mini are both portable, this study demonstrates the possibility of rapid strain-level microbiome analysis in the field.
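As an illustration of the mock-read idea mentioned above, the following minimal Python sketch degrades a reference sequence to a chosen percent identity. It is not the authors' procedure: it assumes substitution-only errors (real MinION reads also contain insertions and deletions), uses a random 4,000-base string as a stand-in for an rRNA operon, and the function names mutate_to_identity and percent_identity are hypothetical.

```python
import random

BASES = "ACGT"

def mutate_to_identity(seq, identity, rng):
    """Return a copy of seq with substitutions at random positions so that
    the fraction of unchanged positions is approximately `identity`.
    Assumption: substitutions only, no insertions or deletions."""
    n_mut = round(len(seq) * (1.0 - identity))
    positions = rng.sample(range(len(seq)), n_mut)
    out = list(seq)
    for pos in positions:
        # Choose a base different from the original so each edit is a true mismatch.
        out[pos] = rng.choice([b for b in BASES if b != seq[pos]])
    return "".join(out)

def percent_identity(a, b):
    """Ungapped percent identity between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)

# Fixed seed so the sketch is reproducible.
rng = random.Random(42)

# Random stand-in for an operon sequence (assumed length, not from the study).
operon = "".join(rng.choice(BASES) for _ in range(4000))

# Mock reads at the ends of the 70-95% range and at the 84% level quoted above.
mock_reads = {ident: mutate_to_identity(operon, ident, rng)
              for ident in (0.70, 0.84, 0.95)}

for ident, read in mock_reads.items():
    print(f"target {ident:.2f} -> observed {percent_identity(operon, read):.1f}%")
```

Mock reads produced this way could then be searched against a reference database to see at which identity level the correct entry stops being the top hit; the search step itself is outside the scope of this sketch.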