Department of Population Health and Pathobiology, College of Veterinary Medicine, North Carolina State University, Raleigh, NC, USA.
Bioinformatics Research Center, North Carolina State University, Raleigh, NC, USA.
Microbiome. 2021 Jun 5;9(1):130. doi: 10.1186/s40168-021-01072-3.
Of the many known pathogenic bacterial species, only a fraction can be readily identified directly from a complex microbial community using standard next-generation DNA sequencing. Long-read sequencing offers the potential to identify a wider range of species and to differentiate between strains within a species, but attaining sufficient accuracy in complex metagenomes remains a challenge.
Here, we describe and analytically validate LoopSeq, a commercially available synthetic long-read (SLR) sequencing technology that generates highly accurate long reads from standard short reads.
LoopSeq reads are sufficiently long and accurate to identify microbial genes and species directly from complex samples. LoopSeq perfectly recovered the full diversity of 16S rRNA genes from known strains in a synthetic microbial community. Full-length LoopSeq reads had a per-base error rate of 0.005%, an accuracy that exceeds that reported for other long-read sequencing technologies. 18S-ITS and genomic sequencing of fungal and bacterial isolates confirmed that LoopSeq maintains that accuracy for reads up to 6 kb in length. LoopSeq full-length 16S rRNA reads could accurately classify organisms down to the species level in rinsate from retail meat samples, and could differentiate strains within species identified by the CDC as potential foodborne pathogens.
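To put the reported per-base error rate of 0.005% in more familiar terms, the following minimal sketch converts it to a Phred-scaled quality value and estimates how many errors a read of a given length would be expected to carry. The read lengths used (1500 bp as an approximate full-length 16S rRNA gene, 6000 bp for the 6 kb upper bound mentioned above) and the assumption that errors occur independently at each base are illustrative choices, not details taken from the study; the function names are hypothetical.

```python
import math


def phred_quality(error_rate: float) -> float:
    """Convert a per-base error probability to a Phred-scaled quality score."""
    return -10.0 * math.log10(error_rate)


def expected_errors(error_rate: float, read_length: int) -> float:
    """Expected number of erroneous bases in a read of the given length."""
    return error_rate * read_length


def error_free_fraction(error_rate: float, read_length: int) -> float:
    """Fraction of reads with zero errors, assuming independent per-base errors."""
    return (1.0 - error_rate) ** read_length


# Reported per-base error rate: 0.005 percent, expressed as a probability.
PER_BASE_ERROR = 0.005 / 100

if __name__ == "__main__":
    print(f"Phred quality: Q{phred_quality(PER_BASE_ERROR):.1f}")
    # Read lengths below are illustrative assumptions (approx. 16S gene, 6 kb read).
    for length in (1500, 6000):
        errs = expected_errors(PER_BASE_ERROR, length)
        clean = error_free_fraction(PER_BASE_ERROR, length)
        print(f"{length} bp: {errs:.3f} expected errors, "
              f"{clean:.1%} of reads error-free")
```

Under these assumptions, most full-length 16S reads would contain no errors at all, which is consistent with the claim that strains within a species can be told apart from single reads.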
The order-of-magnitude improvement in length and accuracy over standard Illumina amplicon sequencing achieved with LoopSeq enables accurate species-level and strain-level identification in microbiome samples ranging from complex to low-biomass. The ability to generate accurate, long microbiome sequencing reads on standard short-read sequencers will accelerate the building of high-quality microbial sequence databases and remove a significant hurdle on the path to precision microbial genomics.