CleanSeqU algorithm for decontamination of catheterized urine 16S rRNA sequencing data.

Authors

Yoon Sung Min, Ki Chang-Seok, Song Ju Sun

Affiliations

Department of Laboratory Medicine, GC Genome, Seoul, 16924, Korea.

Department of Laboratory Medicine, Green Cross Laboratories, Seoul, 16924, Korea.

Publication

Sci Rep. 2025 Jun 2;15(1):19270. doi: 10.1038/s41598-025-98875-3.

Abstract

Contamination in low-biomass samples, such as urine, presents a major challenge for 16S rRNA gene sequencing, as extraneous DNA from reagents and the environment often obscures microbial signals. Existing in silico decontamination algorithms face limitations in accurately identifying and removing these contaminants. To address this issue, we developed CleanSeqU, a novel decontamination algorithm designed to enhance the accuracy of 16S rRNA gene sequencing data for catheterized urine samples. The approach is grounded in the principle that the compositional pattern of potential contaminant taxa remains similar between biological samples and blank controls. The algorithm also identifies potential contaminants based on ecological plausibility and a custom blacklist. We evaluated CleanSeqU's performance using vaginal microbiome dilution experiments as a proxy for low-biomass urine samples and compared it with the Decontam, Microdecon, and SCRuB algorithms. CleanSeqU consistently outperformed all three across various contamination levels, with higher accuracy and F1-scores and lower beta-dissimilarity. CleanSeqU improved specificity and positive predictive value by correctly identifying and removing a larger number of contaminant amplicon sequence variants (ASVs). Furthermore, the reduced alpha diversity in the decontaminated datasets suggests more precise contaminant elimination. Because it requires only a single blank extraction control per batch and its decontamination rules are adjustable, CleanSeqU provides an efficient and scalable solution. Our findings highlight its potential to significantly advance urine microbiome research by delivering more accurate microbial profiles.
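The evaluation metrics named in the abstract (accuracy, specificity, positive predictive value, F1-score) can be illustrated with a minimal sketch. This is not CleanSeqU's code or the authors' evaluation script. It assumes that ASVs truly present in the sample are the positive class and contaminant ASVs are the negative class, which is inferred from the statement that removing more contaminants improves specificity and positive predictive value. The function name, the `Metrics` class, and the ASV labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Metrics:
    accuracy: float
    specificity: float
    ppv: float
    f1: float


def _ratio(numerator: float, denominator: float) -> float:
    # Return 0.0 instead of raising when a denominator is empty.
    return numerator / denominator if denominator else 0.0


def score_decontamination(true_asvs, contaminant_asvs, retained_asvs) -> Metrics:
    """Score one decontaminated sample at the ASV level.

    Assumption (not from the paper): positives are ASVs truly present in
    the sample, negatives are contaminant ASVs, and "retained" means the
    ASV survived decontamination.
    """
    true_set = set(true_asvs)
    contaminants = set(contaminant_asvs)
    retained = set(retained_asvs)

    tp = len(true_set & retained)        # true ASVs kept
    fn = len(true_set - retained)        # true ASVs wrongly removed
    fp = len(contaminants & retained)    # contaminants wrongly kept
    tn = len(contaminants - retained)    # contaminants correctly removed

    ppv = _ratio(tp, tp + fp)
    recall = _ratio(tp, tp + fn)
    return Metrics(
        accuracy=_ratio(tp + tn, tp + tn + fp + fn),
        specificity=_ratio(tn, tn + fp),
        ppv=ppv,
        f1=_ratio(2 * ppv * recall, ppv + recall),
    )


if __name__ == "__main__":
    # Hypothetical ASV labels, for illustration only.
    metrics = score_decontamination(
        true_asvs={"A", "B", "C"},
        contaminant_asvs={"W", "X", "Y", "Z"},
        retained_asvs={"A", "B", "X"},
    )
    print(metrics)
```

Under this convention, every additional contaminant ASV removed moves a count from false positives to true negatives, which raises both specificity and positive predictive value without affecting sensitivity.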
