Assigning amplicon sequences to operational taxonomic units (OTUs) is an important step in characterizing microbial communities across large data sets. A notable difference between de novo clustering and database-dependent reference clustering methods is that OTU assignments from de novo methods may change when new sequences are added. However, one may wish to incorporate new samples into previously clustered data sets without clustering all sequences again, such as when comparing across data sets or deploying machine learning models. Existing reference-based methods produce consistent OTUs but only consider the similarity of each query sequence to a single reference sequence in an OTU, resulting in assignments that are worse than those generated by de novo methods. To provide an efficient method to fit sequences to existing OTUs, we developed the OptiFit algorithm. Inspired by the OptiClust algorithm, OptiFit considers the similarity of all pairs of reference and query sequences to produce OTUs of the best possible quality. We tested OptiFit using four data sets with two strategies: (i) clustering to a reference database and (ii) splitting the data set into a reference and query set, clustering the references using OptiClust, and then clustering the queries to the references. The result is an improved implementation of reference-based clustering. OptiFit produces OTUs of a quality similar to that of OptiClust at faster speeds when using the split data set strategy. OptiFit provides a suitable option for users requiring consistent OTU assignments at the same quality as afforded by de novo clustering methods.

Advancements in DNA sequencing technology have allowed researchers to affordably generate millions of sequence reads from microorganisms in diverse environments. Efficient and robust software tools are needed to assign microbial sequences into taxonomic groups for characterization and comparison of communities. The OptiClust algorithm produces high-quality groups by comparing sequences to each other, but the assignments can change when new sequences are added to a data set, making it difficult to compare different studies. Other approaches assign sequences to groups by comparing them to sequences in a reference database to produce consistent assignments, but the quality of the groups produced is reduced compared to that with OptiClust. We developed OptiFit, a new reference-based algorithm that produces consistent yet high-quality assignments like OptiClust. OptiFit allows researchers to compare microbial communities across different studies or add new data to existing studies without sacrificing the quality of the group assignments.
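
The contrast drawn above, between comparing a query to a single reference sequence per OTU and considering all reference-query pairs, can be illustrated with a toy sketch. This is not the published OptiFit implementation: the function names, the mismatch-fraction distance, the 0.2 threshold, the choice of the first member as an OTU's representative, and the simple "similar minus dissimilar members" score are all assumptions made for illustration only.

```python
from typing import Dict, List, Optional


def mismatch_fraction(a: str, b: str) -> float:
    """Fraction of differing positions between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to equal length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def fit_nearest_representative(
    query: str, otus: Dict[str, List[str]], threshold: float
) -> Optional[str]:
    """Single-reference rule: compare the query only to each OTU's
    representative (assumed here to be the first member)."""
    best_otu, best_dist = None, None
    for name, members in otus.items():
        dist = mismatch_fraction(query, members[0])
        if best_dist is None or dist < best_dist:
            best_otu, best_dist = name, dist
    if best_dist is None or best_dist > threshold:
        return None  # no representative is close enough
    return best_otu


def fit_all_pairs(
    query: str, otus: Dict[str, List[str]], threshold: float
) -> Optional[str]:
    """All-pairs rule: score each OTU by (members within threshold)
    minus (members outside threshold) and pick the best positive score."""
    best_otu, best_score = None, 0
    for name, members in otus.items():
        similar = sum(mismatch_fraction(query, m) <= threshold for m in members)
        score = similar - (len(members) - similar)
        if score > best_score:
            best_otu, best_score = name, score
    return best_otu  # None when no OTU has a positive score


# Hypothetical reference OTUs and query (toy 10-base sequences).
EXAMPLE_OTUS = {
    "OTU_A": ["AAAAAAAAAA", "AAAAAAAATT", "AAAAAAAAGG"],
    "OTU_B": ["CCCAAAAAAA", "CCAAAAAAAA", "CACAAAAAAA"],
}
EXAMPLE_QUERY = "CAAAAAAAAA"

print(fit_nearest_representative(EXAMPLE_QUERY, EXAMPLE_OTUS, 0.2))
print(fit_all_pairs(EXAMPLE_QUERY, EXAMPLE_OTUS, 0.2))
```

On this toy data the two rules disagree: the query is closest to the representative of `OTU_A` but is within the threshold of every member of `OTU_B`, so only the all-pairs rule places it with the sequences it actually resembles.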

OptiFit: an Improved Method for Fitting Amplicon Sequences to Existing OTUs.
