Bio-Sciences R&D Division, TCS Research, Tata Consultancy Services Limited, Pune, Maharashtra, India.
DNA Res. 2019 Apr 1;26(2):147-156. doi: 10.1093/dnares/dsy045.
Many microbiome studies employ reference-based operational taxonomic unit (OTU)-picking methods, which, in general, rely on databases cataloguing reference OTUs identified by clustering full-length 16S rRNA genes. Given that the rate of accumulation of mutations is not uniform along the length of a 16S rRNA gene across different taxonomic clades, results of OTU identification or taxonomic classification obtained using 'short-read' sequence queries (as generated by next-generation sequencing platforms) can be inconsistent and of suboptimal accuracy. De novo OTU clustering results can also vary significantly depending on the hypervariable region (V-region) targeted for sequencing. As a consequence, comparing microbiomes profiled in different scientific studies becomes difficult, which often poses a challenge when analysing new findings in the context of prior knowledge. The OTUX approach to reference-based OTU-picking proposes to overcome these limitations by using 'customized' OTU reference databases, which can cater to different sets of short-read sequences corresponding to different 16S V-regions. The results obtained with the OTUX approach (which are expressed as OTUX-OTU identifiers) can also be 'mapped back' to, or represented in terms of, other OTU database identifiers or taxonomies, e.g. Greengenes, thus allowing for easy cross-study comparisons. Validation with simulated datasets indicates that the OTUX approach yields more efficient, accurate, and consistent taxonomic classifications than conventional methods.
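The 'mapping back' step described above can be illustrated with a minimal sketch. It is not the OTUX implementation: the lookup table, the identifier formats (`OTUX_V4_0001`, `GG_1001`), and the function name `map_back` are all hypothetical, and the sketch assumes that several region-specific OTUs may map to the same target identifier.

```python
from collections import Counter

# Hypothetical lookup table: OTUX-OTU identifier -> identifier in another
# OTU database (e.g. Greengenes). All identifiers here are made up for
# illustration and are not real database entries.
OTUX_TO_TARGET = {
    "OTUX_V4_0001": "GG_1001",
    "OTUX_V4_0002": "GG_1001",  # assumed many-to-one mapping
    "OTUX_V4_0003": "GG_2002",
}


def map_back(otux_counts, lookup):
    """Re-express OTUX-OTU read counts in terms of target identifiers.

    Returns two dicts: counts summed per target identifier, and counts
    for OTUX-OTU identifiers that have no entry in the lookup table.
    """
    mapped = Counter()
    unmapped = Counter()
    for otux_id, count in otux_counts.items():
        target = lookup.get(otux_id)
        if target is None:
            unmapped[otux_id] += count
        else:
            mapped[target] += count
    return dict(mapped), dict(unmapped)


# Read counts for one hypothetical sample profiled from a single V-region.
sample = {
    "OTUX_V4_0001": 12,
    "OTUX_V4_0002": 5,
    "OTUX_V4_0003": 7,
    "OTUX_V4_9999": 1,  # not present in the lookup table
}
mapped, unmapped = map_back(sample, OTUX_TO_TARGET)
```

Once profiles from studies that targeted different V-regions are expressed in a shared identifier space like this, they can be compared directly. Unmapped identifiers are kept separate rather than dropped so that any loss of reads during mapping stays visible.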