Department of Bioengineering, University of California San Diego, La Jolla, CA, USA.
Micronoma, San Diego, CA, USA.
Oncogene. 2024 Apr;43(15):1127-1148. doi: 10.1038/s41388-024-02974-w. Epub 2024 Feb 23.
In 2020, we identified cancer-specific microbial signals in The Cancer Genome Atlas (TCGA) [1]. Multiple peer-reviewed papers independently verified or extended our findings [2-12]. Given this impact, we carefully considered concerns by Gihawi et al. [13] that batch correction and database contamination with host sequences artificially created the appearance of cancer type-specific microbiomes. (1) We tested batch correction by comparing raw and Voom-SNM-corrected data within each batch, finding equivalent predictive performance and significantly similar features. We found consistent results with a modern microbiome-specific method (ConQuR [14]), and when restricting to taxa found in an independent, highly decontaminated cohort. (2) Using Conterminator [15], we found low levels of human contamination in our original databases (~1% of genomes). We demonstrated that the increased detection of human reads in Gihawi et al. [13] was due to their use of a newer human genome reference. (3) We developed Exhaustive, a method twice as sensitive as Conterminator, to clean RefSeq. We comprehensively host-depleted TCGA with many human (pan)genome references. We repeated all analyses with this pipeline and with that of Gihawi et al. [13], and found cancer type-specific microbiomes in both. These extensive re-analyses and updated methods validate our original conclusion that cancer type-specific microbial signatures exist in TCGA, and show that they are robust to methodology.
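As an illustration of what "significantly similar features" in step (1) could mean in practice, the following is a minimal sketch of one common way to test whether two feature lists overlap more than expected by chance: a one-sided hypergeometric test. The abstract does not state which test was used, so the choice of test, the function names (`overlap_p_value`, `compare_feature_sets`), and the example counts are all assumptions made for illustration only.

```python
from math import comb


def overlap_p_value(n_total: int, n_a: int, n_b: int, n_shared: int) -> float:
    """One-sided hypergeometric p-value for the overlap of two feature sets.

    Returns P(overlap >= n_shared) if a set of size n_b were drawn uniformly
    at random from n_total features, n_a of which belong to the other set.
    NOTE: an illustrative assumption, not the test reported in the paper.
    """
    if not (0 <= n_shared <= min(n_a, n_b) and max(n_a, n_b) <= n_total):
        raise ValueError("inconsistent set sizes")
    denom = comb(n_total, n_b)
    tail = sum(
        comb(n_a, k) * comb(n_total - n_a, n_b - k)
        for k in range(n_shared, min(n_a, n_b) + 1)
    )
    return tail / denom


def compare_feature_sets(raw: set, corrected: set, universe: set) -> float:
    """Hypothetical helper: p-value for the overlap between the top features
    of a model trained on raw data and one trained on corrected data."""
    return overlap_p_value(
        len(universe), len(raw), len(corrected), len(raw & corrected)
    )


if __name__ == "__main__":
    # Made-up example: 10 candidate taxa, two top-5 lists sharing 4 taxa.
    universe = {f"taxon_{i}" for i in range(10)}
    raw_top = {f"taxon_{i}" for i in range(5)}        # taxon_0 .. taxon_4
    corrected_top = {f"taxon_{i}" for i in range(1, 6)}  # taxon_1 .. taxon_5
    print(compare_feature_sets(raw_top, corrected_top, universe))
```

A small p-value here would indicate that the raw and batch-corrected models rank more of the same taxa than random selection would produce. A real analysis would also need to decide how "top features" are defined and to correct for multiple comparisons across batches.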