Ge Yuchen, Lu Jennifer, Puiu Daniela, Revsine Mahler, Salzberg Steven L
Center for Computational Biology, Johns Hopkins University, Baltimore, MD 21218, USA.
Department of Biomedical Engineering, Johns Hopkins University, Baltimore, MD 21218, USA.
Sci Transl Med. 2025 Sep 3;17(814):eads6335. doi: 10.1126/scitranslmed.ads6335.
In recent years, a growing number of publications have reported the presence of microbial species in human tumors and mixtures of microbes that appear to be highly specific to different cancer types. Our recent reanalysis of data from three cancer types revealed that technical errors have caused erroneous reports of numerous microbial species found in sequencing data from The Cancer Genome Atlas (TCGA) project. Here, we have expanded our analysis to cover all 5734 whole-genome sequencing (WGS) datasets currently available from TCGA, covering 25 distinct types of cancer. We analyzed the microbial content using updated computational methods and databases and compared our results to those from two major recent studies that focused on bacteria, viruses, and fungi in cancer. Our results expand upon and reinforce our recent findings, showing that microbes are far less abundant in these samples than previously reported and that many species identified in TCGA data might not be present at all. As part of this expanded analysis and to help others avoid being misled by flawed data, we have released a dataset that contains detailed read counts for bacteria, viruses, archaea, and fungi detected in all 5734 TCGA samples, which can serve as a public reference for future investigations.