Cui Zhen, Wu Yan, Zhang Qin-Hu, Wang Si-Guo, He Ying, Huang De-Shuang
Institute of Machine Learning and Systems Biology, College of Electronics and Information Engineering, Tongji University, Shanghai, China.
College of Electronics and Information Engineering, Tongji University, Shanghai, China.
Front Microbiol. 2023 Aug 22;14:1238199. doi: 10.3389/fmicb.2023.1238199. eCollection 2023.
Imbalances in gut microbes have been implicated in many human diseases, including colorectal cancer (CRC), inflammatory bowel disease, type 2 diabetes, obesity, autism, and Alzheimer's disease. Compared with other human diseases, CRC is a gastrointestinal malignancy with high mortality and a high probability of metastasis. However, current studies focus mainly on predicting CRC and neglect metastatic colorectal cancer (mCRC), the more serious malignancy. In addition, gut microbial data are high-dimensional and come from small samples, which makes them complex and difficult for traditional machine learning models to handle.
To address these challenges, we collected and processed 16S rRNA data and calculated abundance data from patients with non-metastatic colorectal cancer (non-mCRC) and mCRC. Unlike the traditional health-disease classification strategy, we adopted a novel disease-disease classification strategy and proposed a microbiome-based multi-view convolutional variational information bottleneck (MV-CVIB).
The experimental results show that MV-CVIB can effectively predict mCRC. In comparisons with other state-of-the-art models, it achieves AUC values above 0.9. MV-CVIB also achieved satisfactory predictive performance on multiple published CRC gut microbiome datasets.
Finally, multiple gut microbiota analyses were used to elucidate communities and differences between mCRC and non-mCRC, and the metastatic properties of CRC were assessed by patient age and microbiota expression.
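For readers unfamiliar with the AUC metric reported in the results, the following is a minimal sketch of how an AUC value can be computed for a disease-disease classifier. It is not the authors' evaluation code: the function name `roc_auc`, the label convention (1 = mCRC, 0 = non-mCRC), and the example scores are all hypothetical and chosen only for illustration.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via the pairwise (Mann-Whitney) definition.

    AUC equals the probability that a randomly chosen positive sample
    receives a higher score than a randomly chosen negative sample,
    with ties counted as one half.
    """
    # Assumed convention for this sketch: 1 = mCRC, 0 = non-mCRC.
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical predicted probabilities for six patients (not real data).
example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]
example_auc = roc_auc(example_labels, example_scores)
print(f"AUC = {example_auc:.3f}")
```

In this toy example, eight of the nine positive-negative pairs are ranked correctly, so the AUC is 8/9. The quadratic pairwise loop is chosen for clarity; with the small sample sizes typical of gut microbiome cohorts it is fast enough, while larger datasets would normally use a rank-based implementation.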