Qiu Wenjing, Yang Jiasheng, Wang Bing, Yang Min, Tian Geng, Wang Peizhen, Yang Jialiang
School of Electrical and Information Engineering, Anhui University of Technology, Maanshan, China.
Science System Department, Geneis Beijing Co., Ltd., Beijing, China.
Front Oncol. 2022 Jul 5;12:925079. doi: 10.3389/fonc.2022.925079. eCollection 2022.
Microsatellite instability (MSI), an important biomarker for immunotherapy and for the diagnosis of Lynch syndrome, refers to changes in microsatellite (MS) sequence length caused by insertions or deletions during DNA replication. However, traditional wet-lab MSI detection is time-consuming and depends on experimental conditions. In addition, no comprehensive study has examined the associations between MSI status and molecular features such as mRNA and miRNA. In this study, we first examined the association between MSI status and several molecular data types, including mRNA, miRNA, lncRNA, DNA methylation, and copy number variation (CNV), using colorectal cancer data from The Cancer Genome Atlas (TCGA). We then developed a novel deep learning framework that predicts MSI status from hematoxylin and eosin (H&E) staining images alone, and combined the H&E images with the above molecular data by multimodal compact bilinear pooling. Our results showed significant differences in mRNA, miRNA, and lncRNA between the high microsatellite instability (MSI-H) patient group and the low microsatellite instability or microsatellite stable (MSI-L/MSS) patient group. Using H&E images alone, the model predicted MSI status with an area under the curve (AUC) of 0.809 in 5-fold cross-validation. Fusion models that integrated H&E images with a single type of molecular data achieved higher prediction accuracy than the model using H&E images alone, with the highest AUC of 0.952 obtained by combining H&E images with DNA methylation data. However, prediction accuracy decreased when H&E images were combined with all types of molecular data. In conclusion, deep learning on H&E images can predict the MSI status of colorectal cancer, and accuracy can be further improved by integrating appropriate molecular data. These findings may have clinical significance.
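The abstract names multimodal compact bilinear pooling as the fusion step but gives no implementation details. The following is a minimal NumPy sketch of how that technique is commonly formulated (a Count Sketch projection of each modality followed by circular convolution computed with the FFT), not the authors' code. The function names, feature sizes, and output dimension are hypothetical, and random vectors stand in for the H&E image features and the molecular features.

```python
import numpy as np


def make_sketch_params(dim, out_dim, rng):
    """Draw the fixed random bucket indices and signs of a Count Sketch.

    Hypothetical helper: one bucket in [0, out_dim) and one sign in {-1, +1}
    is drawn per input coordinate and reused for every sample.
    """
    buckets = rng.integers(0, out_dim, size=dim)
    signs = rng.choice(np.array([-1.0, 1.0]), size=dim)
    return buckets, signs


def count_sketch(x, buckets, signs, out_dim):
    """Project x to out_dim by adding each signed coordinate to its bucket."""
    out = np.zeros(out_dim)
    np.add.at(out, buckets, signs * x)  # accumulates correctly on repeated buckets
    return out


def mcb_pool(x, y, params_x, params_y, out_dim):
    """Compact bilinear pooling of two feature vectors.

    The circular convolution of the two sketches equals a Count Sketch of the
    outer product of x and y, so the outer product is never materialised.
    """
    sketch_x = count_sketch(x, *params_x, out_dim)
    sketch_y = count_sketch(y, *params_y, out_dim)
    spectrum = np.fft.rfft(sketch_x) * np.fft.rfft(sketch_y)
    return np.fft.irfft(spectrum, n=out_dim)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    image_features = rng.normal(size=512)      # stand-in for H&E image features
    molecular_features = rng.normal(size=200)  # stand-in for e.g. methylation features
    out_dim = 1024                             # assumed fused dimension
    params_img = make_sketch_params(512, out_dim, rng)
    params_mol = make_sketch_params(200, out_dim, rng)
    fused = mcb_pool(image_features, molecular_features, params_img, params_mol, out_dim)
    print(fused.shape)
```

In a setup like the one described, the fused vector would feed a classifier that outputs MSI-H versus MSI-L/MSS. The sketch parameters must be drawn once and kept fixed across training and inference so that the projection stays consistent.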