Department of Systems Biology, College of Bioinformatics Science and Technology, Harbin Medical University, Harbin, 150086, China.
Department of Bioinformatics, Key Laboratory of Ministry of Education for Gastrointestinal Cancer, School of Basic Medical Sciences, Fujian Medical University, Fuzhou, 350122, China.
BMC Genomics. 2019 Oct 23;20(1):769. doi: 10.1186/s12864-019-6129-8.
Microsatellite instability (MSI) occurs in about 15% of colorectal cancers and is associated with prognosis. Currently, MSI is usually detected by polymerase chain reaction amplification of specific microsatellite markers. However, this approach identifies instability by comparing the length of microsatellite repeats between tumor and normal samples, so it requires a matched normal sample. In this work, we developed a qualitative transcriptional signature that predicts MSI status for individual right-sided colon cancer (RCC) patients from tumor samples alone.
Using RCC samples, we applied a feature selection process to the relative expression orderings (REOs) of gene pairs and extracted a signature of 10 gene pairs (10-GPS) that predicts MSI status for RCC. A sample is predicted as MSI when the expression orderings of at least 7 of the 10 gene pairs vote for MSI; otherwise it is predicted as microsatellite stable (MSS). With this voting threshold, the signature achieved the largest F-score in the training dataset. The signature was validated in four independent RCC datasets, with F-scores of 1, 0.9630, 0.9412 and 0.8798, respectively. In addition, hierarchical clustering analyses and molecular features supported the reclassifications of MSI status made by 10-GPS.
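The voting rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the abstract does not list the 10 gene pairs, so the gene names below are hypothetical placeholders, and the convention that a pair votes for MSI when its first gene is expressed more highly than its second is an assumption. Only the threshold of 7 out of 10 votes comes from the text.

```python
from typing import Dict, List, Tuple

# Hypothetical placeholder pairs; the real 10-GPS genes are not given in the abstract.
# Assumed convention: a pair (a, b) votes for MSI when expression[a] > expression[b].
GENE_PAIRS: List[Tuple[str, str]] = [
    (f"GENE_A{i}", f"GENE_B{i}") for i in range(1, 11)
]
VOTE_THRESHOLD = 7  # from the text: at least 7 of the 10 pairs must vote for MSI


def count_msi_votes(expression: Dict[str, float],
                    pairs: List[Tuple[str, str]] = GENE_PAIRS) -> int:
    """Count the gene pairs whose within-sample ordering votes for MSI."""
    return sum(1 for a, b in pairs if expression[a] > expression[b])


def predict_msi_status(expression: Dict[str, float],
                       pairs: List[Tuple[str, str]] = GENE_PAIRS,
                       threshold: int = VOTE_THRESHOLD) -> str:
    """Return 'MSI' if enough pairs vote for MSI, otherwise 'MSS'."""
    return "MSI" if count_msi_votes(expression, pairs) >= threshold else "MSS"


def make_sample(n_msi_votes: int,
                pairs: List[Tuple[str, str]] = GENE_PAIRS) -> Dict[str, float]:
    """Build a toy expression profile in which the first n pairs vote for MSI."""
    sample: Dict[str, float] = {}
    for index, (a, b) in enumerate(pairs):
        if index < n_msi_votes:
            sample[a], sample[b] = 2.0, 1.0  # a > b: votes for MSI
        else:
            sample[a], sample[b] = 1.0, 2.0  # a < b: votes for MSS
    return sample


if __name__ == "__main__":
    for votes in (6, 7):
        print(votes, predict_msi_status(make_sample(votes)))
```

Because the rule depends only on the ordering of two genes within a single sample, it needs no cross-sample normalization or matched normal tissue, which is what allows prediction at the individual level.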
The qualitative transcriptional signature can be used to classify MSI status of RCC samples at the individualized level.