Rahmatallah Yasir, Khaidakov Magomed, Lai Keith K, Goyne Hannah E, Lamps Laura W, Hagedorn Curt H, Glazko Galina
Department of Biomedical Informatics, University of Arkansas for Medical Sciences, Little Rock, AR, 72205, USA.
The Central Arkansas Veterans Healthcare System, Little Rock, AR, 72205, USA.
BMC Med Genomics. 2017 Dec 28;10(1):81. doi: 10.1186/s12920-017-0317-7.
Sessile serrated adenomas/polyps are distinguished from hyperplastic colonic polyps subjectively, by their endoscopic appearance and histological morphology. However, hyperplastic and sessile serrated polyps can have overlapping morphological features, so sessile serrated polyps may be misdiagnosed as hyperplastic. While sessile serrated polyps can progress to colon cancer, hyperplastic polyps carry virtually no risk of colon cancer. Objective measures that differentiate these polyp types would improve cancer prevention and treatment outcomes.
An RNA-seq training data set and Affymetrix and Illumina testing data sets were obtained from the Gene Expression Omnibus (GEO). RNA-seq single-end reads were filtered with the FASTX-Toolkit. Read mapping to the human genome, gene abundance estimation, and differential expression analysis were performed with the TopHat-Cufflinks pipeline. Background correction, normalization, and probe summarization for the Affymetrix arrays were performed using the robust multi-array average (RMA) method. For the Illumina arrays, log-scale expression data were obtained from GEO. Pathway analysis was implemented using the Bioconductor package GSAR. To build a platform-independent molecular classifier that accurately differentiates sessile serrated and hyperplastic polyps, we developed a new feature selection step. We also developed a simple procedure that classifies new samples as either sessile serrated or hyperplastic and assigns a class probability to the decision, estimated using Cantelli's inequality.
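As a rough illustration of how Cantelli's inequality can attach a probability to a classification decision, the sketch below scores a new sample against a reference class. It is not the authors' procedure: the per-sample summary score, the choice of hyperplastic samples as the reference class, the 0.5 decision threshold, and the function names `cantelli_tail_bound` and `classify_sample` are all assumptions made for this example. The only fact it relies on is the inequality itself, P(X - mu >= k*sigma) <= 1/(1 + k^2) for k > 0, which holds for any distribution with finite variance.

```python
from statistics import mean, stdev


def cantelli_tail_bound(k: float) -> float:
    """Upper bound on P(X - mu >= k*sigma) given by Cantelli's inequality.

    The bound 1 / (1 + k^2) applies for k > 0; for k <= 0 the inequality
    gives no information, so the trivial bound 1.0 is returned.
    """
    if k <= 0:
        return 1.0
    return 1.0 / (1.0 + k * k)


def classify_sample(score: float, reference_scores: list[float]) -> tuple[str, float]:
    """Label a sample by how far its score lies above a reference class.

    `reference_scores` are hypothetical summary scores of training samples
    from the reference class (assumed here to be hyperplastic polyps).
    Returns the label and the Cantelli bound on the probability that a
    reference-class sample would score at least this high.
    """
    mu = mean(reference_scores)
    sigma = stdev(reference_scores)  # sample standard deviation
    k = (score - mu) / sigma
    bound = cantelli_tail_bound(k)
    # Assumed decision rule: call the sample sessile serrated when a
    # reference-class sample would reach this score with probability < 0.5.
    label = "sessile serrated" if bound < 0.5 else "hyperplastic"
    return label, bound


if __name__ == "__main__":
    hyperplastic_scores = [1.0, 2.0, 3.0]  # made-up illustrative values
    print(classify_sample(4.0, hyperplastic_scores))
    print(classify_sample(2.5, hyperplastic_scores))
```

Because Cantelli's inequality makes no distributional assumption, the resulting bound is conservative; that is a plausible reason to prefer it when training sets are small, although the abstract does not state the authors' motivation.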
The classifier, trained on RNA-seq data and tested on two independent microarray data sets, made zero and three errors, respectively. The classifier was further tested using quantitative real-time PCR expression levels from 45 blinded, independent formalin-fixed paraffin-embedded specimens and was highly accurate. Pathway analyses showed that sessile serrated polyps are distinguished from hyperplastic polyps and normal controls by up-regulation of pathways implicated in proliferation, inflammation, and cell-cell adhesion; by down-regulation of the serine/threonine kinase signaling pathway; and by differential co-expression of pathways regulating cell division, protein trafficking, and kinase activities.
Most of the differentially expressed pathways are known hallmarks of cancer and are likely to explain why sessile serrated polyps are more prone to neoplastic transformation than hyperplastic polyps. The new molecular classifier includes 13 genes and may facilitate objective differentiation between the two polyp types.