Guo Shicheng, Yan Fengyang, Xu Jibin, Bao Yang, Zhu Ji, Wang Xiaotian, Wu Junjie, Li Yi, Pu Weilin, Liu Yan, Jiang Zhengwen, Ma Yanyun, Chen Xiaofeng, Xiong Momiao, Jin Li, Wang Jiucun
State Key Laboratory of Genetic Engineering and Ministry of Education Key Laboratory of Contemporary Anthropology, Collaborative Innovation Center for Genetics and Development, School of Life Sciences and Institutes of Biomedical Sciences, Fudan University Jiangwan Campus, 2005 Songhu Road, Shanghai, 200438 China ; Fudan-Taizhou Institute of Health Sciences, 1 Yaocheng Road, Taizhou, Jiangsu 225300 China.
State Key Laboratory of Genetic Engineering and Ministry of Education Key Laboratory of Contemporary Anthropology, Collaborative Innovation Center for Genetics and Development, School of Life Sciences and Institutes of Biomedical Sciences, Fudan University Jiangwan Campus, 2005 Songhu Road, Shanghai, 200438 China.
Clin Epigenetics. 2015 Jan 22;7(1):3. doi: 10.1186/s13148-014-0035-3. eCollection 2015.
DNA methylation has been suggested as a promising biomarker for lung cancer diagnosis. However, finding the combination of methylation biomarkers that maximizes diagnostic performance remains a major challenge.
In this study, we developed a panel of DNA methylation biomarkers and validated its diagnostic efficiency for non-small cell lung cancer (NSCLC) in a large retrospective cohort of Han Chinese NSCLC patients. Three high-throughput DNA methylation microarray datasets (458 samples) were collected in the discovery stage. After normalization, batch effect elimination and integration, significantly differentially methylated genes and the best combination of biomarkers were determined by a leave-one-out SVM (support vector machine) feature selection procedure. Candidate promoters were then examined by the methylation status determined single nucleotide primer extension technique (MSD-SNuPET) in an independent set of 150 paired NSCLC/normal tissues. Four statistical models with fivefold cross-validation were used to evaluate the performance of the discriminatory algorithms. In the Bayes tree model, the sensitivity, specificity and accuracy were 86.3%, 95.7% and 91%, respectively. A logistic regression model incorporating five gene methylation signatures (AGTR1, GALR1, SLC5A8, ZMYND10 and NTSR1), adjusted for age, sex and smoking, showed robust performance, with sensitivity, specificity, accuracy and area under the curve (AUC) of 78%, 97%, 87% and 0.91, respectively.
In summary, analyzing high-throughput DNA methylation microarray datasets with batch effect elimination can be a good strategy for discovering optimal DNA methylation diagnostic panels. The methylation profiles of AGTR1, GALR1, SLC5A8, ZMYND10 and NTSR1 could form an effective methylation-based assay for NSCLC diagnosis.
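The evaluation described above (fivefold cross-validation, then sensitivity, specificity and accuracy) can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' pipeline: the one-feature threshold classifier stands in for the four statistical models, the methylation values are synthetic, the assumption that tumors show higher methylation than normal tissue is made for the example only, and the names `diagnostic_metrics`, `k_fold_indices` and `cross_validate` are hypothetical.

```python
import random


def diagnostic_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy); labels are 1 = tumor, 0 = normal."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)   # fraction of tumors correctly called
    specificity = tn / (tn + fp)   # fraction of normals correctly called
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


def k_fold_indices(n, k, seed=0):
    """Shuffle indices 0..n-1 and deal them into k disjoint folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validate(values, labels, k=5):
    """k-fold CV of a threshold classifier (a stand-in model, not the paper's)."""
    y_true, y_pred = [], []
    for test_idx in k_fold_indices(len(values), k):
        held_out = set(test_idx)
        train = [i for i in range(len(values)) if i not in held_out]
        tumor = [values[i] for i in train if labels[i] == 1]
        normal = [values[i] for i in train if labels[i] == 0]
        # Threshold = midpoint of the class means, learned on the training folds only.
        threshold = (sum(tumor) / len(tumor) + sum(normal) / len(normal)) / 2
        for i in test_idx:
            y_true.append(labels[i])
            y_pred.append(1 if values[i] > threshold else 0)
    # Metrics are pooled over all held-out predictions.
    return diagnostic_metrics(y_true, y_pred)


if __name__ == "__main__":
    rng = random.Random(1)
    # Synthetic methylation levels, assumed higher in tumors for this example.
    values = [rng.gauss(0.6, 0.15) for _ in range(40)] + \
             [rng.gauss(0.2, 0.10) for _ in range(40)]
    labels = [1] * 40 + [0] * 40
    sens, spec, acc = cross_validate(values, labels, k=5)
    print(f"sensitivity={sens:.2f} specificity={spec:.2f} accuracy={acc:.2f}")
```

When tumor and normal samples are equal in number, as in a paired design, accuracy is the mean of sensitivity and specificity. The reported figures fit this relation: (86.3% + 95.7%) / 2 = 91.0%, and (78% + 97%) / 2 = 87.5%.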