Zhang Zi-Mei, Wang Jia-Shu, Zulfiqar Hasan, Lv Hao, Dao Fu-Ying, Lin Hao
Key Laboratory for Neuro-Information of Ministry of Education, Center for Informational Biology, School of Life Sciences and Technology, University of Electronic Science and Technology of China, Chengdu, China.
Front Cell Dev Biol. 2020 Oct 15;8:582864. doi: 10.3389/fcell.2020.582864. eCollection 2020.
Pancreatic ductal adenocarcinoma (PDAC) is an aggressive and lethal cancer. Diagnosing PDAC at an early stage is key to patient survival. However, existing biomarkers for early PDAC are imprecise in most cases, so an effective diagnostic biomarker is highly desirable. In this work, we designed a novel computational approach based on within-sample relative expression orderings (REOs). A feature selection technique called minimum redundancy maximum relevance was used to select the optimal REOs. We then compared the performance of different classification algorithms in discriminating PDAC and its adjacent normal tissues from non-PDAC tissues. The support vector machine algorithm performed best for identifying an early PDAC diagnostic biomarker. First, a signature composed of nine gene pairs was derived from microarray gene expression data sets. These gene pairs achieved a classification accuracy of up to 97.53% in fivefold cross-validation. Subsequently, two types of data from different platforms, namely microarray and RNA-Seq, were used to validate this signature. For the microarray data, all (100.00%) of 115 PDAC tissues and all (100.00%) of 31 PDAC adjacent normal tissues were correctly recognized as PDAC. In addition, 15 of 17 (88.24%) non-PDAC (normal or pancreatitis) tissues were correctly classified. For the RNA-Seq data, all (100.00%) of 177 PDAC tissues and all (100.00%) of 4 PDAC adjacent normal tissues were correctly recognized as PDAC. These validation results demonstrate that the signature performs well across platforms for early detection of PDAC. This work developed a new, robust signature that might be a promising biomarker for early PDAC diagnosis.
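To illustrate the idea behind within-sample REOs, the following is a minimal sketch rather than the authors' implementation. It assumes that each gene pair is encoded as a binary feature indicating which of the two genes is more highly expressed in a given sample. The gene names, the expression values, and the helper `reo_features` are hypothetical. The feature selection and support vector machine steps described above are not shown.

```python
from itertools import combinations
import math


def reo_features(expression, gene_pairs):
    """Encode each gene pair (a, b) as 1 if a is expressed above b, else 0."""
    return [1 if expression[a] > expression[b] else 0 for a, b in gene_pairs]


# Hypothetical expression values for one sample (illustrative only).
sample = {"GENE_A": 8.2, "GENE_B": 5.1, "GENE_C": 6.7}

# All unordered gene pairs, in a fixed order so features line up across samples.
pairs = list(combinations(sorted(sample), 2))
features = reo_features(sample, pairs)

# Any strictly increasing rescaling of a sample (log2 here) preserves the
# ordering of its genes, so the REO features are unchanged.
rescaled = {gene: math.log2(value) for gene, value in sample.items()}
rescaled_features = reo_features(rescaled, pairs)

print(pairs)
print(features, rescaled_features)
```

Because the features depend only on the ordering of genes within a sample, they are unaffected by any strictly increasing rescaling of that sample's measurements. This is one plausible reason a signature of this kind could transfer between microarray and RNA-Seq data.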