Zhou Rui-Quan, Ji Hong-Chen, Liu Qu, Zhu Chun-Yu, Liu Rong
School of Medicine, Nankai University, Tianjin 300071, China.
The Second Department of Hepatobiliary Surgery, Chinese PLA General Hospital, Beijing 100853, China.
World J Clin Cases. 2019 Jul 6;7(13):1611-1622. doi: 10.12998/wjcc.v7.i13.1611.
The incidence of pancreatic neuroendocrine tumors (PNETs) is now increasing rapidly. The tumor grade of PNETs significantly affects the treatment strategy and prognosis. However, there is still no effective way to non-invasively classify PNET grades. Machine learning (ML) algorithms have shown potential in improving the prediction accuracy using comprehensive data.
To provide an ML approach for predicting PNET tumor grade from clinical data.
Clinical data were collected from histologically confirmed PNET cases diagnosed between 2012 and 2018. Continuous variables were dichotomized with a minimum P value approach based on the Chi-square test: each continuous variable was converted into a binary variable at the cutoff value for which the P value was lowest. Four classical supervised ML models, namely logistic regression, support vector machine (SVM), linear discriminant analysis (LDA) and multi-layer perceptron (MLP), were trained on the clinical data, with the pathological tumor grade of each PNET patient serving as the label. The performance of each model, including the weight it assigned to the different parameters, was evaluated.
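The cutoff search described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the sample marker values and the tie-breaking rule (the lowest cutoff wins) are assumptions made for illustration. The sketch relies only on the fact that a binary variable crossed with three grades gives a 2 x 3 table with 2 degrees of freedom, for which the Chi-square P value is exp(-statistic / 2).

```python
import math

GRADES = ("G1", "G2", "G3")


def chi_square_p_2x3(table):
    """P value of Pearson's Chi-square test for a 2 x 3 contingency table.

    With 2 degrees of freedom the survival function is exp(-x / 2).
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i in range(2):
        for j in range(3):
            expected = row_totals[i] * col_totals[j] / n
            if expected == 0:
                return 1.0  # an empty row or column: the test is undefined
            statistic += (table[i][j] - expected) ** 2 / expected
    return math.exp(-statistic / 2)


def min_p_cutoff(values, grades):
    """Return (cutoff, p) for the split `value > cutoff` with the lowest P value."""
    best_cutoff, best_p = None, 1.0
    # The largest value is skipped: it would leave the "high" group empty.
    for cutoff in sorted(set(values))[:-1]:
        table = [[0, 0, 0], [0, 0, 0]]
        for value, grade in zip(values, grades):
            table[int(value > cutoff)][GRADES.index(grade)] += 1
        p = chi_square_p_2x3(table)
        if p < best_p:  # strict comparison: ties keep the lower cutoff
            best_cutoff, best_p = cutoff, p
    return best_cutoff, best_p


# Hypothetical marker values, for illustration only (not study data).
marker = [10, 12, 15, 40, 44, 51, 80, 95, 120]
grade = ["G1"] * 3 + ["G2"] * 3 + ["G3"] * 3
cutoff, p_value = min_p_cutoff(marker, grade)
binary_marker = [int(v > cutoff) for v in marker]
```

Because the cutoff is chosen to minimize the P value, that P value is optimistically biased and is best read as a selection criterion rather than as a test result.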
In total, 91 PNET cases were included in this study, of which 32 were G1, 48 were G2 and 11 were G3. The clinical parameters differed significantly among patients with different grades. Patients with higher grades tended to have higher values of total bilirubin, alpha fetoprotein, carcinoembryonic antigen, carbohydrate antigen 19-9 and carbohydrate antigen 72-4. Among the models used, LDA performed best in predicting the PNET tumor grade, while MLP had the highest recall rate for G3 cases. All of the models except SVM stabilized once the sample size exceeded 70 percent of the total. The parameters differed in how strongly they affected the model outcomes. Overall, alanine transaminase, total bilirubin, carcinoembryonic antigen, carbohydrate antigen 19-9 and carbohydrate antigen 72-4 affected the outcome more than the other parameters.
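With only 11 of the 91 cases being G3, overall accuracy can hide how well a model detects the highest grade, which is presumably why recall for G3 is reported separately. The sketch below shows how per-grade recall is computed from true and predicted labels. The function name and the labels are hypothetical and are not taken from the study.

```python
from collections import Counter


def recall_per_grade(true_grades, predicted_grades):
    """Recall for each grade: correctly predicted cases / all true cases of that grade."""
    totals = Counter(true_grades)
    hits = Counter(t for t, p in zip(true_grades, predicted_grades) if t == p)
    return {grade: hits[grade] / totals[grade] for grade in totals}


# Hypothetical labels, for illustration only (not study data).
true_grades = ["G1", "G1", "G1", "G2", "G2", "G2", "G2", "G3", "G3"]
predicted = ["G1", "G1", "G2", "G2", "G2", "G2", "G1", "G3", "G2"]

recall = recall_per_grade(true_grades, predicted)
accuracy = sum(t == p for t, p in zip(true_grades, predicted)) / len(true_grades)
```

In this toy example six of the nine cases are classified correctly overall, yet only one of the two G3 cases is recovered, so the G3 recall is lower than the accuracy.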
ML could be a simple and effective method for non-invasively predicting PNET grades from routine biochemical and tumor marker results.