Yamazaki Koji, Kawauchi Shigeto, Okamoto Masaki, Tanabe Kazuhiro, Hayashi Chihiro, Mikami Mikio, Kusumoto Tetsuya
Department of Thoracic Surgery, National Hospital Organization Kyushu Medical Center, Chuo-ku, Fukuoka 810-0065, Japan.
Department of Pathology, Clinical Research Center, National Hospital Organization Kyushu Medical Center, Chuo-ku, Fukuoka 810-0065, Japan.
Cancers (Basel). 2025 Apr 27;17(9):1474. doi: 10.3390/cancers17091474.
Lung cancer is among the most prevalent and fatal cancers worldwide. Traditional diagnostic methods, such as computed tomography, are not ideal for screening due to their high cost and radiation exposure. In contrast, blood-based diagnostics, as non-invasive approaches, are expected to reduce patient burden, thereby increasing screening participation and ultimately improving survival rates. However, conventional tumor markers have shown limited effectiveness in early detection.
We recruited 199 patients with lung cancer and 590 healthy volunteers. We analyzed nine tumor markers (CEA, CA19-9, CYFRA, AFP, PSA, CA125, CA15-3, SCC antigen, and NCC-ST439) together with enriched glycopeptides (EGPs) derived from serum proteins and measured by liquid chromatography-mass spectrometry. Machine learning models, including decision trees and deep learning approaches, were used to build a predictive model that distinguishes patients with lung cancer from healthy controls based on tumor marker and EGP profiles.
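A classifier of this kind is commonly scored by the area under the ROC curve (ROC-AUC), which equals the probability that a randomly chosen case receives a higher model score than a randomly chosen control. The following is a minimal sketch of that calculation, not the authors' pipeline: the helper name `roc_auc` is hypothetical, and the labels and scores are made-up toy values rather than study data.

```python
def roc_auc(labels, scores):
    """ROC-AUC via the pairwise (Mann-Whitney) formulation.

    labels: 1 for a case, 0 for a control.
    scores: model outputs, higher meaning "more likely a case".
    Ties between a case and a control count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one case and one control")

    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (assumed for illustration only): 3 cases, 4 controls.
toy_labels = [1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.7, 0.4, 0.5, 0.3, 0.2, 0.1]

print(f"toy ROC-AUC = {roc_auc(toy_labels, toy_scores):.3f}")
```

The pairwise form is quadratic in sample size, which is fine for a cohort of a few hundred subjects; a rank-based implementation would be preferable for much larger data sets.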
We found that α1-antitrypsin carrying a fully sialylated biantennary glycan at asparagine 271 (AT271-FSG) and α2-macroglobulin carrying a fully sialylated biantennary glycan at asparagine 70 (MG70-FSG) significantly distinguished patients with lung cancer from healthy individuals. Comprehensive Serum Glycopeptide Spectra Analysis (CSGSA), which integrates the nine conventional tumor markers and 1688 EGPs in a machine learning model, improved diagnostic accuracy and achieved an area under the receiver operating characteristic curve (ROC-AUC) of 0.935. For stage I cases, the ROC-AUC was 0.914, suggesting that early-stage detection is feasible. The positive predictive value (PPV) reached 2.8%, which the authors judged sufficient for practical application.
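A PPV figure is only interpretable alongside the disease prevalence it assumes, because PPV follows from sensitivity, specificity, and prevalence through Bayes' rule. The abstract does not report which prevalence, sensitivity, or specificity underlie the 2.8% value, so the sketch below uses purely illustrative inputs; the function name `ppv` and every number passed to it are assumptions, not study results.

```python
def ppv(sensitivity, specificity, prevalence):
    """Positive predictive value from Bayes' rule.

    PPV = TP / (TP + FP), where per unit of population
      TP = sensitivity * prevalence
      FP = (1 - specificity) * (1 - prevalence)
    """
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    if true_pos + false_pos == 0.0:
        raise ValueError("no positive test results are possible with these inputs")
    return true_pos / (true_pos + false_pos)


# Hypothetical operating point (NOT taken from the study).
SENS, SPEC = 0.80, 0.90

# The same test evaluated at several assumed prevalences.
for prev in (0.001, 0.01, 0.25):
    print(f"prevalence {prev:>5.3f} -> PPV {ppv(SENS, SPEC, prev):.3f}")
```

The loop shows why a single-digit PPV can still be reasonable for a screening test: when the condition is rare in the screened population, false positives outnumber true positives even at high specificity.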
This method represents a significant advancement in cancer diagnostics, combining multiple biomarkers with cutting-edge machine learning to improve the early detection of lung cancer.