Yuan Huijie, Duan Shuyin, Effah Clement Yaw, He Sitian, Chai Yaru, Liu Xia, Ding Lihua, Wu Yongjun
College of Public Health, Zhengzhou University, Zhengzhou, China.
School of Public Health, Shandong First Medical University & Shandong Academy of Medical Sciences, Jinan, China.
Front Oncol. 2025 May 27;15:1567673. doi: 10.3389/fonc.2025.1567673. eCollection 2025.
Molecular biomarkers have the potential to improve the current state of early screening of lung cancer. This investigation aimed to identify novel protein markers for early-stage lung cancer and combine them with traditional tumor markers to develop machine learning models for lung cancer screening.
Protein alterations in peripheral blood (5 patients with early-stage lung adenocarcinoma, 5 patients with early-stage lung squamous cell carcinoma, and 8 healthy controls) were detected by label-free quantitative proteomics. Novel candidate protein markers were then prioritized using multi-omics technology. Malignant transformation of BEAS-2B cells and lung carcinogenesis in C57BL/6 mice were induced with coal tar pitch extracts (CTPE) so that expression of these markers could be dynamically tracked and validated at different stages of lung carcinogenesis. The markers were then measured in human plasma and further confirmed by ELISA. Machine learning models were established to screen individuals at high risk of lung cancer.
C-type lectin domain family 3 member B (CLEC3B), membrane primary amine oxidase (AOC3), hemoglobin subunit beta (HBB), catalase (CAT), and selenoprotein P (SEPP1) were identified as candidate protein markers for early-stage lung cancer. Expression of CLEC3B, AOC3, CAT, and SEPP1 differed significantly between cells at various passages cultured with CTPE exposure and the saline group (P < 0.05). In addition, expression of these 5 proteins differed significantly in lung tissues, plasma, and alveolar lavage fluid of mice exposed to CTPE for 3, 6, 9, and 12 months compared with normal controls (P < 0.05). Levels of AOC3, CAT, CLEC3B, SEPP1, HBB, carcinoembryonic antigen (CEA), cytokeratin 19 fragment (CYFRA21-1), and neuron-specific enolase (NSE) differed significantly among healthy controls, patients with lung cancer, and coke oven workers (P < 0.05). The decision tree C5.0 (area under the curve [AUC] = 0.868) and artificial neural network (AUC = 0.844) models, which combined these 8 markers, showed better performance.
The differential changes in AOC3, CAT, CLEC3B, SEPP1, and HBB proteins were shown to be early molecular events in lung tumorigenesis. Screening models for lung cancer based on the novel protein markers and traditional tumor markers might be applicable to the screening of high-risk individuals.
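The AUC values reported above summarize how well a model's risk scores separate cases from non-cases. As a minimal sketch of how such a value can be computed, the Python function below estimates AUC as the proportion of case/control pairs in which the case receives the higher score, counting ties as half. The function name `auc` and the example labels and scores are hypothetical and purely illustrative; they are not the study's data, and this is not presented as the authors' implementation.

```python
def auc(labels, scores):
    """Estimate AUC as the fraction of (case, control) pairs ranked correctly.

    labels: iterable of 1 (case) or 0 (control)
    scores: iterable of model risk scores, higher meaning more likely a case
    Ties between a case and a control count as half a correct ranking.
    """
    cases = [s for y, s in zip(labels, scores) if y == 1]
    controls = [s for y, s in zip(labels, scores) if y == 0]
    if not cases or not controls:
        raise ValueError("AUC needs at least one case and one control")

    correct = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                correct += 1.0
            elif c == k:
                correct += 0.5
    return correct / (len(cases) * len(controls))


# Hypothetical scores for 3 cases and 3 controls (illustration only).
example_labels = [1, 1, 1, 0, 0, 0]
example_scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.2]

print(f"AUC = {auc(example_labels, example_scores):.3f}")
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is the scale on which the reported values of 0.868 and 0.844 should be read.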