Integrated Machine Learning Algorithms-Enhanced Predication for Cervical Cancer from Mass Spectrometry-Based Proteomics Data.

Authors

Zhang Da, Zhao Lihong, Guo Bo, Guo Aihong, Ding Jiangbo, Tong Dongdong, Wang Bingju, Zhou Zhangjian

Affiliations

Department of Oncology, The Second Affiliated Hospital, Xi'an Jiaotong University, Xi'an 710000, China.

Department of Dermatology, The Second Affiliated Hospital, Xi'an Jiaotong University, Xi'an 710000, China.

Publication information

Bioengineering (Basel). 2025 Mar 9;12(3):269. doi: 10.3390/bioengineering12030269.

Abstract

Early diagnosis is critical for improving outcomes in cancer patients; however, the application of diagnostic markers derived from serum proteomic screening remains challenging. Artificial intelligence (AI), encompassing deep learning and machine learning (ML), has gained increasing prominence across various scientific disciplines. In this study, we utilized cervical cancer (CC) as a model to develop an AI-driven pipeline for the identification and validation of serum biomarkers for early cancer diagnosis, leveraging mass spectrometry-based proteomics data. By processing and normalizing serum polypeptide differential peaks from 240 patients, we employed eight distinct ML algorithms to classify and analyze these differential polypeptide peaks, subsequently constructing receiver operating characteristic (ROC) curves and confusion matrices. Key performance metrics, including accuracy, precision, recall, and F1 score, were systematically evaluated. Furthermore, by integrating feature importance values, Shapley values, and local interpretable model-agnostic explanation (LIME) values, we demonstrated that the diagnostic area under the curve (AUC) achieved by our multi-dimensional learning models approached 1, significantly outperforming the diagnostic AUC of single markers derived from the PRIDE database. These findings underscore the potential of proteomics-driven integrated machine learning as a robust strategy to enhance early cancer diagnosis, offering a promising avenue for clinical translation.
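
The abstract names the evaluation metrics (accuracy, precision, recall, F1 score, and ROC AUC) without showing how they relate to a confusion matrix. The following is a minimal sketch of those calculations in plain Python. It is not the authors' pipeline: the function names are hypothetical, the labels and scores are invented toy values rather than study data, and the 0.5 decision threshold is an assumption made only for illustration.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels, where 1 = cancer and 0 = control."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred):
    """Compute accuracy, precision, recall, and F1 from the confusion matrix."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Invented toy data (NOT from the study): 4 cases, 4 controls.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
# Assumed threshold of 0.5 turns classifier scores into predicted labels.
toy_preds = [1 if s >= 0.5 else 0 for s in toy_scores]

print(classification_metrics(toy_labels, toy_preds))
print(roc_auc(toy_labels, toy_scores))
```

The pairwise AUC definition is quadratic in the number of samples, which is acceptable at the scale of a few hundred patients; in practice a library routine such as scikit-learn's `roc_auc_score` would normally be used instead.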

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3d0e/11939187/d9ebb6eded96/bioengineering-12-00269-g001.jpg
