Rahangdale Ayush, Ranpise Shraddha, Chauhan Shweta Singh, Devireddy Nirnith, Karmwar Pranav
Dr. D. Y. Patil Art's, Commerce and Science College, Sant Tukaram Nagar, Pimpri, Pune, Maharashtra, 411018, India.
InSilicoMinds, Block-A, Ramky Selenium, Financial District, Gachibowli, Nanakramguda, Hyderabad, Telangana, 500035, India.
Metabolomics. 2025 Jun 14;21(4):78. doi: 10.1007/s11306-025-02265-9.
Breast cancer is the most common cancer among women, with its burden increasing over the past decades. Early diagnosis significantly improves survival rates and reduces lethality. Innovative technologies are being developed for early detection, making accurate tumor identification crucial.
The research aims to identify significant metabolomic biomarkers that can help detect tumor progression and thereby contribute to early breast cancer diagnosis.
A dataset of 228 metabolites from breast cancer patients and healthy individuals was curated from the Metabolomics Workbench Database. Statistical tests and Machine Learning (ML) algorithms were applied for feature selection, with the statistical tests assessing normality, variance homogeneity, and significance. Recursive Feature Elimination (RFE) with a Random Forest (RF) classifier was used to identify a minimal set of six significant metabolites with strong predictive potential. A Ridge Classifier was then employed for classification, achieving 83% accuracy in distinguishing between cancerous and healthy individuals.
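The selection and classification steps described above can be sketched with scikit-learn. This is a minimal illustration and not the authors' pipeline: the data are synthetic (generated with `make_classification`), and the sample count, train/test split, scaling step, RFE step size, and all hyperparameters are assumptions, since the abstract does not report them. Only the overall shape (228 input features, RFE wrapped around a Random Forest, six retained features, a Ridge Classifier on top) follows the text.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the metabolite matrix (assumption, not real data):
# rows are samples, columns are 228 features, labels are 0 = healthy, 1 = cancer.
X, y = make_classification(
    n_samples=200, n_features=228, n_informative=6, n_redundant=0, random_state=0
)

# Hold out a test set so that selection and fitting only see training data.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# RFE repeatedly fits the Random Forest and drops the least important
# features (10% of the original feature count per round) until six remain.
selector = RFE(
    estimator=RandomForestClassifier(n_estimators=100, random_state=0),
    n_features_to_select=6,
    step=0.1,
)
selector.fit(X_train, y_train)
selected = np.flatnonzero(selector.support_)  # column indices of the kept features

# Ridge Classifier trained on the six selected features only.
clf = make_pipeline(StandardScaler(), RidgeClassifier(alpha=1.0))
clf.fit(X_train[:, selected], y_train)
accuracy = clf.score(X_test[:, selected], y_test)

print("Selected feature indices:", selected.tolist())
print(f"Held-out accuracy: {accuracy:.2f}")
```

Running feature selection on the training split alone avoids leaking test information into the choice of features. The accuracy printed here reflects the synthetic data and says nothing about the 83% reported in the study.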
A minimal set of six significant metabolites was identified in plasma samples. The developed Ridge Classifier model achieved 83% accuracy in classifying cancerous vs. healthy individuals.
The study provides valuable insights into metabolomic changes associated with breast cancer, identifying potential biomarkers that could enhance early detection and diagnosis.