
Breast cancer prediction with transcriptome profiling using feature selection and machine learning methods.

Affiliations

Department of Medical Genetic, Faculty of Medicine, Ahvaz Jundishapur University of Medical Sciences, Ahvaz, Iran.

Department of Radiology Technology, Shoushtar Faculty of Medical Sciences, Shoushtar, Iran.

Publication information

BMC Bioinformatics. 2022 Oct 1;23(1):410. doi: 10.1186/s12859-022-04965-8.


DOI:10.1186/s12859-022-04965-8
PMID:36183055
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9526906/
Abstract

BACKGROUND: We used a hybrid machine learning system (HMLS) strategy that searches extensively for the best-performing HMLSs, each combining feature selection algorithms, a feature extraction algorithm, and classifiers, for diagnosing breast cancer. This study aims to obtain a high-importance transcriptome profile, linked with classification procedures, that can facilitate the early detection of breast cancer.

METHODS: The study included 762 breast cancer patients and 138 solid tissue normal subjects. Three groups of machine learning (ML) algorithms were employed: (i) four feature selection procedures, compared to select the most valuable features: (1) ANOVA; (2) Mutual Information; (3) Extra Trees Classifier; and (4) Logistic Regression (LGR); (ii) a feature extraction algorithm (Principal Component Analysis); and (iii) 13 classification algorithms with automated ML hyperparameter tuning: (1) LGR; (2) Support Vector Machine; (3) Bagging; (4) Gaussian Naive Bayes; (5) Decision Tree; (6) Gradient Boosting Decision Tree; (7) K-Nearest Neighbors; (8) Bernoulli Naive Bayes; (9) Random Forest; (10) AdaBoost; (11) ExtraTrees; (12) Linear Discriminant Analysis; and (13) Multilayer Perceptron (MLP). Model performance was evaluated using balanced accuracy and area under the curve (AUC).

RESULTS: The LGR feature selection procedure combined with the MLP classifier achieved the highest balanced accuracy and AUC (balanced accuracy: 0.86, AUC: 0.94), followed by LGR feature selection with the LGR classifier (balanced accuracy: 0.84, AUC: 0.94). The AUC for the LGR + LGR classifier was obtained with the following 20 biomarkers: TMEM212, SNORD115-13, ATP1A4, FRG2, CFHR4, ZCCHC13, FLJ46361, LY6G6E, ZNF323, KRT28, KRT25, LPPR5, C10orf99, PRKACG, SULT2A1, GRIN2C, EN2, GBA2, CUX2, and SNORA66.

CONCLUSIONS: The best performance was achieved using the LGR feature selection procedure and the MLP classifier. These 20 biomarkers had the highest scores or rankings for breast cancer detection.
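The abstract reports balanced accuracy and AUC instead of plain accuracy. With 762 tumor samples against 138 normal samples, the classes are heavily imbalanced, and the sketch below illustrates why that matters. It is a minimal pure-Python illustration, not the authors' code: the function names and the trivial "always tumor" classifier are hypothetical, and only the class sizes come from the abstract.

```python
from typing import Sequence


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity (recall on label 1) and specificity (recall on label 0)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    pos = sum(1 for t in y_true if t == 1)
    neg = len(y_true) - pos
    return 0.5 * (tp / pos + tn / neg)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Class sizes from the abstract: 762 tumor samples (label 1), 138 normal (label 0).
y_true = [1] * 762 + [0] * 138

# Hypothetical trivial classifier that labels every sample as tumor.
always_tumor = [1] * 900

print(round(accuracy(y_true, always_tumor), 3))   # 0.847
print(balanced_accuracy(y_true, always_tumor))    # 0.5
print(roc_auc(y_true, [0.0] * 900))               # 0.5 (constant scores carry no ranking)
```

A classifier that ignores the data reaches about 0.85 plain accuracy on these class sizes, yet only 0.5 on balanced accuracy and AUC. That is the baseline against which the reported 0.86 and 0.94 should be read.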

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d278/9526906/898bc0f441ee/12859_2022_4965_Fig1_HTML.jpg
Figure 2: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d278/9526906/c7c704aec722/12859_2022_4965_Fig2_HTML.jpg
Figure 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d278/9526906/f82326f7911c/12859_2022_4965_Fig3_HTML.jpg
