Learning misclassification costs for imbalanced classification on gene expression data.

Affiliations

Key Laboratory of Electromagnetic Wave Information Technology and Metrology of Zhejiang Province, College of Information Engineering, China Jiliang University, Hangzhou, China.

College of Computer Science and Technology, Hangzhou Dianzi University, Hangzhou, China.

Publication information

BMC Bioinformatics. 2019 Dec 24;20(Suppl 25):681. doi: 10.1186/s12859-019-3255-x.

DOI: 10.1186/s12859-019-3255-x
PMID: 31874599
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6929277/
Abstract

BACKGROUND

Cost-sensitive algorithms are an effective strategy for solving imbalanced classification problems. However, the misclassification costs are usually determined empirically based on user expertise, which leads to unstable performance of cost-sensitive classification. Therefore, an efficient and accurate method is needed to calculate the optimal cost weights.

RESULTS

In this paper, two approaches are proposed to search for the optimal cost weights, targeting the highest weighted classification accuracy (WCA). One is a grid search over the cost weights and the other is function fitting. The two algorithms are compared with each other. In the experiments, imbalanced gene expression data are classified using an extreme learning machine to test the cost weights obtained by the two approaches.

CONCLUSIONS

Comprehensive experimental results show that the function fitting method is generally more efficient and can find the optimal cost weights with acceptable WCA.
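
The grid search idea in the abstract can be illustrated with a minimal Python sketch. Everything specific here is an assumption, since the abstract gives no formulas: WCA is taken to be `w * sensitivity + (1 - w) * specificity`, the grid runs from 1.0 to 20.0 in steps of 0.5, the data are synthetic one-dimensional points, and a cost-weighted least-squares linear classifier stands in for the extreme learning machine. All function names are hypothetical. The function fitting approach is not sketched because the abstract does not describe it.

```python
import random

def make_data(n_major, n_minor, seed):
    """Synthetic 1-D imbalanced data (assumption): -1 = majority, +1 = minority."""
    rng = random.Random(seed)
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_major)]
    xs += [rng.gauss(2.0, 1.0) for _ in range(n_minor)]
    ys = [-1] * n_major + [1] * n_minor
    return xs, ys

def fit_weighted_ls(xs, ys, minority_cost):
    """Weighted least squares y ~ a*x + b. Minority samples get weight
    `minority_cost`, majority samples weight 1 (stand-in for a weighted ELM)."""
    ws = [minority_cost if y == 1 else 1.0 for y in ys]
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw
    my = sum(w * y for w, y in zip(ws, ys)) / sw
    sxx = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    a = sxy / sxx
    return a, my - a * mx

def predict(model, xs):
    """Classify by the sign of the fitted linear score."""
    a, b = model
    return [1 if a * x + b > 0 else -1 for x in xs]

def wca(y_true, y_pred, w=0.5):
    """Assumed WCA: w * sensitivity + (1 - w) * specificity."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == -1 and p == -1)
    n_pos = sum(1 for t in y_true if t == 1)
    n_neg = len(y_true) - n_pos
    return w * tp / n_pos + (1 - w) * tn / n_neg

def grid_search_cost(train, valid, costs, w=0.5):
    """Return the cost weight on the grid with the highest validation WCA."""
    best_cost, best_score = None, -1.0
    for c in costs:
        model = fit_weighted_ls(train[0], train[1], c)
        score = wca(valid[1], predict(model, valid[0]), w)
        if score > best_score:
            best_cost, best_score = c, score
    return best_cost, best_score

train = make_data(200, 20, seed=0)
valid = make_data(200, 20, seed=1)
costs = [1.0 + 0.5 * k for k in range(39)]  # 1.0, 1.5, ..., 20.0
best_cost, best_score = grid_search_cost(train, valid, costs)

# Baseline: equal costs for both classes (minority_cost = 1.0).
baseline = wca(valid[1], predict(fit_weighted_ls(train[0], train[1], 1.0), valid[0]))
print("best cost:", best_cost, "WCA:", best_score, "equal-cost WCA:", baseline)
```

Because the equal-cost setting is itself a point on the grid, the selected cost weight can never score below the baseline on the validation set. The grid search costs one model fit per grid point, which is the expense that a cheaper search strategy would aim to reduce.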

Figures

Fig. 3: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30e2/6929277/0be4f428ecd9/12859_2019_3255_Fig3_HTML.jpg
Fig. 4: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30e2/6929277/7b2bf60d925a/12859_2019_3255_Fig4_HTML.jpg
Fig. 5: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30e2/6929277/60ca46e4a1a5/12859_2019_3255_Fig5_HTML.jpg
Fig. 6: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30e2/6929277/9e65398090da/12859_2019_3255_Fig6_HTML.jpg
Fig. 7: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/30e2/6929277/d8d4ff6723bf/12859_2019_3255_Fig7_HTML.jpg
