Exploring Active Learning Based on Representativeness and Uncertainty for Biomedical Data Classification.

Publication information

IEEE J Biomed Health Inform. 2019 Nov;23(6):2238-2244. doi: 10.1109/JBHI.2018.2881155. Epub 2018 Nov 13.

Abstract

Biomedical data such as images and genetic sequences are now abundant. However, much of this data lacks annotation because annotating it is costly. Techniques that ease the burden of human annotation are therefore needed, and active learning strategies can serve this goal. However, state-of-the-art active learning methods are generally not feasible for dealing with real-world datasets. Another important issue, generally neglected by these methods, is that the classifier tends to learn more and more at each iteration; their selection criteria do not properly exploit the classifier's knowledge. In this paper, we therefore propose an active learning approach that leverages the learning process, including a novel active learning strategy. The main difference of our strategy is that the classifier participates very actively in its own learning process. This lets us better maximize and prioritize the knowledge the classifier obtains at each iteration and use that knowledge more appropriately when selecting more informative samples. To do so, our selection criteria give significant importance to the classifications suggested by the classifier. In addition, together with the participation and knowledge of the classifier, we consider both uncertainty and representativeness criteria through a fine-grained analysis of the samples. Experimental results show that our approach outperforms state-of-the-art active learning methods across several supervised classifiers. It therefore deals better with real dataset problems, balancing the tradeoff between annotation effort and higher accuracy rates.
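As a rough illustration of how uncertainty and representativeness criteria can be combined when choosing samples to annotate, the sketch below scores each unlabeled sample by the product of a margin-based uncertainty and a simple density measure. This is not the strategy proposed in the paper, whose details the abstract does not give; the function names (`margin_uncertainty`, `density`, `select_queries`), the multiplicative scoring rule, and the toy data are all assumptions made for illustration.

```python
import math

def margin_uncertainty(probs):
    """Return 1 - (top1 - top2): higher means the classifier is less sure."""
    top = sorted(probs, reverse=True)
    return 1.0 - (top[0] - top[1])

def density(i, points):
    """Mean similarity of points[i] to every other pool point.

    Used here as a simple, assumed proxy for representativeness.
    """
    sims = [1.0 / (1.0 + math.dist(points[i], p))
            for j, p in enumerate(points) if j != i]
    return sum(sims) / len(sims)

def select_queries(points, class_probs, k):
    """Rank unlabeled samples by uncertainty * density; return the k best indices."""
    scores = [margin_uncertainty(class_probs[i]) * density(i, points)
              for i in range(len(points))]
    ranked = sorted(range(len(points)), key=lambda i: scores[i], reverse=True)
    return ranked[:k]

# Toy pool (hypothetical): three clustered points and one outlier,
# with made-up class probabilities standing in for classifier output.
pool = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0)]
probs = [(0.55, 0.45), (0.9, 0.1), (0.6, 0.4), (0.5, 0.5)]
picked = select_queries(pool, probs, k=2)
```

In this toy pool the outlier has the most uncertain prediction but a low density, so a purely uncertainty-driven rule would pick it while the combined score favors uncertain samples that lie inside the cluster.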

Similar articles

Heart sound classification using the SNMFNet classifier.

Physiol Meas. 2019 Oct 30;40(10):105003. doi: 10.1088/1361-6579/ab45c8.

Reviewing ensemble classification methods in breast cancer.

Comput Methods Programs Biomed. 2019 Aug;177:89-112. doi: 10.1016/j.cmpb.2019.05.019. Epub 2019 May 20.

Active learning based segmentation of Crohns disease from abdominal MRI.

Comput Methods Programs Biomed. 2016 May;128:75-85. doi: 10.1016/j.cmpb.2016.01.014. Epub 2016 Feb 26.

Fine-grained leukocyte classification with deep residual learning for microscopic images.

Comput Methods Programs Biomed. 2018 Aug;162:243-252. doi: 10.1016/j.cmpb.2018.05.024. Epub 2018 May 22.

Optimism in Active Learning.

Comput Intell Neurosci. 2015;2015:973696. doi: 10.1155/2015/973696. Epub 2015 Nov 23.

A novel end-to-end classifier using domain transferred deep convolutional neural networks for biomedical images.

Comput Methods Programs Biomed. 2017 Mar;140:283-293. doi: 10.1016/j.cmpb.2016.12.019. Epub 2017 Jan 6.

Multimodal manifold-regularized transfer learning for MCI conversion prediction.

Brain Imaging Behav. 2015 Dec;9(4):913-26. doi: 10.1007/s11682-015-9356-x.

IntelliHealth: A medical decision support application using a novel weighted multi-layer classifier ensemble framework.

J Biomed Inform. 2016 Feb;59:185-200. doi: 10.1016/j.jbi.2015.12.001. Epub 2015 Dec 15.

A novel biomedical image indexing and retrieval system via deep preference learning.

Comput Methods Programs Biomed. 2018 May;158:53-69. doi: 10.1016/j.cmpb.2018.02.003. Epub 2018 Feb 6.

Cited by

Active Learning Improves Ionization Efficiency Predictions and Quantification in Nontargeted LC/HRMS.

Anal Chem. 2025 Jul 1;97(25):13131-13139. doi: 10.1021/acs.analchem.5c00816. Epub 2025 Jun 13.

Rethinking Domain-Specific Pretraining by Supervised or Self-Supervised Learning for Chest Radiograph Classification: A Comparative Study Against ImageNet Counterparts in Cold-Start Active Learning.

Health Care Sci. 2025 Apr 6;4(2):110-143. doi: 10.1002/hcs2.70009. eCollection 2025 Apr.

Deep active learning with high structural discriminability for molecular mutagenicity prediction.

Commun Biol. 2024 Aug 31;7(1):1071. doi: 10.1038/s42003-024-06758-6.

Assessment of clustering techniques to support the analyses of soybean seed vigor.

PLoS One. 2023 Aug 25;18(8):e0285566. doi: 10.1371/journal.pone.0285566. eCollection 2023.

Integrated Random Negative Sampling and Uncertainty Sampling in Active Learning Improve Clinical Drug Safety Drug-Drug Interaction Information Retrieval.

Front Pharmacol. 2021 Apr 23;11:582470. doi: 10.3389/fphar.2020.582470. eCollection 2020.

Photoplethysmography based atrial fibrillation detection: a review.

NPJ Digit Med. 2020 Jan 10;3:3. doi: 10.1038/s41746-019-0207-9. eCollection 2020.
