A novel logistic regression model combining semi-supervised learning and active learning for disease classification.

Affiliation

Faculty of Information Technology & State Key Laboratory of Quality Research in Chinese Medicines, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau, 999078, China.

Publication information

Sci Rep. 2018 Aug 29;8(1):13009. doi: 10.1038/s41598-018-31395-5.

DOI:10.1038/s41598-018-31395-5

PMID:30158596

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6115447/

Abstract

Traditional supervised classifiers need many labeled samples to perform well. However, many biological datasets contain only a small number of labeled samples, and the remaining samples are unlabeled. Labeling these samples manually is difficult or expensive. Techniques such as active learning and semi-supervised learning have been proposed to exploit the unlabeled samples and improve model performance. However, in active learning the model can be short-sighted or biased, and some manual labeling is still needed. Semi-supervised learning methods are easily affected by noisy samples. In this paper we propose a novel logistic regression model based on the complementarity of active learning and semi-supervised learning, which uses the unlabeled samples at the least cost to improve disease classification accuracy. In addition, a mechanism for updating pseudo-labeled samples is designed to reduce the number of falsely pseudo-labeled samples. The experimental results show that the new model achieves better performance than widely used semi-supervised learning and active learning methods in disease classification and gene selection.
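The abstract gives no algorithmic detail, so the following is only a minimal sketch of how active learning and pseudo-labeling might be combined around a logistic regression classifier. It is not the authors' implementation. The function name `fit_active_semi_supervised`, the uncertainty-based query rule, the confidence threshold of 0.95, the round and query counts, and the synthetic data are all assumptions made for illustration. The oracle (a human annotator in practice) is simulated by looking up the true labels.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression


def fit_active_semi_supervised(X, y_oracle, labeled_idx, n_rounds=5,
                               n_queries=5, threshold=0.95):
    """Hypothetical loop: uncertainty queries plus refreshed pseudo-labels."""
    n_samples = X.shape[0]
    is_labeled = np.zeros(n_samples, dtype=bool)   # labels obtained from the oracle
    is_labeled[labeled_idx] = True
    is_pseudo = np.zeros(n_samples, dtype=bool)    # labels guessed by the model
    y_work = np.where(is_labeled, y_oracle, -1)    # -1 marks "no label yet"
    model = LogisticRegression(max_iter=1000)

    for _ in range(n_rounds):
        train = is_labeled | is_pseudo
        model.fit(X[train], y_work[train])
        proba = model.predict_proba(X)             # columns follow model.classes_
        top = proba.max(axis=1)                    # confidence of the predicted class
        pred = model.classes_[proba.argmax(axis=1)]

        # Active step (assumed rule): ask the oracle about the samples
        # the current model is least confident on.
        candidates = np.flatnonzero(~is_labeled)
        query = candidates[np.argsort(top[candidates])[:n_queries]]
        is_labeled[query] = True
        y_work[query] = y_oracle[query]

        # Semi-supervised step with update: pseudo-labels are recomputed every
        # round, so a sample whose confidence falls below the threshold loses
        # its pseudo-label instead of keeping a possibly wrong one.
        is_pseudo = ~is_labeled & (top >= threshold)
        y_work[is_pseudo] = pred[is_pseudo]
        y_work[~is_labeled & ~is_pseudo] = -1

    train = is_labeled | is_pseudo
    model.fit(X[train], y_work[train])             # final fit on all usable labels
    return model, is_labeled, is_pseudo


# Synthetic stand-in for an expression dataset (assumption, not the paper's data).
X, y = make_classification(n_samples=300, n_features=20, n_informative=5,
                           random_state=0)
# Start with five labeled samples per class so the first fit sees both classes.
start = np.concatenate([np.flatnonzero(y == 0)[:5], np.flatnonzero(y == 1)[:5]])
model, is_labeled, is_pseudo = fit_active_semi_supervised(X, y, start)
print("oracle labels used:", int(is_labeled.sum()))
print("pseudo-labels kept:", int(is_pseudo.sum()))
```

In this sketch the two components cover each other's weak spots in the way the abstract describes: uncertain samples go to the oracle, confident ones receive pseudo-labels at no labeling cost, and recomputing the pseudo-labels each round is one simple way to discard labels the model no longer supports.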

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53ed/6115447/66bc1886e4e7/41598_2018_31395_Fig1_HTML.jpg

Similar articles

Multi-class motor imagery EEG classification using collaborative representation-based semi-supervised extreme learning machine.

Med Biol Eng Comput. 2020 Sep;58(9):2119-2130. doi: 10.1007/s11517-020-02227-4. Epub 2020 Jul 16.

CPSS: Fusing consistency regularization and pseudo-labeling techniques for semi-supervised deep cardiovascular disease detection using all unlabeled electrocardiograms.

Comput Methods Programs Biomed. 2024 Sep;254:108315. doi: 10.1016/j.cmpb.2024.108315. Epub 2024 Jul 4.

ℓ-norm based safe semi-supervised learning.

Math Biosci Eng. 2021 Sep 7;18(6):7727-7742. doi: 10.3934/mbe.2021383.

A Generic Semi-Supervised and Active Learning Framework for Biomedical Text Classification.

Annu Int Conf IEEE Eng Med Biol Soc. 2022 Jul;2022:4445-4448. doi: 10.1109/EMBC48229.2022.9871846.

Semantic contrast with uncertainty-aware pseudo label for lumbar semi-supervised classification.

Comput Biol Med. 2024 Aug;178:108754. doi: 10.1016/j.compbiomed.2024.108754. Epub 2024 Jun 15.

Weakly Semi-supervised phenotyping using Electronic Health records.

J Biomed Inform. 2022 Oct;134:104175. doi: 10.1016/j.jbi.2022.104175. Epub 2022 Sep 5.

FaxMatch: Multi-Curriculum Pseudo-Labeling for semi-supervised medical image classification.

Med Phys. 2023 May;50(5):3210-3222. doi: 10.1002/mp.16312. Epub 2023 Feb 21.

Comprehensive study of semi-supervised learning for DNA methylation-based supervised classification of central nervous system tumors.

BMC Bioinformatics. 2022 Jun 8;23(1):223. doi: 10.1186/s12859-022-04764-1.

Semi-supervised classifier guided by discriminator.

Sci Rep. 2022 Aug 29;12(1):14665. doi: 10.1038/s41598-022-18947-6.

Cited by

Artificial intelligence for dementia prevention.

Alzheimers Dement. 2023 Dec;19(12):5952-5969. doi: 10.1002/alz.13463. Epub 2023 Oct 14.

A clinician's guide to understanding and critically appraising machine learning studies: a checklist for Ruling Out Bias Using Standard Tools in Machine Learning (ROBUST-ML).

Eur Heart J Digit Health. 2022 Apr 12;3(2):125-140. doi: 10.1093/ehjdh/ztac016. eCollection 2022 Jun.

Active semi-supervised learning for biological data classification.

PLoS One. 2020 Aug 19;15(8):e0237428. doi: 10.1371/journal.pone.0237428. eCollection 2020.
