A novel logistic regression model combining semi-supervised learning and active learning for disease classification.

Affiliation

Faculty of Information Technology & State Key Laboratory of Quality Research in Chinese Medicines, Macau University of Science and Technology, Avenida Wai Long, Taipa, Macau, 999078, China.

Publication information

Sci Rep. 2018 Aug 29;8(1):13009. doi: 10.1038/s41598-018-31395-5.

Abstract

Traditional supervised learning classifiers need a large number of labeled samples to achieve good performance. However, in many biological datasets only a small number of samples are labeled and the rest are unlabeled, and labeling them manually is difficult or expensive. Techniques such as active learning and semi-supervised learning have been proposed to exploit unlabeled samples to improve model performance. In active learning, however, the model tends to be short-sighted or biased, and some manual labeling is still needed, while semi-supervised learning methods are easily affected by noisy samples. In this paper we propose a novel logistic regression model based on the complementarity of active learning and semi-supervised learning, which uses the unlabeled samples at the lowest cost to improve disease classification accuracy. In addition, a mechanism for updating pseudo-labeled samples is designed to reduce the number of falsely pseudo-labeled samples. The experimental results show that the new model achieves better performance than widely used semi-supervised learning and active learning methods in disease classification and gene selection.
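
The abstract does not give the algorithm itself, so the following is only a minimal sketch of how active learning and semi-supervised learning can complement each other around a logistic regression model. It is not the authors' method. All names (`combined_loop`, `oracle`, `tau`, `n_query`) and the toy two-cluster data are assumptions made for illustration. Each round, the least confident samples go to an oracle for true labels (active learning), confidently predicted samples receive pseudo-labels (semi-supervised learning), and the pseudo-labeled set is rebuilt from scratch, which is one plausible reading of the "update" mechanism.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    # w holds one weight per feature followed by a bias term.
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + w[-1])

def fit_logreg(X, y, lr=0.1, epochs=300):
    # Unregularized logistic regression fitted by batch gradient descent.
    w = [0.0] * (len(X[0]) + 1)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            err = predict_proba(w, xi) - yi
            for j, xj in enumerate(xi):
                grad[j] += err * xj
            grad[-1] += err
        w = [wj - lr * gj / len(X) for wj, gj in zip(w, grad)]
    return w

def combined_loop(X_lab, y_lab, X_unl, oracle, rounds=3, n_query=2, tau=0.9):
    # Hypothetical loop; thresholds and round counts are illustrative only.
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_unl)
    w = fit_logreg(X_lab, y_lab)
    for _ in range(rounds):
        if not pool:
            break
        # Active learning: send the least confident samples to the oracle.
        pool.sort(key=lambda x: abs(predict_proba(w, x) - 0.5))
        queried, pool = pool[:n_query], pool[n_query:]
        X_lab += queried
        y_lab += [oracle(x) for x in queried]
        # Semi-supervised learning: rebuild the pseudo-labeled set from scratch
        # with a model trained on true labels only, so pseudo-labels assigned
        # in earlier rounds can be revised or dropped.
        w_true = fit_logreg(X_lab, y_lab)
        X_pse, y_pse = [], []
        for x in pool:
            p = predict_proba(w_true, x)
            if max(p, 1.0 - p) >= tau:
                X_pse.append(x)
                y_pse.append(1 if p >= 0.5 else 0)
        w = fit_logreg(X_lab + X_pse, y_lab + y_pse)
    return w, X_lab, y_lab

# Toy data (an assumption, not from the paper): two well-separated clusters.
rng = random.Random(0)

def blob(cx, cy, n):
    return [(cx + rng.uniform(-1, 1), cy + rng.uniform(-1, 1)) for _ in range(n)]

unlabeled = blob(-2, -2, 20) + blob(2, 2, 20)
oracle = lambda x: 1 if x[0] + x[1] > 0 else 0  # stands in for a human annotator

w, X_lab, y_lab = combined_loop([(-2.0, -2.0), (2.0, 2.0)], [0, 1], unlabeled, oracle)
accuracy = sum((predict_proba(w, x) >= 0.5) == bool(oracle(x)) for x in unlabeled) / len(unlabeled)
print(f"labels requested: {len(X_lab) - 2}, accuracy on pool: {accuracy:.2f}")
```

In this sketch the oracle is consulted only `rounds * n_query` times, which is where the "least cost" idea shows up; the confidence threshold `tau` controls how aggressively unlabeled samples are pseudo-labeled and would need tuning on real, noisy data.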

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/53ed/6115447/66bc1886e4e7/41598_2018_31395_Fig1_HTML.jpg
