Semi-supervised oblique predictive clustering trees.

Authors

Stepišnik Tomaž, Kocev Dragi

Affiliations

Department of Knowledge Technologies, Jožef Stefan Institute, Ljubljana, Slovenia.

Jožef Stefan International Postgraduate School, Ljubljana, Slovenia.

Publication information

PeerJ Comput Sci. 2021 May 3;7:e506. doi: 10.7717/peerj-cs.506. eCollection 2021.

DOI:10.7717/peerj-cs.506
PMID:33987461
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8101547/
Abstract

Semi-supervised learning combines supervised and unsupervised learning approaches to learn predictive models from both labeled and unlabeled data. It is most appropriate for problems where labeled examples are difficult to obtain but unlabeled examples are readily available (e.g., drug repurposing). Semi-supervised predictive clustering trees (SSL-PCTs) are a prominent method for semi-supervised learning that achieves good performance on various predictive modeling tasks, including structured output prediction tasks. The main issue, however, is that the learning time scales quadratically with the number of features. In contrast to axis-parallel trees, which only use individual features to split the data, oblique predictive clustering trees (SPYCTs) use linear combinations of features. This makes the splits more flexible and expressive and often leads to better predictive performance. With a carefully designed criterion function, we can use efficient optimization techniques to learn oblique splits. In this paper, we propose semi-supervised oblique predictive clustering trees (SSL-SPYCTs). We adjust the split learning to take unlabeled examples into account while remaining efficient. The main advantage over SSL-PCTs is that the proposed method scales linearly with the number of features. The experimental evaluation confirms the theoretical computational advantage and shows that SSL-SPYCTs often outperform SSL-PCTs and supervised PCTs both in single-tree setting and ensemble settings. We also show that SSL-SPYCTs are better at producing meaningful feature importance scores than supervised SPYCTs when the amount of labeled data is limited.
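The difference between axis-parallel and oblique splits described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the toy data, and in particular the `ssl_impurity` score (a weighted mix of target variance on labeled examples and feature variance on all examples, with a hypothetical weight `w`) are assumptions made for illustration only.

```python
import numpy as np

def axis_parallel_split(X, feature, threshold):
    """Left-branch mask for a test on a single feature: x[feature] <= threshold."""
    return X[:, feature] <= threshold

def oblique_split(X, weights, bias):
    """Left-branch mask for a test on a linear combination: w . x + b <= 0."""
    return X @ weights + bias <= 0.0

def ssl_impurity(X, Y, labeled, w=0.5):
    """Hypothetical semi-supervised impurity (an assumption, not the paper's criterion).

    Mixes the mean target variance over labeled examples with the mean
    feature variance over all examples, weighted by w.
    """
    feature_var = X.var(axis=0).mean() if len(X) else 0.0
    Y_lab = Y[labeled]
    target_var = Y_lab.var(axis=0).mean() if len(Y_lab) else 0.0
    return w * target_var + (1.0 - w) * feature_var

# Toy data: four examples, two features; only the first two are labeled.
X = np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 0.0], [0.0, 2.0]])
Y = np.array([[0.0], [1.0], [np.nan], [np.nan]])
labeled = np.array([True, True, False, False])

left_axis = axis_parallel_split(X, feature=0, threshold=0.5)
left_oblique = oblique_split(X, weights=np.array([1.0, 1.0]), bias=-1.5)
score = ssl_impurity(X, Y, labeled, w=0.5)
```

On this toy data, no single-feature threshold isolates the first example from the other three, while the oblique test `x1 + x2 <= 1.5` does. Evaluating the oblique test is one matrix-vector product, so its cost grows linearly with the number of features.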

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa33/8101547/0a3788743a64/peerj-cs-07-506-g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa33/8101547/51220cd000a9/peerj-cs-07-506-g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa33/8101547/468aecc9a3ee/peerj-cs-07-506-g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa33/8101547/88a8dbb22b4f/peerj-cs-07-506-g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa33/8101547/41a425e21c63/peerj-cs-07-506-g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/aa33/8101547/9a18e17e5a70/peerj-cs-07-506-g006.jpg

Similar articles

1
Semi-supervised oblique predictive clustering trees.
PeerJ Comput Sci. 2021 May 3;7:e506. doi: 10.7717/peerj-cs.506. eCollection 2021.
2
A Cluster-then-label Semi-supervised Learning Approach for Pathology Image Classification.
Sci Rep. 2018 May 8;8(1):7193. doi: 10.1038/s41598-018-24876-0.
3
A distributed semi-supervised learning algorithm based on manifold regularization using wavelet neural network.
Neural Netw. 2019 Oct;118:300-309. doi: 10.1016/j.neunet.2018.10.014. Epub 2018 Nov 14.
4
Survival analysis with semi-supervised predictive clustering trees.
Comput Biol Med. 2022 Feb;141:105001. doi: 10.1016/j.compbiomed.2021.105001. Epub 2021 Nov 3.
5
ℓ-norm based safe semi-supervised learning.
Math Biosci Eng. 2021 Sep 7;18(6):7727-7742. doi: 10.3934/mbe.2021383.
6
Comprehensive study of semi-supervised learning for DNA methylation-based supervised classification of central nervous system tumors.
BMC Bioinformatics. 2022 Jun 8;23(1):223. doi: 10.1186/s12859-022-04764-1.
7
Semi Supervised Learning with Deep Embedded Clustering for Image Classification and Segmentation.
IEEE Access. 2019;7:11093-11104. doi: 10.1109/ACCESS.2019.2891970. Epub 2019 Jan 9.
8
Cross-Domain Semi-Supervised Learning Using Feature Formulation.
IEEE Trans Syst Man Cybern B Cybern. 2011 Dec;41(6):1627-38. doi: 10.1109/TSMCB.2011.2157999. Epub 2011 Jun 27.
9
Computerized breast cancer analysis system using three stage semi-supervised learning method.
Comput Methods Programs Biomed. 2016 Oct;135:77-88. doi: 10.1016/j.cmpb.2016.07.017. Epub 2016 Jul 8.
10
Generalizability of Self-Supervised Training Models for Digital Pathology: A Multicountry Comparison in Colorectal Cancer.
JCO Clin Cancer Inform. 2023 Sep;7:e2200178. doi: 10.1200/CCI.22.00178.
