
Simple strategies for semi-supervised feature selection.

Authors

Sechidis Konstantinos, Brown Gavin

Affiliation

School of Computer Science, University of Manchester, Manchester, M13 9PL UK.

Publication details

Mach Learn. 2018;107(2):357-395. doi: 10.1007/s10994-017-5648-2. Epub 2017 Jul 17.

DOI: 10.1007/s10994-017-5648-2
PMID: 31983804
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6954040/

Abstract

What is the simplest thing you can do to solve a problem? In the context of semi-supervised feature selection, we tackle exactly this: how much we can gain from two simple strategies. If we have some binary labelled data and some unlabelled, we could assume the unlabelled data are all positives, or assume them all negatives. These minimalist, seemingly naive, approaches have not previously been studied in depth. However, with theoretical and empirical studies, we show they provide powerful results for feature selection, via hypothesis testing and feature ranking. Combining them with some "soft" prior knowledge of the domain, we derive two novel algorithms (-JMI, -IAMB) that outperform significantly more complex competing methods, showing particularly good performance when the labels are missing-not-at-random. We conclude that simple approaches to this problem can work surprisingly well, and in many situations we can provably recover the exact feature selection dynamics.
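
The two strategies described in the abstract can be illustrated with a small sketch. It is not the authors' algorithms: it only shows the surrogate-labelling step (treat every unlabelled example as negative, or as positive) followed by a plain univariate mutual-information ranking. The function names, the use of `None` for a missing label, and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in estimate of I(X;Y) in nats for two discrete sequences."""
    n = len(xs)
    joint = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x,y) * log(p(x,y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def surrogate_labels(labels, fill):
    """Replace missing labels (None) with `fill`: 0 = all negative, 1 = all positive."""
    return [fill if y is None else y for y in labels]


def rank_features(columns, labels, fill):
    """Rank features by mutual information with the surrogate label."""
    y_tilde = surrogate_labels(labels, fill)
    scores = {name: mutual_information(col, y_tilde) for name, col in columns.items()}
    order = sorted(scores, key=scores.get, reverse=True)
    return order, scores


# Hypothetical toy data: half of the binary labels are missing (None).
labels = [1, 1, None, None, 0, 0, None, None]
columns = {
    "informative": [1, 1, 1, 1, 0, 0, 0, 0],
    "noise": [0, 1, 0, 1, 0, 1, 0, 1],
}

order_neg, scores_neg = rank_features(columns, labels, fill=0)  # unlabelled -> negative
order_pos, scores_pos = rank_features(columns, labels, fill=1)  # unlabelled -> positive
print(order_neg, order_pos)
```

On this toy data both surrogate labellings place `informative` ahead of `noise`. Which of the two assumptions is preferable in practice, and how prior domain knowledge enters, is the subject of the paper and is not captured by this sketch.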


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ec1/6954040/2908561b2843/10994_2017_5648_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ec1/6954040/3047d6b57740/10994_2017_5648_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ec1/6954040/3613201cdf1a/10994_2017_5648_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ec1/6954040/1b1d5916be8f/10994_2017_5648_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ec1/6954040/9a639ecd93fe/10994_2017_5648_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ec1/6954040/2daf0f6d09ab/10994_2017_5648_Fig6_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ec1/6954040/771825e22d7d/10994_2017_5648_Fig7_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ec1/6954040/0afe0ff7714a/10994_2017_5648_Fig8_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ec1/6954040/14917acede6c/10994_2017_5648_Fig9_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ec1/6954040/250136d98187/10994_2017_5648_Fig10_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ec1/6954040/93ea04f79124/10994_2017_5648_Fig11_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ec1/6954040/99d5d172f978/10994_2017_5648_Fig12_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ec1/6954040/71477a606f95/10994_2017_5648_Fig13_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/9ec1/6954040/b173ebde2345/10994_2017_5648_Fig14_HTML.jpg
