Suppr 超能文献


Gauge-Optimal Approximate Learning for Small Data Classification.

Authors

Vecchi Edoardo, Bassetti Davide, Graziato Fabio, Pospíšil Lukáš, Horenko Illia

Affiliations

Università della Svizzera Italiana, Faculty of Informatics, Institute of Computing, 6962 Lugano, Switzerland

Technical University of Kaiserslautern, Faculty of Mathematics, Group of Mathematics of AI, 67663 Kaiserslautern, Germany

Publication

Neural Comput. 2024 May 10;36(6):1198-1227. doi: 10.1162/neco_a_01664.

DOI: 10.1162/neco_a_01664
PMID: 38669692
Abstract

Small data learning problems are characterized by a significant discrepancy between the limited number of response variable observations and the large feature space dimension. In this setting, the common learning tools struggle to identify the features important for the classification task from those that bear no relevant information and cannot derive an appropriate learning rule that allows discriminating among different classes. As a potential solution to this problem, here we exploit the idea of reducing and rotating the feature space in a lower-dimensional gauge and propose the gauge-optimal approximate learning (GOAL) algorithm, which provides an analytically tractable joint solution to the dimension reduction, feature segmentation, and classification problems for small data learning problems. We prove that the optimal solution of the GOAL algorithm consists in piecewise-linear functions in the Euclidean space and that it can be approximated through a monotonically convergent algorithm that presents-under the assumption of a discrete segmentation of the feature space-a closed-form solution for each optimization substep and an overall linear iteration cost scaling. The GOAL algorithm has been compared to other state-of-the-art machine learning tools on both synthetic data and challenging real-world applications from climate science and bioinformatics (i.e., prediction of the El Niño Southern Oscillation and inference of epigenetically induced gene-activity networks from limited experimental data). The experimental results show that the proposed algorithm outperforms the reported best competitors for these problems in both learning performance and computational cost.
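The abstract's key structural claim — that a joint dimension-reduction, feature-segmentation, and classification objective can be minimized by alternating substeps, each admitting a closed form under a discrete segmentation of the feature space, with a monotonically non-increasing loss — can be illustrated with a toy alternating scheme. The sketch below is an illustrative analogue only, not the authors' GOAL implementation; the squared loss, the segment-averaging reduction, and all names are assumptions made for the example.

```python
import numpy as np

def goal_like_sketch(X, y, K=3, n_iter=10, ridge=1e-6, seed=0):
    """Toy alternating scheme (illustrative only, not the GOAL algorithm):
    - each of the d features is assigned to one of K discrete segments;
    - features are averaged within each segment (a crude low-dimensional gauge);
    - a linear rule on the reduced representation is refit in closed form;
    - segment assignments are then updated greedily with the rule fixed.
    Each substep cannot increase the squared loss (up to the tiny ridge term),
    so the recorded losses are essentially monotonically non-increasing."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    seg = rng.integers(0, K, size=d)  # feature -> segment labels

    def reduce(seg):
        # Average the features belonging to each segment; append a bias column.
        Z = np.zeros((n, K))
        for k in range(K):
            m = seg == k
            if m.any():
                Z[:, k] = X[:, m].mean(axis=1)
        return np.hstack([Z, np.ones((n, 1))])

    losses = []
    for _ in range(n_iter):
        A = reduce(seg)
        # Closed-form (ridge-regularized) least-squares substep for the rule.
        w = np.linalg.solve(A.T @ A + ridge * np.eye(K + 1), A.T @ y)
        losses.append(np.mean((A @ w - y) ** 2))
        # Discrete substep: move each feature to the segment that most
        # reduces the loss while the linear rule w is held fixed.
        for j in range(d):
            cand = [(np.mean((reduce(np.where(np.arange(d) == j, k, seg))
                              @ w - y) ** 2), k)
                    for k in range(K)]
            seg[j] = min(cand)[1]
    return seg, w, losses
```

Because the greedy segment update always keeps the current assignment among its candidates, neither substep can raise the loss, which mirrors (in a simplified setting) the monotone convergence property stated in the abstract.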


Similar Articles

1. Gauge-Optimal Approximate Learning for Small Data Classification.
Neural Comput. 2024 May 10;36(6):1198-1227. doi: 10.1162/neco_a_01664.
2. On a Scalable Entropic Breaching of the Overfitting Barrier for Small Data Problems in Machine Learning.
Neural Comput. 2020 Aug;32(8):1563-1579. doi: 10.1162/neco_a_01296. Epub 2020 Jun 10.
3. eSPA+: Scalable Entropy-Optimal Machine Learning Classification for Small Data Problems.
Neural Comput. 2022 Apr 15;34(5):1220-1255. doi: 10.1162/neco_a_01490.
4. On cheap entropy-sparsified regression learning.
Proc Natl Acad Sci U S A. 2023 Jan 3;120(1):e2214972120. doi: 10.1073/pnas.2214972120. Epub 2022 Dec 29.
5. Kernel-imbedded Gaussian processes for disease classification using microarray gene expression data.
BMC Bioinformatics. 2007 Feb 28;8:67. doi: 10.1186/1471-2105-8-67.
6. An Innovative Excited-ACS-IDGWO Algorithm for Optimal Biomedical Data Feature Selection.
Biomed Res Int. 2020 Aug 17;2020:8506365. doi: 10.1155/2020/8506365. eCollection 2020.
7. Optimizing multimodal feature selection using binary reinforced cuckoo search algorithm for improved classification performance.
PeerJ Comput Sci. 2024 Jan 29;10:e1816. doi: 10.7717/peerj-cs.1816. eCollection 2024.
8. Asynchronous Parallel Large-Scale Gaussian Process Regression.
IEEE Trans Neural Netw Learn Syst. 2024 Jun;35(6):8683-8694. doi: 10.1109/TNNLS.2022.3200602. Epub 2024 Jun 3.
9. Feature Selection by Hybrid Brain Storm Optimization Algorithm for COVID-19 Classification.
J Comput Biol. 2022 Jun;29(6):515-529. doi: 10.1089/cmb.2021.0256. Epub 2022 Apr 19.
10. Integrated Evolutionary Learning: An Artificial Intelligence Approach to Joint Learning of Features and Hyperparameters for Optimized, Explainable Machine Learning.
Front Artif Intell. 2022 Apr 5;5:832530. doi: 10.3389/frai.2022.832530. eCollection 2022.

Cited By

1. On Entropic Learning from Noisy Time Series in the Small Data Regime.
Entropy (Basel). 2024 Jun 28;26(7):553. doi: 10.3390/e26070553.