Cited by

1
A comprehensive review of machine learning algorithms and their application in geriatric medicine: present and future.
Aging Clin Exp Res. 2023 Nov;35(11):2363-2397. doi: 10.1007/s40520-023-02552-2. Epub 2023 Sep 8.
2
Recognizing the Differentiation Degree of Human Induced Pluripotent Stem Cell-Derived Retinal Pigment Epithelium Cells Using Machine Learning and Deep Learning-Based Approaches.
Cells. 2023 Jan 4;12(2):211. doi: 10.3390/cells12020211.

Clustering high-dimensional data via feature selection.

Affiliations

Google Research, New York, New York, USA.

Two Sigma Investments, New York, New York, USA.

Publication information

Biometrics. 2023 Jun;79(2):940-950. doi: 10.1111/biom.13665. Epub 2022 Apr 22.

DOI: 10.1111/biom.13665
PMID: 35338489
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10119907/
Abstract

High-dimensional clustering analysis is a challenging problem in statistics and machine learning, with broad applications such as the analysis of microarray data and RNA-seq data. In this paper, we propose a new clustering procedure called spectral clustering with feature selection (SC-FS), where we first obtain an initial estimate of labels via spectral clustering, then select a small fraction of features with the largest R-squared with these labels, that is, the proportion of variation explained by group labels, and conduct clustering again using selected features. Under mild conditions, we prove that the proposed method identifies all informative features with high probability and achieves the minimax optimal clustering error rate for the sparse Gaussian mixture model. Applications of SC-FS to four real-world datasets demonstrate its usefulness in clustering high-dimensional data.
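
The three steps named in the abstract (initial spectral clustering, R-squared feature screening, re-clustering on the selected features) can be sketched roughly as follows. This is an illustrative toy and not the authors' implementation. It assumes exactly two clusters, and it stands in for the spectral step with a sign split on the leading left singular vector of the centered data, computed by power iteration. The function names `two_way_spectral`, `r_squared`, `sc_fs`, and `make_toy_data`, the parameter `n_keep`, and the simulated data are all hypothetical choices made for this sketch.

```python
import random


def two_way_spectral(X, iters=50, seed=0):
    """Assumed stand-in for the spectral step: split the rows of X into two
    groups by the sign of the leading left singular vector of the
    column-centered data, found by power iteration."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    Z = [[row[j] - means[j] for j in range(p)] for row in X]
    rng = random.Random(seed)
    u = [rng.gauss(0.0, 1.0) for _ in range(n)]
    for _ in range(iters):
        v = [sum(Z[i][j] * u[i] for i in range(n)) for j in range(p)]  # v = Z^T u
        u = [sum(Z[i][j] * v[j] for j in range(p)) for i in range(n)]  # u = Z v
        norm = sum(x * x for x in u) ** 0.5
        u = [x / norm for x in u]
    return [1 if x > 0 else 0 for x in u]


def r_squared(column, labels):
    """Proportion of one feature's variation explained by the group labels."""
    n = len(column)
    grand_mean = sum(column) / n
    total = sum((x - grand_mean) ** 2 for x in column)
    if total == 0:
        return 0.0
    within = 0.0
    for group in set(labels):
        members = [x for x, lab in zip(column, labels) if lab == group]
        group_mean = sum(members) / len(members)
        within += sum((x - group_mean) ** 2 for x in members)
    return 1.0 - within / total


def sc_fs(X, n_keep):
    """Sketch of the pipeline: cluster, keep the n_keep features with the
    largest R-squared against the initial labels, then cluster again."""
    initial_labels = two_way_spectral(X)
    p = len(X[0])
    scores = [r_squared([row[j] for row in X], initial_labels) for j in range(p)]
    keep = sorted(range(p), key=lambda j: scores[j], reverse=True)[:n_keep]
    X_selected = [[row[j] for j in keep] for row in X]
    return two_way_spectral(X_selected), sorted(keep)


def make_toy_data(n=60, p=100, n_informative=5, shift=2.0, seed=1):
    """Simulated two-group Gaussian data; only the first n_informative
    features differ in mean between the groups (an assumption for the demo)."""
    rng = random.Random(seed)
    truth = [0] * (n // 2) + [1] * (n - n // 2)
    X = []
    for label in truth:
        center = shift if label == 1 else -shift
        X.append([rng.gauss(center if j < n_informative else 0.0, 1.0)
                  for j in range(p)])
    return X, truth


if __name__ == "__main__":
    X, truth = make_toy_data()
    labels, kept = sc_fs(X, n_keep=5)
    print("selected features:", kept)
```

In this sketch the number of retained features, `n_keep`, is supplied by hand; the abstract says only that a small fraction of features is selected, so how that fraction is chosen is left open here. Cluster labels are arbitrary up to swapping 0 and 1, so any comparison against known groups should allow for that.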
