Google Research, New York, New York, USA.

Two Sigma Investments, New York, New York, USA.

Biometrics. 2023 Jun;79(2):940-950. doi: 10.1111/biom.13665. Epub 2022 Apr 22.

High-dimensional clustering analysis is a challenging problem in statistics and machine learning, with broad applications such as the analysis of microarray data and RNA-seq data. In this paper, we propose a new clustering procedure called spectral clustering with feature selection (SC-FS), where we first obtain an initial estimate of labels via spectral clustering, then select a small fraction of features with the largest R-squared with these labels, that is, the proportion of variation explained by group labels, and conduct clustering again using selected features. Under mild conditions, we prove that the proposed method identifies all informative features with high probability and achieves the minimax optimal clustering error rate for the sparse Gaussian mixture model. Applications of SC-FS to four real-world datasets demonstrate its usefulness in clustering high-dimensional data.
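The three steps described in the abstract (initial spectral clustering, ranking features by R-squared against the estimated labels, and re-clustering on the selected features) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the spectral step is approximated here by k-means on the leading singular vectors of the column-centered data matrix, the second clustering pass reuses the same routine, and the function names (`kmeans`, `spectral_labels`, `feature_r_squared`, `sc_fs`) and the `n_selected` parameter are hypothetical.

```python
import numpy as np


def kmeans(X, k, n_iter=50, n_init=10, seed=0):
    """Plain Lloyd's k-means with random restarts; returns the best labels found."""
    rng = np.random.default_rng(seed)
    best_labels, best_cost = None, np.inf
    for _ in range(n_init):
        centers = X[rng.choice(len(X), size=k, replace=False)]
        for _ in range(n_iter):
            dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = dists.argmin(axis=1)
            # Keep the old center if a cluster becomes empty.
            new_centers = np.array([
                X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
                for j in range(k)
            ])
            if np.allclose(new_centers, centers):
                break
            centers = new_centers
        dists = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = dists.argmin(axis=1)
        cost = dists.min(axis=1).sum()
        if cost < best_cost:
            best_labels, best_cost = labels, cost
    return best_labels


def spectral_labels(X, k, seed=0):
    """Assumed spectral step: k-means on the top-k singular directions."""
    Xc = X - X.mean(axis=0)
    U, S, _ = np.linalg.svd(Xc, full_matrices=False)
    embedding = U[:, :k] * S[:k]
    return kmeans(embedding, k, seed=seed)


def feature_r_squared(X, labels):
    """Per-feature proportion of variation explained by the group labels."""
    total = ((X - X.mean(axis=0)) ** 2).sum(axis=0)
    within = np.zeros(X.shape[1])
    for g in np.unique(labels):
        Xg = X[labels == g]
        within += ((Xg - Xg.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - within / np.maximum(total, 1e-12)


def sc_fs(X, k, n_selected, seed=0):
    """Sketch of SC-FS: cluster, select features by R-squared, cluster again."""
    initial = spectral_labels(X, k, seed=seed)
    r2 = feature_r_squared(X, initial)
    selected = np.sort(np.argsort(r2)[-n_selected:])
    final = spectral_labels(X[:, selected], k, seed=seed)
    return final, selected
```

Calling `sc_fs(X, k, n_selected)` on an `n × p` array returns the final cluster labels and the indices of the retained features. How many features to keep is left to the caller in this sketch; the abstract says only that a small fraction is selected.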

Clustering high-dimensional data via feature selection.
