Principal Component Analysis Based on Graph Laplacian and Double Sparse Constraints for Feature Selection and Sample Clustering on Multi-View Data.

Authors

Wu Ming-Juan, Gao Ying-Lian, Liu Jin-Xing, Zhu Rong, Wang Juan

Affiliations

School of Information Science and Engineering, Qufu Normal University, Rizhao, China.

Library of Qufu Normal University, Qufu Normal University, Rizhao, China.

Publication information

Hum Hered. 2019;84(1):47-58. doi: 10.1159/000501653. Epub 2019 Aug 29.

DOI: 10.1159/000501653
PMID: 31466072
Abstract

Principal component analysis (PCA) is a widely used method for evaluating low-dimensional data. Several variants of PCA have been proposed to improve the interpretability of the principal components (PCs). One of the most common is sparse PCA, which seeks a sparse basis that is easier to interpret than the dense basis of PCA. However, the performance of these improved methods is still far from satisfactory because the data still contain redundant PCs. In this paper, a novel method called PCA based on graph Laplacian and double sparse constraints (GDSPCA) is proposed to improve the interpretability of the PCs and to take the internal geometry of the data into account. In detail, GDSPCA uses L2,1-norm and L1-norm regularization terms simultaneously to enforce sparsity in the matrix by filtering out redundant and irrelevant PCs: the L2,1-norm term produces row sparsity, while the L1-norm term enforces element-wise sparsity. In this way, the new PCs in the low-dimensional subspace can be interpreted more easily. Meanwhile, GDSPCA integrates the graph Laplacian into PCA to explore the geometric structure hidden in the data. A simple and effective optimization solution is provided. Extensive experiments on multi-view biological data demonstrate the feasibility and effectiveness of the proposed approach.
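
The abstract names three ingredients (an L2,1-norm penalty, an L1-norm penalty, and a graph Laplacian) but does not give the GDSPCA objective or its update rules. The sketch below is therefore not the authors' algorithm. It only illustrates, under stated assumptions, how the two norms differ in the sparsity they induce and how a graph Laplacian measures smoothness over a sample graph. The function names, the binary k-nearest-neighbor graph, and the toy matrices `M` and `X` are all assumptions made for illustration.

```python
import numpy as np


def l1_norm(M):
    """L1 norm: sum of absolute values of all entries."""
    return np.abs(M).sum()


def l21_norm(M):
    """L2,1 norm: sum of the Euclidean norms of the rows."""
    return np.linalg.norm(M, axis=1).sum()


def soft_threshold(M, t):
    """Proximal operator of t * L1: shrinks each entry toward zero independently."""
    return np.sign(M) * np.maximum(np.abs(M) - t, 0.0)


def row_soft_threshold(M, t):
    """Proximal operator of t * L2,1: shrinks each row as a whole."""
    norms = np.linalg.norm(M, axis=1, keepdims=True)
    scale = np.maximum(1.0 - t / np.maximum(norms, 1e-12), 0.0)
    return M * scale


def knn_graph_laplacian(X, k=2):
    """Unnormalized Laplacian L = D - W of a symmetrized binary k-NN graph.

    Rows of X are samples. The binary k-NN weighting is an assumption;
    the abstract does not say how the graph is built.
    """
    n = X.shape[0]
    sq = ((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)
    np.fill_diagonal(sq, np.inf)              # a sample is not its own neighbor
    idx = np.argsort(sq, axis=1)[:, :k]       # k nearest neighbors per sample
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), idx.ravel()] = 1.0
    W = np.maximum(W, W.T)                    # make the graph undirected
    return np.diag(W.sum(axis=1)) - W


def smoothness(Q, L):
    """tr(Q^T L Q): small when samples connected in the graph have similar rows in Q."""
    return np.trace(Q.T @ L @ Q)


# Toy matrix: row 0 has one large and one small entry, row 1 is small overall.
M = np.array([[3.0, 0.2], [0.3, 0.1], [1.0, -2.0]])
print("L1 norm:", l1_norm(M), " L2,1 norm:", l21_norm(M))
print("element-wise shrinkage:\n", soft_threshold(M, 0.5))
print("row-wise shrinkage:\n", row_soft_threshold(M, 0.5))

# Toy data: 6 samples with 3 features each.
X = np.random.default_rng(0).normal(size=(6, 3))
L = knn_graph_laplacian(X, k=2)
print("graph smoothness of X:", smoothness(X, L))
```

In this toy case the element-wise operator zeroes the small entry in row 0 while keeping the large one, whereas the row-wise operator keeps row 0 entirely and removes only row 1. That difference is what the abstract means by element sparsity versus row sparsity. For a symmetric weight matrix W, the trace term equals half the weighted sum of squared distances between connected rows, which is why adding it to an objective favors embeddings that respect the graph.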

Similar articles

1. Principal Component Analysis Based on Graph Laplacian and Double Sparse Constraints for Feature Selection and Sample Clustering on Multi-View Data.
   Hum Hered. 2019;84(1):47-58. doi: 10.1159/000501653. Epub 2019 Aug 29.
2. PCA via joint graph Laplacian and sparse constraint: Identification of differentially expressed genes and sample clustering on gene expression data.
   BMC Bioinformatics. 2019 Dec 30;20(Suppl 22):716. doi: 10.1186/s12859-019-3229-z.
3. Robust Principal Component Analysis Based On Hypergraph Regularization for Sample Clustering and Co-Characteristic Gene Selection.
   IEEE/ACM Trans Comput Biol Bioinform. 2022 Jul-Aug;19(4):2420-2430. doi: 10.1109/TCBB.2021.3065054. Epub 2022 Aug 8.
4. Joint Lp-Norm and L-Norm Constrained Graph Laplacian PCA for Robust Tumor Sample Clustering and Gene Network Module Discovery.
   Front Genet. 2021 Feb 23;12:621317. doi: 10.3389/fgene.2021.621317. eCollection 2021.
5. PCA Based on Graph Laplacian Regularization and P-Norm for Gene Selection and Clustering.
   IEEE Trans Nanobioscience. 2017 Jun;16(4):257-265. doi: 10.1109/TNB.2017.2690365. Epub 2017 Mar 31.
6. Joint L-norm and random walk graph constrained PCA for single-cell RNA-seq data.
   Comput Methods Biomech Biomed Engin. 2024 Jan-Mar;27(4):498-511. doi: 10.1080/10255842.2023.2188106. Epub 2023 Mar 13.
7. Robust hypergraph regularized non-negative matrix factorization for sample clustering and feature selection in multi-view gene expression data.
   Hum Genomics. 2019 Oct 22;13(Suppl 1):46. doi: 10.1186/s40246-019-0222-6.
8. DSTPCA: Double-Sparse Constrained Tensor Principal Component Analysis Method for Feature Selection.
   IEEE/ACM Trans Comput Biol Bioinform. 2021 Jul-Aug;18(4):1481-1491. doi: 10.1109/TCBB.2019.2943459. Epub 2021 Aug 6.
9. Enhancing Characteristic Gene Selection and Tumor Classification by the Robust Laplacian Supervised Discriminative Sparse PCA.
   J Chem Inf Model. 2022 Apr 11;62(7):1794-1807. doi: 10.1021/acs.jcim.1c01403. Epub 2022 Mar 30.
10. Joint -Norm Constraint and Graph-Laplacian PCA Method for Feature Extraction.
    Biomed Res Int. 2017;2017:5073427. doi: 10.1155/2017/5073427. Epub 2017 Apr 2.

Cited by

1. Accurate identification of single-cell types via correntropy-based Sparse PCA combining hypergraph and fusion similarity.
   J Appl Stat. 2024 Jul 21;52(2):356-380. doi: 10.1080/02664763.2024.2369955. eCollection 2025.