Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems.

Affiliation

Queensland Facility for Advanced Bioinformatics, University of Queensland, 4072 St Lucia, QLD, Australia.

Publication information

BMC Bioinformatics. 2011 Jun 22;12:253. doi: 10.1186/1471-2105-12-253.

DOI: 10.1186/1471-2105-12-253
PMID: 21693065
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3133555/

Abstract

BACKGROUND

Variable selection on high-throughput biological data, such as gene expression or single nucleotide polymorphisms (SNPs), is essential for selecting relevant information and, therefore, for better characterizing diseases or assessing genetic structure. There are different ways to perform variable selection in large data sets. Statistical tests are commonly used to identify differentially expressed features for explanatory purposes, whereas Machine Learning wrapper approaches can be used for predictive purposes. In the case of multiple highly correlated variables, another option is to use multivariate exploratory approaches to give more insight into cell biology, biological pathways or complex traits.

RESULTS

A simple extension of a sparse PLS exploratory approach is proposed to perform variable selection in a multiclass classification framework.

CONCLUSIONS

sPLS-DA has a classification performance similar to other wrapper or sparse discriminant analysis approaches on public microarray and SNP data sets. More importantly, sPLS-DA is clearly competitive in terms of computational efficiency and superior in terms of interpretability of the results via valuable graphical outputs. sPLS-DA is available in the R package mixOmics, which is dedicated to the analysis of large biological data sets.
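
The abstract does not spell out the algorithm, so the following is only a minimal sketch of how one sparse PLS-DA component might be computed, under stated assumptions: class labels are coded as a centred indicator (dummy) matrix, the leading singular vectors of the feature-by-class cross-product matrix are refined iteratively, and the feature loading vector is soft-thresholded so that a chosen number of variables keep a non-zero weight. The function names `soft_threshold` and `sparse_plsda_component`, the `keep` parameter, and the synthetic data are all hypothetical. This is not the mixOmics API (mixOmics is an R package), and the paper's exact procedure may differ.

```python
import numpy as np


def soft_threshold(v, lam):
    """Shrink each entry of v toward zero by lam; entries at or below lam become 0."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def sparse_plsda_component(X, labels, keep, n_iter=100, tol=1e-8):
    """One sparse PLS-DA component (illustrative sketch, not the mixOmics API).

    X      : (n_samples, n_features) data matrix
    labels : length n_samples array of class labels
    keep   : number of features allowed a non-zero loading (assumed parameter)
    Returns (feature loading vector u, sample scores t, class names).
    """
    X = np.asarray(X, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)

    # Code the class labels as a centred indicator (dummy) matrix.
    Y = (labels[:, None] == classes[None, :]).astype(float)
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)

    # Cross-product matrix between features and classes.
    M = Xc.T @ Yc

    # Start from the leading singular vectors of M.
    U, _, Vt = np.linalg.svd(M, full_matrices=False)
    u, v = U[:, 0], Vt[0]

    for _ in range(n_iter):
        u_old = u
        a = M @ v
        # Choose the threshold so that `keep` loadings survive (barring ties).
        mags = np.sort(np.abs(a))[::-1]
        lam = mags[keep] if keep < a.size else 0.0
        u = soft_threshold(a, lam)
        norm = np.linalg.norm(u)
        if norm == 0:
            raise ValueError("all loadings were thresholded to zero")
        u = u / norm
        # Update the class-side vector; it is left unpenalised in this sketch.
        v = M.T @ u
        v = v / np.linalg.norm(v)
        if np.linalg.norm(u - u_old) < tol:
            break

    return u, Xc @ u, classes


# Synthetic three-class example: only features 0, 1 and 2 carry signal,
# all of them shifted upward in class "A".
rng = np.random.default_rng(0)
labels = np.repeat(["A", "B", "C"], 20)
X = rng.normal(size=(60, 50))
X[labels == "A", :3] += 3.0

u, t, classes = sparse_plsda_component(X, labels, keep=3)
selected = np.flatnonzero(u)
print("selected features:", selected)
```

In this toy setting the three shifted features are the ones expected to be kept, and the scores `t` separate class "A" from the other two. The sign of `u` (and therefore of `t`) is arbitrary, as with any SVD-based component. Further components would require a deflation step, which is omitted here.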


Figures 1-9 (images from the PMC full text):
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/28b1/3133555/77166101138e/1471-2105-12-253-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/28b1/3133555/f01df15629fb/1471-2105-12-253-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/28b1/3133555/4c1fd153d845/1471-2105-12-253-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/28b1/3133555/c4defe709e75/1471-2105-12-253-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/28b1/3133555/44b18d5b677f/1471-2105-12-253-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/28b1/3133555/fc576d531e06/1471-2105-12-253-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/28b1/3133555/8b3f7187d74e/1471-2105-12-253-7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/28b1/3133555/e1a9a60656a9/1471-2105-12-253-8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/28b1/3133555/94f52d8a50a2/1471-2105-12-253-9.jpg

