So you think you can PLS-DA?

Affiliations

Bioinformatics Research Group (BioRG), Florida International University, 11200 SW 8th St, Miami, 33199, FL, USA.

Department of Epidemiology, Florida International University, 11200 SW 8th St, Miami, 33199, FL, USA.

Publication information

BMC Bioinformatics. 2020 Dec 9;21(Suppl 1):2. doi: 10.1186/s12859-019-3310-7.

Abstract

BACKGROUND

Partial Least-Squares Discriminant Analysis (PLS-DA) is a popular machine learning tool that is gaining attention as a feature selector and classifier. To understand its strengths and weaknesses, we performed a series of experiments with synthetic data and compared its performance to that of Principal Component Analysis (PCA), the closely related method from which it was originally derived.

RESULTS

We demonstrate that even though PCA ignores the information regarding the class labels of the samples, this unsupervised tool can be remarkably effective as a feature selector. In some cases, it outperforms PLS-DA, which is made aware of the class labels in its input. Our experiments range from looking at the signal-to-noise ratio in the feature selection task, to considering many practical distributions and models encountered when analyzing bioinformatics and clinical data. Other methods were also evaluated. Finally, we analyzed an interesting data set from 396 vaginal microbiome samples where the ground truth for the feature selection was available. All the 3D figures shown in this paper as well as the supplementary ones can be viewed interactively at http://biorg.cs.fiu.edu/plsda.

CONCLUSIONS

Our results highlighted the strengths and weaknesses of PLS-DA in comparison with PCA for different underlying data models.
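The comparison described in the abstract can be illustrated with a small toy experiment. The sketch below is not the authors' code or experimental design; it is a minimal NumPy illustration under assumed settings (200 samples, 50 Gaussian features, 5 signal features shifted by 3 in one class, all hypothetical values). It ranks features by the absolute value of the first PCA loading vector and of the first PLS-DA weight vector, then counts how many of the known signal features each ranking recovers. For a two-class problem with one dummy-coded response, the first PLS weight vector is proportional to the product of the centered data matrix (transposed) and the centered label vector, which is what `plsda_first_weight` computes. The helper names are introduced here for illustration only.

```python
import numpy as np


def make_data(n=200, p=50, k=5, shift=3.0, seed=0):
    """Toy data (assumed setup): p Gaussian features, the first k shifted in class 1."""
    rng = np.random.default_rng(seed)
    y = np.repeat([0.0, 1.0], n // 2)   # balanced two-class labels
    X = rng.normal(size=(n, p))
    X[y == 1, :k] += shift              # only the first k features carry signal
    return X, y, set(range(k))


def pca_first_loading(X):
    """First principal axis of the column-centered data (labels are not used)."""
    Xc = X - X.mean(axis=0)
    _, _, vt = np.linalg.svd(Xc, full_matrices=False)
    return vt[0]


def plsda_first_weight(X, y):
    """First PLS weight vector for a single dummy-coded response: w ~ Xc^T yc."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = Xc.T @ yc
    return w / np.linalg.norm(w)


def top_features(weights, k):
    """Indices of the k features with the largest absolute weight."""
    return set(np.argsort(-np.abs(weights))[:k].tolist())


X, y, signal = make_data()
pca_hits = len(top_features(pca_first_loading(X), len(signal)) & signal)
pls_hits = len(top_features(plsda_first_weight(X, y), len(signal)) & signal)
print(f"signal features recovered: PCA {pca_hits}/5, PLS-DA {pls_hits}/5")
```

In this easy configuration the class shift also dominates the overall variance, so the unsupervised PCA ranking can be expected to do about as well as the supervised one. Lowering `shift` or adding high-variance noise features is one way to probe where the two methods start to diverge.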
