Multiblock variable influence on orthogonal projections (MB-VIOP) for enhanced interpretation of total, global, local and unique variations in OnPLS models.

Affiliations

Department of Chemistry, Computational Life Science Cluster (CLiC), Umeå University, Umeå, Sweden.

Industrial Doctoral School (IDS), Umeå, Sweden.

Publication information

BMC Bioinformatics. 2021 Apr 3;22(1):176. doi: 10.1186/s12859-021-04015-9.

DOI: 10.1186/s12859-021-04015-9
PMID: 33812384
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC8019512/
Abstract

BACKGROUND

For multivariate data analysis involving only two input matrices (e.g., X and Y), the previously published methods for variable influence on projection (e.g., VIP_OPLS or VIP_O2PLS) are widely used for variable selection, including (i) variable importance assessment, (ii) dimensionality reduction of big data and (iii) interpretation enhancement of PLS, OPLS and O2PLS models. For multiblock analysis, OnPLS models find relationships among multiple data matrices (more than two blocks) by calculating latent variables; however, a method for improving the interpretation of these latent variables (model components) by assessing the importance of the input variables has not been available until now.

RESULTS

A method for variable selection in multiblock analysis, called multiblock variable influence on orthogonal projections (MB-VIOP), is explained in this paper. MB-VIOP is a model-based variable selection method that uses the data matrices, the scores and the normalized loadings of an OnPLS model to sort the input variables of more than two data matrices according to their importance for both simplification and interpretation of the total multiblock model, and also of the unique, local and global model components separately. MB-VIOP has been tested using three datasets: a synthetic four-block dataset, a real three-block omics dataset related to plant sciences, and a real six-block dataset related to the food industry.
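As a rough illustration of how scores and normalized loadings can be turned into a variable ranking, the sketch below computes a generic VIP-style importance for the variables of a single block. It is not the MB-VIOP formula from the paper, which also uses the data matrices and treats global, local and unique components of several blocks separately. The function names `vip_style_scores` and `rank_variables`, the weighting by per-component sum of squares, and the toy numbers are all assumptions made for illustration.

```python
import math


def vip_style_scores(scores, loadings):
    """Generic VIP-style importance for one block (illustrative assumption,
    not the published MB-VIOP formula).

    scores:   n_samples x n_components, as a list of rows
    loadings: n_variables x n_components, as a list of rows
    """
    n_vars = len(loadings)
    n_comp = len(loadings[0])

    # Length of each loading vector (one per component).
    norms = [math.sqrt(sum(row[a] ** 2 for row in loadings)) for a in range(n_comp)]
    # Normalized loadings: each component's loading vector has unit length.
    unit = [[row[a] / norms[a] for a in range(n_comp)] for row in loadings]

    # Sum of squares carried by component a: ||t_a p_a^T||_F^2 = (t_a't_a)(p_a'p_a).
    ss = [sum(row[a] ** 2 for row in scores) * norms[a] ** 2 for a in range(n_comp)]
    total = sum(ss)

    # Weight each squared normalized loading by its component's share of variation.
    return [
        math.sqrt(n_vars * sum(ss[a] * unit[j][a] ** 2 for a in range(n_comp)) / total)
        for j in range(n_vars)
    ]


def rank_variables(names, values):
    """Sort variable names by decreasing importance."""
    return sorted(zip(names, values), key=lambda pair: pair[1], reverse=True)


# Toy example: 3 samples, 3 variables, 2 components (made-up numbers).
scores = [[2.0, 0.5], [-1.0, 1.0], [-1.0, -1.5]]
loadings = [[0.9, 0.1], [0.4, 0.2], [0.1, 0.95]]

vip = vip_style_scores(scores, loadings)
ranking = rank_variables(["x1", "x2", "x3"], vip)
print(ranking)
```

With this normalization the squared importances sum to the number of variables, so a value above 1 marks a variable that contributes more than average. That is the usual reading of VIP-type scores; whether the same threshold applies to MB-VIOP is not stated in the abstract.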

CONCLUSIONS

We provide evidence for the usefulness and reliability of MB-VIOP by means of three examples (one synthetic and two real-world cases). MB-VIOP reliably and efficiently assesses the importance of both isolated variables and ranges of variables in any type of data. MB-VIOP connects the input variables of different data matrices according to their relevance for the interpretation of each latent variable, yielding enhanced interpretability for each OnPLS model component. In addition, MB-VIOP can deal with strong overlap between types of variation, as well as with many data blocks of very different dimensionality. The ability of MB-VIOP to generate dimensionality-reduced models with high interpretability makes this method ideal for big data mining, multi-omics data integration and any study that requires exploration and interpretation of large streams of data.
