Sparse partial least squares with group and subgroup structure.

Author information

ARC Centre of Excellence for Mathematical and Statistical Frontiers, Queensland University of Technology, Brisbane, Australia.

Inria, SISTM, Talence and Inserm, U1219 Bordeaux University, Bordeaux, France.

Publication information

Stat Med. 2018 Oct 15;37(23):3338-3356. doi: 10.1002/sim.7821. Epub 2018 Jun 11.

Abstract

Integrative analysis of high-dimensional omics datasets has been studied by many authors in recent years. By incorporating previously known relationships among the variables, these analyses have been successful in elucidating the relationships between different sets of omics data. In this article, our goal is to identify important relationships between genomic expression and cytokine data from a human immunodeficiency virus vaccine trial. We propose a flexible partial least squares technique that incorporates group and subgroup structure in the modelling process. Our new method accounts for both the grouping of genetic markers (e.g., gene sets) and temporal effects. The method generalises existing sparse modelling techniques in the partial least squares methodology and establishes theoretical connections to variable selection methods for supervised and unsupervised problems. Simulation studies are performed to investigate the performance of our methods relative to alternative sparse approaches. Our R package sgspls is available at https://github.com/matt-sutton/sgspls.
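To make the idea of group and subgroup sparsity concrete, the following is a minimal sketch in Python with NumPy, not the authors' implementation and not the sgspls API. It assumes that subgroups are nested within groups and share globally unique labels, that the penalty levels `lam_var`, `lam_sub` and `lam_grp` are chosen by the user, and that a sparse weight vector is obtained by thresholding the cross-product of X with the leading Y-direction. All function and parameter names are hypothetical.

```python
import numpy as np

def soft_threshold(v, lam):
    """Elementwise soft-thresholding (lasso-type shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)

def block_shrink(v, lam):
    """Shrink a whole block towards zero by its Euclidean norm (group-lasso-type)."""
    norm = np.linalg.norm(v)
    if norm <= lam:
        return np.zeros_like(v)
    return (1.0 - lam / norm) * v

def nested_threshold(v, groups, subgroups, lam_var, lam_sub, lam_grp):
    """Apply shrinkage at variable, subgroup, then group level.

    Assumes each subgroup label belongs to exactly one group.
    """
    groups = np.asarray(groups)
    subgroups = np.asarray(subgroups)
    out = soft_threshold(np.asarray(v, dtype=float), lam_var)
    for s in np.unique(subgroups):
        mask = subgroups == s
        out[mask] = block_shrink(out[mask], lam_sub)
    for g in np.unique(groups):
        mask = groups == g
        out[mask] = block_shrink(out[mask], lam_grp)
    return out

def sparse_x_weight(X, Y, groups, subgroups, lam_var, lam_sub, lam_grp):
    """One illustrative step: threshold X'Y v, where v is the leading right singular vector of X'Y."""
    M = X.T @ Y
    _, _, Vt = np.linalg.svd(M, full_matrices=False)
    u = nested_threshold(M @ Vt[0], groups, subgroups, lam_var, lam_sub, lam_grp)
    norm = np.linalg.norm(u)
    return u / norm if norm > 0 else u
```

In this sketch, a subgroup could stand for the time points of one gene and a group for a gene set; blocks whose shrunken norm falls below the corresponding penalty are removed entirely, while surviving blocks can still contain individual zeros.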
