Sparse redundancy analysis of high-dimensional genetic and genomic data.

Affiliations

Department of Clinical Epidemiology, Biostatistics and Bioinformatics.

Department of Medical Informatics, Academic Medical Center, Amsterdam 1105 AZ, The Netherlands.

Publication information

Bioinformatics. 2017 Oct 15;33(20):3228-3234. doi: 10.1093/bioinformatics/btx374.

DOI:10.1093/bioinformatics/btx374
PMID:28605402
Abstract

MOTIVATION

Recent technological developments have enabled the possibility of genetic and genomic integrated data analysis approaches, where multiple omics datasets from various biological levels are combined and used to describe (disease) phenotypic variations. The main goal is to explain and ultimately predict phenotypic variations by understanding their genetic basis and the interaction of the associated genetic factors. Therefore, understanding the underlying genetic mechanisms of phenotypic variations is an ever increasing research interest in biomedical sciences. In many situations, we have a set of variables that can be considered to be the outcome variables and a set that can be considered to be explanatory variables. Redundancy analysis (RDA) is an analytic method to deal with this type of directionality. Unfortunately, current implementations of RDA cannot deal optimally with the high dimensionality of omics data (p≫n). The existing theoretical framework, based on Ridge penalization, is suboptimal, since it includes all variables in the analysis. As a solution, we propose to use Elastic Net penalization in an iterative RDA framework to obtain a sparse solution.
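The idea of an iterative, penalized RDA can be illustrated with a small sketch. The code below is not the authors' implementation. It is a minimal toy version under stated assumptions: the function name `sparse_rda_component`, the parameter `lam1`, and the example data are all hypothetical, and the Elastic Net step is replaced by univariate soft-thresholding, which matches the Elastic Net solution only up to scale when the predictor columns are orthogonal with equal norms (the ridge part then only rescales the weights and is cancelled by the normalization step). The loop alternates between a sparse weight vector for the explanatory set `X` and a dense weight vector for the outcome set `Y`.

```python
def soft_threshold(z, t):
    """Shrink z toward zero by t; values within [-t, t] become exactly 0."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def matvec(M, w):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [dot(row, w) for row in M]


def unit_norm(v):
    norm = dot(v, v) ** 0.5
    return [a / norm for a in v] if norm > 0 else v


def sparse_rda_component(X, Y, lam1, n_iter=100, tol=1e-10):
    """Toy one-component sparse RDA (hypothetical helper, not the paper's code).

    X and Y are lists of rows with column-centered values.
    Returns (alpha, beta): sparse weights for X, dense weights for Y.
    """
    p, q = len(X[0]), len(Y[0])
    x_cols = [[row[j] for row in X] for j in range(p)]
    y_cols = [[row[k] for row in Y] for k in range(q)]
    beta = unit_norm([1.0] * q)  # assumption: start from equal outcome weights
    alpha = [0.0] * p
    for _ in range(n_iter):
        eta = matvec(Y, beta)  # outcome-side latent score
        # Penalized step: the L1 threshold sets weakly related predictors to 0.
        raw = [soft_threshold(dot(xc, eta), lam1 / 2.0) for xc in x_cols]
        xi = matvec(X, raw)  # predictor-side latent score
        scale = dot(xi, xi) ** 0.5
        if scale == 0.0:  # penalty removed every predictor
            return raw, beta
        new_alpha = [a / scale for a in raw]
        xi = [v / scale for v in xi]
        # Regress each outcome on the unit-norm score xi, then renormalize
        # (assumption: keeps the scale of eta, and thus of lam1, stable).
        beta = unit_norm([dot(yc, xi) for yc in y_cols])
        change = max(abs(a - b) for a, b in zip(new_alpha, alpha))
        alpha = new_alpha
        if change < tol:
            break
    return alpha, beta


# Toy data: four orthogonal, centered predictors over eight samples.
h1 = [1, 1, 1, 1, -1, -1, -1, -1]
h2 = [1, 1, -1, -1, 1, 1, -1, -1]
h3 = [1, -1, 1, -1, 1, -1, 1, -1]
h4 = [a * b for a, b in zip(h1, h2)]
X = [list(row) for row in zip(h1, h2, h3, h4)]
# Outcomes depend strongly on h1 and h2 and only weakly on h3 or h4.
Y = [[2 * a + 2 * b + 0.1 * c, a + 3 * b + 0.1 * d]
     for a, b, c, d in zip(h1, h2, h3, h4)]

alpha, beta = sparse_rda_component(X, Y, lam1=4.0)
print("X weights:", alpha)
print("Y weights:", beta)
```

In this toy setting the weights of the two weakly related predictors are set exactly to zero, while a Ridge penalty alone would only shrink them. The example uses fewer predictors than samples for readability; the setting the abstract targets is p≫n.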

RESULTS

We proposed sparse redundancy analysis (sRDA) for high dimensional omics data analysis. We conducted simulation studies with our software implementation of sRDA to assess the reliability of sRDA. Both the analysis of simulated data and the analysis of 485,512 methylation markers and 18,424 gene-expression values measured in a set of 55 patients with Marfan syndrome show that sRDA is able to deal with the usual high dimensionality of omics data.

AVAILABILITY AND IMPLEMENTATION

http://uva.csala.me/rda.

CONTACT

a.csala@amc.uva.nl.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

1
Sparse redundancy analysis of high-dimensional genetic and genomic data.
Bioinformatics. 2017 Oct 15;33(20):3228-3234. doi: 10.1093/bioinformatics/btx374.
2
Multiset sparse redundancy analysis for high-dimensional omics data.
Biom J. 2019 Mar;61(2):406-423. doi: 10.1002/bimj.201700248. Epub 2018 Dec 3.
3
Multiset sparse partial least squares path modeling for high dimensional omics data analysis.
BMC Bioinformatics. 2020 Jan 9;21(1):9. doi: 10.1186/s12859-019-3286-3.
4
Meta-analytic principal component analysis in integrative omics application.
Bioinformatics. 2018 Apr 15;34(8):1321-1328. doi: 10.1093/bioinformatics/btx765.
5
Integrative Analysis of Multi-Omics Data Based on Blockwise Sparse Principal Components.
Int J Mol Sci. 2020 Nov 2;21(21):8202. doi: 10.3390/ijms21218202.
6
A non-negative matrix factorization method for detecting modules in heterogeneous omics multi-modal data.
Bioinformatics. 2016 Jan 1;32(1):1-8. doi: 10.1093/bioinformatics/btv544. Epub 2015 Sep 15.
7
Predicting censored survival data based on the interactions between meta-dimensional omics data in breast cancer.
J Biomed Inform. 2015 Aug;56:220-8. doi: 10.1016/j.jbi.2015.05.019. Epub 2015 Jun 3.
8
Precision Lasso: accounting for correlations and linear dependencies in high-dimensional genomic data.
Bioinformatics. 2019 Apr 1;35(7):1181-1187. doi: 10.1093/bioinformatics/bty750.
9
Integrating multi-OMICS data through sparse canonical correlation analysis for the prediction of complex traits: a comparison study.
Bioinformatics. 2020 Nov 1;36(17):4616-4625. doi: 10.1093/bioinformatics/btaa530.
10
High dimensional classification with combined adaptive sparse PLS and logistic regression.
Bioinformatics. 2018 Feb 1;34(3):485-493. doi: 10.1093/bioinformatics/btx571.

Cited by

1
Particle associated denitrification in coastal aquaculture water.
BMC Microbiol. 2025 Jul 2;25(1):396. doi: 10.1186/s12866-025-04122-0.
2
Elucidating Cancer Subtypes by Using the Relationship between DNA Methylation and Gene Expression.
Genes (Basel). 2024 May 16;15(5):631. doi: 10.3390/genes15050631.
3
A curated multivariate approach to study efficacy and optimisation of a prototype vaccine against teladorsagiasis in sheep.
Vet Res Commun. 2024 Feb;48(1):367-379. doi: 10.1007/s11259-023-10208-9. Epub 2023 Sep 14.
4
The immuno-behavioural covariation associated with the treatment response to bumetanide in young children with autism spectrum disorder.
Transl Psychiatry. 2022 Jun 3;12(1):228. doi: 10.1038/s41398-022-01987-x.
5
Linking genotype to phenotype in multi-omics data of small sample.
BMC Genomics. 2021 Jul 13;22(1):537. doi: 10.1186/s12864-021-07867-w.
6
Where Do We Stand in Regularization for Life Science Studies?
J Comput Biol. 2022 Mar;29(3):213-232. doi: 10.1089/cmb.2019.0371. Epub 2021 Apr 29.
7
Multiset sparse partial least squares path modeling for high dimensional omics data analysis.
BMC Bioinformatics. 2020 Jan 9;21(1):9. doi: 10.1186/s12859-019-3286-3.
8
Multiset sparse redundancy analysis for high-dimensional omics data.
Biom J. 2019 Mar;61(2):406-423. doi: 10.1002/bimj.201700248. Epub 2018 Dec 3.