Sparse multiple co-Inertia analysis with application to integrative analysis of multi-Omics data.

Affiliation

Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, 423 Guardian Dr, Philadelphia, 19104, USA.

Publication information

BMC Bioinformatics. 2020 Apr 15;21(1):141. doi: 10.1186/s12859-020-3455-4.

DOI:10.1186/s12859-020-3455-4

PMID:32293260

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC7157996/

Abstract

BACKGROUND

Multiple co-inertia analysis (mCIA) is a multivariate analysis method that can assess relationships and trends in multiple datasets. Recently it has been used for integrative analysis of multiple high-dimensional -omics datasets. However, its estimated loading vectors are non-sparse, which presents challenges for identifying important features and interpreting analysis results. We propose two new mCIA methods: 1) a sparse mCIA method that produces sparse loading estimates and 2) a structured sparse mCIA method that further enables incorporation of structural information among variables such as those from functional genomics.

RESULTS

Our extensive simulation studies demonstrate that the sparse mCIA and structured sparse mCIA methods outperform the existing mCIA in feature selection and estimation accuracy. Application to the integrative analysis of transcriptomics and proteomics data from a cancer study identified biomarkers that the literature suggests are related to cancer.

CONCLUSION

The proposed sparse mCIA achieves simultaneous model estimation and feature selection and yields analysis results that are more interpretable than those of the existing mCIA. Furthermore, the proposed structured sparse mCIA can effectively incorporate prior network information among genes, resulting in improved feature selection and enhanced interpretability.
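
To make the idea of sparse loading estimates concrete, the following is a minimal sketch for the two-dataset case only. It is not the authors' algorithm, which this abstract does not describe. It assumes that co-inertia between two blocks is measured by the covariance of their scores, and it swaps in a generic alternating soft-thresholding scheme to obtain sparsity. The function names, the penalty parameters lam_a and lam_b, and the simulated data are all hypothetical.

```python
import numpy as np


def soft_threshold(v, lam):
    """Shrink each entry toward zero by lam; entries with |v| <= lam become 0."""
    return np.sign(v) * np.maximum(np.abs(v) - lam, 0.0)


def unit(v):
    """Scale v to unit Euclidean norm (an all-zero vector is returned unchanged)."""
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v


def sparse_coinertia_pair(X, Y, lam_a=0.0, lam_b=0.0, n_iter=100):
    """Illustrative sparse loadings (a, b) for two blocks sharing the same samples.

    With lam_a = lam_b = 0 the iteration is a power method that converges to the
    leading singular vectors of the cross-covariance matrix (dense loadings).
    Positive penalties (an assumption of this sketch) zero out small loadings.
    """
    Xc = X - X.mean(axis=0)
    Yc = Y - Y.mean(axis=0)
    C = Xc.T @ Yc / (X.shape[0] - 1)  # cross-covariance, shape (p, q)

    # Start from the dense solution.
    U, _, Vt = np.linalg.svd(C, full_matrices=False)
    a, b = U[:, 0], Vt[0]

    for _ in range(n_iter):
        a = unit(soft_threshold(C @ b, lam_a))
        b = unit(soft_threshold(C.T @ a, lam_b))
    return a, b


# Simulated example: one latent factor drives 5 of 30 variables in X
# and 4 of 20 variables in Y; all remaining variables are pure noise.
rng = np.random.default_rng(0)
n = 200
z = rng.normal(size=n)
X = rng.normal(size=(n, 30))
X[:, :5] += 2.0 * z[:, None]
Y = rng.normal(size=(n, 20))
Y[:, :4] += 2.0 * z[:, None]

a_dense, b_dense = sparse_coinertia_pair(X, Y)
a_sparse, b_sparse = sparse_coinertia_pair(X, Y, lam_a=2.0, lam_b=2.0)

print("nonzero loadings in X (dense): ", np.count_nonzero(a_dense))
print("nonzero loadings in X (sparse):", np.flatnonzero(a_sparse))
print("nonzero loadings in Y (sparse):", np.flatnonzero(b_sparse))
```

In the dense fit every variable receives a nonzero loading, so important features have to be picked out by eye or by an ad hoc cutoff. In the penalized fit the surviving nonzero entries are themselves the selected features, which is the sense in which estimation and feature selection happen simultaneously.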
