Joint modeling of an outcome variable and integrated omics datasets using GLM-PO2PLS.

Author information

Gu Zhujie, Uh Hae-Won, Houwing-Duistermaat Jeanine, El Bouhaddani Said

Affiliations

Department of Data Science and Biostatistics, Julius Centre, UMC Utrecht, Utrecht, The Netherlands.

Medical Research Council Biostatistics Unit, University of Cambridge, Cambridge, UK.

Publication information

J Appl Stat. 2024 Feb 21;51(13):2627-2651. doi: 10.1080/02664763.2024.2313458. eCollection 2024.

DOI:10.1080/02664763.2024.2313458

PMID:39290359

Original article link: https://pmc.ncbi.nlm.nih.gov/articles/PMC11404385/

Abstract

In many studies of human diseases, multiple omics datasets are measured. Typically, each omics dataset is studied separately in relation to the disease, so the relationships among the omics are overlooked. Modeling the joint part of multiple omics and its association with the disease outcome can provide insight into the complex molecular basis of the disease. Several dimension reduction methods that jointly model multiple omics are available, as are two-stage approaches that model the omics and the outcome in separate steps. Holistic one-stage models for both omics and outcome are lacking. In this article, we propose a novel one-stage method, GLM-PO2PLS, that jointly models an outcome variable with omics. We establish the identifiability of the model and develop EM algorithms to obtain maximum likelihood estimators of the parameters for normally distributed and Bernoulli distributed outcomes. Test statistics are proposed to infer the association between the outcome and the omics, and their asymptotic distributions are derived. Extensive simulation studies are conducted to evaluate the proposed model. The method is illustrated by modeling Down syndrome as the outcome and methylation and glycomics as the omics datasets. We show that the model provides more insight by jointly considering methylation and glycomics.
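To make the idea of a one-stage joint model concrete, the following minimal Python sketch simulates the kind of data such a model targets: two omics blocks that share low-dimensional latent variables, and a Bernoulli outcome linked to those latent variables through a logistic function. This is an illustration only, not the authors' implementation or their EM algorithm. The function name `simulate_joint_data`, the dimensions, the noise levels, and the coefficient vector `a` are all assumptions chosen for the example, and the omics-specific (orthogonal) parts of a PO2PLS-type model are omitted for brevity.

```python
import numpy as np


def simulate_joint_data(n=200, p=30, q=20, r=2, seed=0):
    """Simulate two omics blocks and a binary outcome from shared latent variables.

    Hypothetical illustration: all names and numeric settings are assumptions.
    """
    rng = np.random.default_rng(seed)

    # Joint loading matrices with orthonormal columns (via reduced QR).
    W, _ = np.linalg.qr(rng.normal(size=(p, r)))
    C, _ = np.linalg.qr(rng.normal(size=(q, r)))

    # Joint latent variables: u is a noisy copy of t, so the two blocks are related.
    t = rng.normal(size=(n, r))
    u = t + 0.5 * rng.normal(size=(n, r))

    # Observed omics blocks = joint part + residual noise (specific parts omitted).
    x = t @ W.T + 0.3 * rng.normal(size=(n, p))
    y = u @ C.T + 0.3 * rng.normal(size=(n, q))

    # Bernoulli outcome linked to the joint latent variables by a logistic link.
    a = np.linspace(1.0, -0.5, r)  # assumed outcome coefficients
    prob = 1.0 / (1.0 + np.exp(-(t @ a)))
    z = rng.binomial(1, prob)

    return x, y, z, W, C


if __name__ == "__main__":
    x, y, z, W, C = simulate_joint_data()
    print(x.shape, y.shape, z.shape)
```

A one-stage method would estimate the loadings and the outcome coefficients from `x`, `y`, and `z` simultaneously, whereas a two-stage approach would first extract joint components from `x` and `y` and only then regress `z` on them.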
