Joint modeling of an outcome variable and integrated omics datasets using GLM-PO2PLS.

Author information

Gu Zhujie, Uh Hae-Won, Houwing-Duistermaat Jeanine, El Bouhaddani Said

Affiliations

Department of Data Science and Biostatistics, Julius Centre, UMC Utrecht, Utrecht, The Netherlands.

Medical Research Council Biostatistics Unit, University of Cambridge, Cambridge, UK.

Publication information

J Appl Stat. 2024 Feb 21;51(13):2627-2651. doi: 10.1080/02664763.2024.2313458. eCollection 2024.

Abstract

In many studies of human diseases, multiple omics datasets are measured. Typically, these omics datasets are analyzed one at a time against the disease, so the relationship between the omics is overlooked. Modeling the joint part of multiple omics and its association with the disease outcome will provide insights into the complex molecular basis of the disease. Several dimension reduction methods that jointly model multiple omics are available, as are two-stage approaches that model the omics and the outcome in separate steps. Holistic one-stage models for both omics and outcome are lacking. In this article, we propose a novel one-stage method that jointly models an outcome variable with omics. We establish the model identifiability and develop EM algorithms to obtain maximum likelihood estimators of the parameters for normally and Bernoulli distributed outcomes. Test statistics are proposed to infer the association between the outcome and omics, and their asymptotic distributions are derived. Extensive simulation studies are conducted to evaluate the proposed model. The method is illustrated by modeling Down syndrome as the outcome and methylation and glycomics as the omics datasets. Here we show that our model provides more insight by jointly considering methylation and glycomics.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/acfe/11404385/0cbf0f44a820/CJAS_A_2313458_F0001_OC.jpg
