Investigative Medicine, Department of Medicine, Faculty of Medicine, Imperial College London, W12 0NN London, United Kingdom.
Department of Preventive Medicine, Feinberg School of Medicine, Northwestern University, Chicago, Illinois 60611, United States.
J Proteome Res. 2018 Apr 6;17(4):1586-1595. doi: 10.1021/acs.jproteome.7b00879. Epub 2018 Feb 27.
Metabolism is altered by genetics, diet, disease status, environment, and many other factors. Any one of these factors is often modeled without considering the effects of the other covariates. Attributing differences in metabolic profile to one factor requires controlling for the metabolic influence of the rest. We describe here a data analysis framework and a novel confounder-adjustment algorithm for multivariate analysis of metabolic profiling data. Using simulated data, we show that the method finds similar numbers of true associations and significantly fewer false positives than other commonly used methods. Covariate-adjusted projections to latent structures (CA-PLS) is exemplified here using a large-scale metabolic phenotyping study of two Chinese populations at different risks for cardiovascular disease. Using CA-PLS, we find that some previously reported differences are actually associated with external factors, and we discover a number of previously unreported biomarkers linked to different metabolic pathways. CA-PLS can be applied to any multivariate data where confounding may be an issue, and the confounder-adjustment procedure is translatable to other multivariate regression techniques.
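The abstract does not spell out the CA-PLS algorithm itself, so the following is only a minimal sketch of the general idea of confounder adjustment, not the authors' method. It assumes a simple stand-in technique: regress the confounder out of both the metabolic features and the outcome (residualization), then compute the first PLS weight vector on the residuals. The simulated data, the function names `residualize` and `first_pls_weights`, and the effect sizes are all hypothetical.

```python
import numpy as np

def residualize(A, Z):
    """Remove the part of A linearly explained by confounders Z (plus an intercept)."""
    Z1 = np.column_stack([np.ones(len(Z)), Z])
    beta, *_ = np.linalg.lstsq(Z1, A, rcond=None)
    return A - Z1 @ beta

def first_pls_weights(X, y):
    """First PLS1 weight vector: normalized covariance of each centered feature with y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    w = Xc.T @ yc
    return w / np.linalg.norm(w)

# Hypothetical simulation (not data from the study).
rng = np.random.default_rng(0)
n = 500
z = rng.normal(size=n)            # confounder, e.g. a dietary factor
e = rng.normal(size=n)            # part of the outcome unrelated to the confounder
y = z + e                         # outcome influenced by the confounder
X = rng.normal(size=(n, 3))
X[:, 0] += 2.0 * z                # feature 0: driven by the confounder only
X[:, 1] += 1.0 * e                # feature 1: associated with y independently of z
                                  # feature 2: pure noise

w_naive = first_pls_weights(X, y)
w_adj = first_pls_weights(residualize(X, z), residualize(y, z))

print("unadjusted weights:", np.round(w_naive, 2))
print("adjusted weights:  ", np.round(w_adj, 2))
```

In this toy setting the unadjusted weights are dominated by feature 0, which relates to the outcome only through the confounder, while the adjusted weights concentrate on feature 1. This illustrates the kind of false positive that confounder adjustment is meant to suppress.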