Chen Chen, Gowda G A Nagana, Zhu Jiangjiang, Deng Lingli, Gu Haiwei, Chiorean E Gabriela, Zaid Mohammad Abu, Harrison Marietta, Zhang Dabao, Zhang Min, Raftery Daniel
Department of Statistics, Purdue University, West Lafayette, IN 47907, USA.
Northwest Metabolomics Research Center, Department of Anesthesiology and Pain Medicine, University of Washington, Seattle, WA 98109, USA.
Metabolomics. 2017 Nov;13(11). doi: 10.1007/s11306-017-1265-0. Epub 2017 Sep 15.
Metabolomics technologies enable the identification of putative biomarkers for numerous diseases; however, the influence of confounding factors on metabolite levels poses a major challenge in moving forward with such metabolites for pre-clinical or clinical applications.
To address this challenge, we analyzed metabolomics data from a colorectal cancer (CRC) study, and used seemingly unrelated regression (SUR) to account for the effects of confounding factors including gender, BMI, age, alcohol use, and smoking.
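As a rough illustration of the idea, the following is a minimal sketch of a two-step feasible GLS estimator for a SUR system, written with NumPy. It is not the authors' implementation. The function name `sur_fgls`, the simulated data, and the choice of covariates per equation are all assumptions made for the example. Each "metabolite" equation has its own covariate set and the error terms are correlated across equations, which is the setting in which SUR differs from fitting each regression separately.

```python
import numpy as np

def sur_fgls(y_list, X_list):
    """Two-step feasible GLS for a SUR system (illustrative sketch).

    y_list: list of m response vectors, each of length n
    X_list: list of m design matrices, each n x k_i
    Returns (list of coefficient vectors, estimated m x m error covariance).
    """
    n = y_list[0].shape[0]
    m = len(y_list)

    # Step 1: equation-by-equation OLS to obtain residuals.
    resid = np.column_stack([
        y - X @ np.linalg.lstsq(X, y, rcond=None)[0]
        for y, X in zip(y_list, X_list)
    ])
    sigma = resid.T @ resid / n          # cross-equation error covariance
    sigma_inv = np.linalg.inv(sigma)

    # Step 2: GLS on the stacked system, built block by block.
    k = [X.shape[1] for X in X_list]
    offsets = np.concatenate([[0], np.cumsum(k)])
    total = int(offsets[-1])
    A = np.zeros((total, total))
    b = np.zeros(total)
    for i in range(m):
        sl_i = slice(offsets[i], offsets[i + 1])
        for j in range(m):
            sl_j = slice(offsets[j], offsets[j + 1])
            A[sl_i, sl_j] = sigma_inv[i, j] * (X_list[i].T @ X_list[j])
            b[sl_i] += sigma_inv[i, j] * (X_list[i].T @ y_list[j])
    beta = np.linalg.solve(A, b)
    return [beta[offsets[i]:offsets[i + 1]] for i in range(m)], sigma

# Simulated data (hypothetical, not from the study).
rng = np.random.default_rng(0)
n = 300
group = rng.integers(0, 2, n).astype(float)   # 1 = case, 0 = control
age = rng.normal(60, 10, n)
bmi = rng.normal(27, 4, n)
errors = rng.multivariate_normal([0, 0], [[1.0, 0.6], [0.6, 1.0]], size=n)

# Two "metabolite" equations with different confounders.
y1 = 2.0 + 1.0 * group + 0.02 * age + errors[:, 0]
y2 = 1.0 - 0.5 * group + 0.05 * bmi + errors[:, 1]
X1 = np.column_stack([np.ones(n), group, age])
X2 = np.column_stack([np.ones(n), group, bmi])

betas, sigma = sur_fgls([y1, y2], [X1, X2])
print("equation 1 coefficients (intercept, group, age):", betas[0])
print("equation 2 coefficients (intercept, group, bmi):", betas[1])
```

When every equation uses exactly the same covariates, this estimator reduces to equation-by-equation OLS, so any gain from SUR depends on the equations having different regressors.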
A SUR model based on 113 serum metabolites quantified using targeted mass spectrometry identified 20 metabolites that differentiated CRC patients (n = 36), patients with polyps (n = 39), and healthy subjects (n = 83). Models built using different groups of biologically related metabolites achieved improved differentiation and were significant for 26 out of 29 groups. Furthermore, the networks of correlated metabolites constructed for all groups of metabolites using the ParCorA algorithm, before or after application of the SUR model, showed significant alterations for CRC and polyp patients relative to healthy controls.
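To make the notion of a network of correlated metabolites concrete, here is a small sketch of a generic partial-correlation computation. It is not ParCorA itself, whose details are not given in this abstract. It derives full-order partial correlations from the inverse covariance matrix and keeps edges above a threshold. The function names, the threshold of 0.3, and the simulated three-variable chain are assumptions for illustration only.

```python
import numpy as np

def partial_correlations(data):
    """Full-order partial correlations from the inverse covariance matrix.

    data: n x p array (rows = samples, columns = variables).
    """
    cov = np.cov(data, rowvar=False)
    prec = np.linalg.inv(cov)
    d = np.sqrt(np.diag(prec))
    pcor = -prec / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor

def edge_list(pcor, threshold=0.3):
    """Pairs (i, j), i < j, whose absolute partial correlation exceeds threshold."""
    p = pcor.shape[0]
    return [(i, j) for i in range(p) for j in range(i + 1, p)
            if abs(pcor[i, j]) > threshold]

# Simulated chain a -> b -> c (hypothetical data, not from the study).
rng = np.random.default_rng(0)
n = 2000
a = rng.normal(size=n)
b = a + rng.normal(size=n)
c = b + rng.normal(size=n)
data = np.column_stack([a, b, c])

marginal = np.corrcoef(data, rowvar=False)
pcor = partial_correlations(data)
network = edge_list(pcor)
print("marginal correlation a-c:", marginal[0, 2])
print("partial correlation a-c given b:", pcor[0, 2])
print("edges kept:", network)
```

In this toy chain, a and c are clearly correlated marginally, but their partial correlation given b is close to zero, so only the direct links a-b and b-c survive as edges.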
The results showed that demographic covariates, such as gender, BMI, BMI², and smoking status, exhibit significant confounding effects on metabolite levels, which can be modeled effectively.
These results not only provide new insights into addressing the major issue of confounding effects in metabolomics analysis, but also shed light on issues related to establishing reliable biomarkers and the biological connections between them in a complex disease.