Konigsberg Iain R, Vu Thao, Liu Weixuan, Litkowski Elizabeth M, Pratte Katherine A, Vargas Luciana B, Gilmore Niles, Abdel-Hafiz Mohamed, Manichaikul Ani W, Cho Michael H, Hersh Craig P, DeMeo Dawn L, Banaei-Kashani Farnoush, Bowler Russell P, Lange Leslie A, Kechris Katerina J
Department of Biomedical Informatics, University of Colorado - Anschutz Medical Campus, Aurora, CO.
Department of Biostatistics and Informatics, University of Colorado - Anschutz Medical Campus, Aurora, CO.
medRxiv. 2024 Feb 28:2024.02.26.24303069. doi: 10.1101/2024.02.26.24303069.
Studies have identified individual blood biomarkers associated with chronic obstructive pulmonary disease (COPD) and related phenotypes. However, complex diseases such as COPD typically involve changes in multiple molecules with interconnections that may not be captured when considering single molecular features.
Leveraging proteomic data from 3,173 COPDGene Non-Hispanic White (NHW) and African American (AA) participants, we applied sparse multiple canonical correlation network analysis (SmCCNet) to 4,776 proteins assayed on the SomaScan v4.0 platform to derive sparse networks of proteins associated with current vs. former smoking status, airflow obstruction, and emphysema quantified from high-resolution computed tomography scans. We then used NetSHy, a dimension reduction technique leveraging network topology, to produce a summary score for each proteomic network, referred to as a NetSHy score. We next performed genome-wide association studies (GWAS) to identify variants associated with the NetSHy scores, termed network quantitative trait loci (nQTLs). Finally, we evaluated the replicability of the networks in an independent cohort, SPIROMICS.
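As a rough illustration of what a topology-aware network summary score can look like, the sketch below weights a standardized protein matrix by the graph Laplacian of the network and takes the first principal component. This is not the authors' code or the NetSHy package: the function name `topology_weighted_score`, the use of the unnormalized Laplacian, and the toy data are all assumptions made for illustration only.

```python
import numpy as np


def topology_weighted_score(X, A):
    """Hypothetical network summary score (illustrative only).

    X: (n_samples, n_proteins) abundance matrix for the proteins in one network.
    A: (n_proteins, n_proteins) symmetric, non-negative adjacency matrix.
    Returns one score per sample: the first principal component of the
    standardized data after multiplication by the graph Laplacian.
    """
    X = np.asarray(X, dtype=float)
    A = np.asarray(A, dtype=float)

    # Standardize each protein to zero mean and unit variance.
    Xc = X - X.mean(axis=0)
    sd = Xc.std(axis=0, ddof=1)
    Xs = Xc / np.where(sd > 0, sd, 1.0)

    # Unnormalized graph Laplacian L = D - A (an assumption of this sketch).
    L = np.diag(A.sum(axis=1)) - A

    # Inject network topology into the data; columns of Z stay zero-mean
    # because they are linear combinations of zero-mean columns.
    Z = Xs @ L

    # First principal component scores via SVD.
    U, S, _ = np.linalg.svd(Z, full_matrices=False)
    return U[:, 0] * S[0]


# Toy data: 50 samples, 6 proteins, adjacency from absolute correlations.
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(50, 6))
A_toy = np.abs(np.corrcoef(X_toy, rowvar=False))
np.fill_diagonal(A_toy, 0.0)

scores = topology_weighted_score(X_toy, A_toy)
```

A per-sample score of this kind could then be treated as a quantitative trait, which is the role the NetSHy scores play in the GWAS step described above.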
We identified networks of 13 to 104 proteins for each phenotype and exposure in NHW and AA, and the derived NetSHy scores were significantly associated with the variables of interest. Networks included known (sRAGE, ALPP, MIP1) and novel molecules (CA10, CPB1, HIS3, PXDN) and interactions involved in COPD pathogenesis. We observed 7 nQTLs associated with NetSHy scores, 4 of which remained after conditional analysis. Networks for smoking status and emphysema, but not airflow obstruction, demonstrated a high degree of replicability across race groups and cohorts.
In this work, we apply state-of-the-art molecular network generation and summarization approaches to proteomic data from COPDGene participants to uncover protein networks associated with COPD phenotypes. We further identify genetic associations with networks. This work discovers protein networks containing known and novel proteins and protein interactions associated with clinically relevant COPD phenotypes across race groups and cohorts.