Department of Biomedical Informatics, School of Medicine, University of Colorado - Anschutz Medical Campus, Aurora, CO, USA.
Department of Biostatistics and Informatics, Colorado School of Public Health, Aurora, CO, USA.
BMC Genomics. 2024 Sep 2;25(1):825. doi: 10.1186/s12864-024-10619-1.
Studies have identified individual blood biomarkers associated with chronic obstructive pulmonary disease (COPD) and related phenotypes. However, complex diseases such as COPD typically involve changes in multiple molecules with interconnections that may not be captured when considering single molecular features.
Leveraging proteomic data from 3,173 COPDGene Non-Hispanic White (NHW) and African American (AA) participants, we applied sparse multiple canonical correlation network analysis (SmCCNet) to 4,776 proteins assayed on the SomaScan v4.0 platform to derive sparse networks of proteins associated with current vs. former smoking status, airflow obstruction, and emphysema quantitated from high-resolution computed tomography scans. We then used NetSHy, a dimension reduction technique leveraging network topology, to produce summary scores of each proteomic network, referred to as NetSHy scores. We next performed a genome-wide association study (GWAS) to identify variants associated with the NetSHy scores, or network quantitative trait loci (nQTLs). Finally, we evaluated the replicability of the networks in an independent cohort, SPIROMICS.
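To make the network summarization step concrete, the following is a minimal sketch of a topology-aware summary score. It is not the authors' NetSHy implementation. It assumes that the score is the first principal component of the standardized protein matrix after weighting by the graph Laplacian of the network, and the published method may differ in its details. The function name `network_summary_score` and its inputs `X` (participants by proteins) and `A` (symmetric network adjacency matrix) are hypothetical names used only for illustration.

```python
import numpy as np


def network_summary_score(X, A):
    """Return one summary score per participant for a protein network.

    Assumption (illustrative only): the score is the first principal
    component of the standardized data weighted by the graph Laplacian.
    X: (n_participants, n_proteins) abundance matrix.
    A: (n_proteins, n_proteins) symmetric adjacency matrix of the network.
    """
    X = np.asarray(X, dtype=float)
    A = np.asarray(A, dtype=float)

    # Standardize each protein to mean 0 and unit variance.
    Xs = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)

    # Graph Laplacian L = D - A, where D holds the node degrees.
    L = np.diag(A.sum(axis=1)) - A

    # Weight the data by network topology, then re-center the columns.
    Xw = Xs @ L
    Xw = Xw - Xw.mean(axis=0)

    # First principal component scores via singular value decomposition.
    U, S, _ = np.linalg.svd(Xw, full_matrices=False)
    return U[:, 0] * S[0]
```

Under these assumptions, each participant receives a single score per network, which could then be tested against a phenotype or used as the quantitative trait in a GWAS. The sign of a principal component is arbitrary, so the direction of any association should be read against the protein loadings.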
We identified networks of 13 to 104 proteins for each phenotype and exposure in NHW and AA participants, and the derived NetSHy scores were significantly associated with the variables of interest. Networks included known (sRAGE, ALPP, MIP1) and novel molecules (CA10, CPB1, HIS3, PXDN) and interactions involved in COPD pathogenesis. We observed 7 nQTLs associated with NetSHy scores, 4 of which remained after conditional analysis. Networks for smoking status and emphysema, but not airflow obstruction, demonstrated a high degree of replicability across race groups and cohorts.
In this work, we apply state-of-the-art molecular network generation and summarization approaches to proteomic data from COPDGene participants to uncover protein networks associated with COPD phenotypes. We further identify genetic associations with networks. This work discovers protein networks containing known and novel proteins and protein interactions associated with clinically relevant COPD phenotypes across race groups and cohorts.