Tsuo Kristin, Argentieri M Austin, Gadd Danni, Kurki Mitja, Zheng Zhili, Baird Denis, Marioni Riccardo E, Foley Christopher, Huang Hailiang, Sun Benjamin B, Chen Chia-Yen, Daly Mark J, Martin Alicia R
Analytic and Translational Genetics Unit, Massachusetts General Hospital, Boston, MA, USA.
Stanley Center for Psychiatric Research, Broad Institute of MIT and Harvard, Cambridge, MA, USA.
medRxiv. 2025 Aug 29:2025.08.27.25334571. doi: 10.1101/2025.08.27.25334571.
Plasma proteomic signatures accurately predict disease risk, but our understanding of the mechanisms contributing to the predictive value of the proteome remains limited. Here, we characterized proteomic biomarkers of 19 age-related diseases, based on observational associations between 2,923 protein levels and incidence of these outcomes in the UK Biobank (N = 45,438). To identify the subset of these biomarkers that may represent causal drivers of disease, we first employed Mendelian Randomization (MR) and found that only 8% of the protein-disease associations with genetic instruments showed suggestive evidence of causal relationships, and that these were more likely to pertain to only a single disease. We then tested the hypothesis that many proteomic biomarkers, particularly the non-causal proteins, are impacted by environmental factors that might independently affect disease risk and protein levels. We discovered that the vast majority (>90%) of proteins associated with diseases like lung cancer and COPD are also associated with smoking, and more than half of all disease-associated proteins tested in MR were associated with smoking. These proteins showed no evidence of causal effects on disease, suggesting that their predictive value lies in acting as environmental sensors. Given the sensitivity of the plasma proteome to smoking, we developed a proteomic score for smoking (SmokingPS) and demonstrated that the plasma proteome can serve as a quantitative index of smoking behavior and history. Extending this approach to alcohol intake phenotypes, we found that, in general, many plasma proteins identified in observational associations are more likely to be readouts of environmental risk factors than disease-specific signals. We conclude that the plasma proteome may provide critical objective biomarkers for quantifying the impacts of environmental risk factors on human health and disease.
Our results have significant implications for implementing predictive plasma protein biomarkers in disease prevention, and can help guide interpretation of putative protein-disease associations as actionable therapeutic targets or quantitative indications of upstream exposures that represent potential intervention points.