Zhang Sai, Cooper-Knock Johnathan, Weimer Annika K, Harvey Calum, Julian Thomas H, Wang Cheng, Li Jingjing, Furini Simone, Frullanti Elisa, Fava Francesca, Renieri Alessandra, Pan Cuiping, Song Jina, Billing-Ross Paul, Gao Peng, Shen Xiaotao, Timpanaro Ilia Sarah, Kenna Kevin P, Davis Mark M, Tsao Philip S, Snyder Michael P
medRxiv. 2021 Jun 21:2021.06.15.21258703. doi: 10.1101/2021.06.15.21258703.
The determinants of severe COVID-19 in non-elderly adults are poorly understood, which limits opportunities for early intervention and treatment. Here we present novel machine learning frameworks for identifying common and rare disease-associated genetic variation, which outperform conventional approaches. By integrating single-cell multiomics profiling of human lungs to link genetic signals to cell-type-specific functions, we have discovered and validated over 1,000 risk genes underlying severe COVID-19 across 19 cell types. Identified risk genes are overexpressed in healthy lungs but relatively downregulated in severely diseased lungs. Genetic risk for severe COVID-19, within both common and rare variants, is particularly enriched in natural killer (NK) cells, which places these immune cells upstream in the pathogenesis of severe disease. Mendelian randomization indicates that failed NKG2D-mediated activation of NK cells leads to critical illness. Network analysis further links multiple pathways associated with NK cell activation, including type-I-interferon-mediated signalling, to severe COVID-19. Our rare variant model, PULSE, enables sensitive prediction of severe disease in non-elderly patients based on whole-exome sequencing; individualized predictions are accurate independent of age and sex, and are consistent across multiple populations and cohorts. Risk stratification based on exome sequencing has the potential to facilitate post-exposure prophylaxis in at-risk individuals, potentially based around augmentation of NK cell function. Overall, our study characterizes a comprehensive genetic landscape of COVID-19 severity and provides novel insights into the molecular mechanisms of severe disease, leading to new therapeutic targets and sensitive detection of at-risk individuals.