Cambridge Institute of Therapeutic Immunology & Infectious Disease (CITIID), Jeffrey Cheah Biomedical Centre, Cambridge Biomedical Campus, University of Cambridge, Puddicombe Way, Cambridge, CB2 0AW, UK.
Department of Medicine, University of Cambridge School of Clinical Medicine, Cambridge Biomedical Campus, Cambridge, CB2 0QQ, UK.
Genome Med. 2020 Nov 25;12(1):106. doi: 10.1186/s13073-020-00797-4.
Genome-wide association studies (GWAS) have identified pervasive sharing of genetic architectures across multiple immune-mediated diseases (IMD). By learning the genetic basis of IMD risk from common diseases, this sharing can be exploited to enable analysis of less frequent IMD where, due to limited sample size, traditional GWAS techniques are challenging.
Exploiting ideas from Bayesian genetic fine-mapping, we developed a disease-focused shrinkage approach to distill genetic risk components from GWAS summary statistics for a set of related diseases. We applied this technique to 13 larger GWAS of common IMD, deriving a reduced-dimension "basis" that summarised the multidimensional components of genetic risk. We used independent datasets, including the UK Biobank, to assess the performance of the basis and characterise individual axes. Finally, we projected summary GWAS data for smaller IMD studies, with fewer than 1000 cases, onto the basis to assess whether the approach could provide additional insights into the genetic architecture of less common IMD or IMD subtypes, where cohort collection is challenging.
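The basis-and-projection idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the data are synthetic, the per-SNP `weights` vector is a hypothetical stand-in for the fine-mapping-inspired shrinkage, and the names `build_basis` and `project` are invented for illustration. It assumes the basis is obtained by a principal component decomposition of a traits-by-SNPs matrix of shrunk effect sizes, and that a new study is projected by applying the same weights, centring and loadings.

```python
import numpy as np

def build_basis(beta, weights):
    """Derive component loadings from basis-trait effect sizes.

    beta:    (n_traits, n_snps) array of log odds ratios (synthetic here).
    weights: (n_snps,) hypothetical shrinkage weights in [0, 1].
    """
    shrunk = beta * weights              # down-weight SNPs with low weight
    centre = shrunk.mean(axis=0)         # per-SNP mean across basis traits
    # SVD of the centred matrix; rows of vt are the component loadings.
    _, _, vt = np.linalg.svd(shrunk - centre, full_matrices=False)
    return centre, vt

def project(beta_new, weights, centre, loadings):
    """Project effect sizes onto the basis using the same weights and centring."""
    return (beta_new * weights - centre) @ loadings.T

rng = np.random.default_rng(0)
n_traits, n_snps = 13, 500               # 13 basis traits, as in the abstract
beta = rng.normal(0.0, 0.05, size=(n_traits, n_snps))
weights = rng.uniform(0.0, 1.0, size=n_snps)

centre, loadings = build_basis(beta, weights)
basis_scores = project(beta, weights, centre, loadings)

# A small synthetic study: a noisy copy of the first basis trait.
new_trait = beta[0] + rng.normal(0.0, 0.01, size=n_snps)
new_scores = project(new_trait, weights, centre, loadings)

# Which basis trait does the projected study sit closest to?
nearest = int(np.argmin(np.linalg.norm(basis_scores - new_scores, axis=1)))
print("nearest basis trait index:", nearest)
```

Because only per-SNP effect estimates enter the calculation, a sketch like this needs summary statistics alone, which is the property that makes projection of small studies feasible.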
We identified 13 IMD genetic risk components. The projection of independent UK Biobank data demonstrated the IMD specificity and accuracy of the basis, even for traits with very few cases (e.g. vitiligo, 150 cases). Projection of additional IMD-relevant studies allowed us to add biological interpretation to specific components, e.g. those related to raised eosinophil counts in blood and to serum concentration of the chemokine CXCL10 (IP-10). On application to 22 rare IMD and IMD subtypes, we were able not only to highlight subtype-discriminating axes (e.g. for juvenile idiopathic arthritis) but also to suggest eight novel genetic associations.
Requiring only summary-level data, our unsupervised approach allows the genetic architectures across any range of clinically related traits to be characterised in fewer dimensions. This facilitates the analysis of studies with modest sample size by matching shared axes of both genetic and biological risk across a wider disease domain, and provides an evidence base for possible therapeutic repurposing opportunities.