A Bayesian method to incorporate hundreds of functional characteristics with association evidence to improve variant prioritization.

Author information

Gagliano Sarah A, Barnes Michael R, Weale Michael E, Knight Jo

Affiliations

Centre for Addiction and Mental Health, Toronto, Ontario, Canada; Institute of Medical Science, University of Toronto, Toronto, Ontario, Canada.

William Harvey Research Institute, Barts and The London School of Medicine and Dentistry, Queen Mary University of London, London, United Kingdom.

Publication information

PLoS One. 2014 May 20;9(5):e98122. doi: 10.1371/journal.pone.0098122. eCollection 2014.

Abstract

The increasing quantity and quality of functional genomic information motivate the assessment and integration of these data with association data, including data originating from genome-wide association studies (GWAS). We used previously described GWAS signals ("hits") to train a regularized logistic model to predict SNP causality on the basis of a large multivariate functional dataset. We show how this model can be used to derive Bayes factors for integrating functional and association data into a combined Bayesian analysis. Functional characteristics were obtained from the Encyclopedia of DNA Elements (ENCODE), from published expression quantitative trait loci (eQTL), and from other sources of genome-wide characteristics. We trained the model using all GWAS signals combined, and also using phenotype-specific signals for autoimmune, brain-related, cancer, and cardiovascular disorders. The non-phenotype-specific and the autoimmune GWAS signals gave the most reliable results. We found that SNPs with higher probabilities of causality from functional characteristics showed an enrichment of more significant p-values compared to all GWAS SNPs in three large GWAS of complex traits. We investigated the ability of our Bayesian method to improve the identification of true causal signals in a psoriasis GWAS dataset and found that combining functional data with association data improves the ability to prioritise novel hits. We used the predictions from the penalized logistic regression model to calculate Bayes factors relating to functional characteristics and supply these online alongside resources to integrate these data with association data.
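
The abstract describes turning model predictions into Bayes factors and combining them with association evidence, but gives no formulas. The following is a minimal sketch of one way such a combination could work, not the authors' implementation. It assumes that a functional Bayes factor can be obtained by dividing the predicted odds of being a hit by the odds implied by the proportion of hits in the training set, and it swaps in a Wakefield-style approximate Bayes factor for the association evidence. All function names, the `base_rate` parameter, and the prior variance of 0.04 are hypothetical choices made for illustration.

```python
import math


def functional_bayes_factor(p_func, base_rate):
    """Bayes factor implied by a classifier probability (illustrative assumption).

    p_func    : predicted probability that the SNP is a hit, strictly in (0, 1)
    base_rate : proportion of hits in the training data, strictly in (0, 1)
    """
    predicted_odds = p_func / (1.0 - p_func)
    training_odds = base_rate / (1.0 - base_rate)
    return predicted_odds / training_odds


def association_bayes_factor(beta, se, prior_var=0.04):
    """Wakefield-style approximate Bayes factor in favour of association.

    beta      : estimated effect size (e.g. log odds ratio)
    se        : standard error of beta
    prior_var : assumed prior variance of the true effect (hypothetical default)
    """
    v = se ** 2
    z = beta / se
    shrink = prior_var / (v + prior_var)
    return math.sqrt(1.0 - shrink) * math.exp(0.5 * z * z * shrink)


def combined_posterior(p_func, base_rate, beta, se, prior_odds, prior_var=0.04):
    """Posterior probability of causality from both sources of evidence.

    Treats the two Bayes factors as independent, which is a simplification.
    """
    odds = (
        prior_odds
        * functional_bayes_factor(p_func, base_rate)
        * association_bayes_factor(beta, se, prior_var)
    )
    return odds / (1.0 + odds)


if __name__ == "__main__":
    # Two SNPs with the same association evidence but different functional scores.
    for p_func in (0.05, 0.40):
        post = combined_posterior(
            p_func=p_func, base_rate=0.10, beta=0.15, se=0.04, prior_odds=1e-4
        )
        print(f"p_func={p_func:.2f} -> posterior={post:.4f}")
```

Under these assumptions, two SNPs with identical association statistics are ranked by their functional score, which is the kind of reprioritisation the abstract reports for the psoriasis dataset.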

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b1c8/4028284/36a145866218/pone.0098122.g001.jpg
