Brisbin Abra, Fridley Brooke L
Department of Health Sciences Research, Mayo Clinic, 200 First Street SW, Rochester, MN 55905, USA
Stat Appl Genet Mol Biol. 2013 Aug;12(4):505-16. doi: 10.1515/sagmb-2012-0061.
Pathway topology and the relationships between genes can provide information for modeling the effects of mRNA gene expression on complex traits. For example, researchers may wish to incorporate the prior belief that "hub" genes (genes with many neighbors) are more likely to influence the trait. In this paper, we propose and compare six Bayesian pathway-based prior models that incorporate pathway topology information into association analyses. Including prior information about the relationships among genes in a pathway modestly improved detection rates for genes associated with complex traits. Through an extensive set of simulations, we found that when hub (central) effects are expected, the diagonal degree model is preferred; when spoke (edge) effects are expected, the spatial power model is preferred. When there is no prior knowledge about where the effect genes lie in the pathway (e.g., whether a hub or a spoke pattern applies), it is worthwhile to apply multiple models, because the model with the best deviance information criterion (DIC) is not always the one with the best detection rate. We also applied the models to pharmacogenomic studies of the drugs gemcitabine and 6-mercaptopurine. The diagonal degree model identified an association between 6-mercaptopurine response and expression of the gene SLC28A3 that was not detectable with the model that included no pathway information. These results demonstrate the value of incorporating pathway information into association analyses.
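To make the "hub" prior belief concrete, the sketch below counts each gene's neighbors in a toy pathway and scales a per-gene prior variance by that count, so that well-connected genes are allowed larger effects a priori. This is only an illustration of the general idea: the abstract does not give the form of the diagonal degree model, so the `1 + degree` scaling, the function names, and the gene labels `G1` to `G4` are all assumptions rather than the authors' specification.

```python
def gene_degrees(edges, genes):
    """Count the neighbors of each gene in an undirected pathway graph."""
    degree = {gene: 0 for gene in genes}
    for a, b in edges:
        degree[a] += 1
        degree[b] += 1
    return degree


def degree_scaled_prior_variances(edges, genes, base_variance=1.0):
    """Return one prior variance per gene, growing with the gene's degree.

    Assumed form (not taken from the paper): variance = base * (1 + degree).
    The +1 keeps the variance positive for genes with no neighbors.
    """
    degree = gene_degrees(edges, genes)
    return {gene: base_variance * (1 + d) for gene, d in degree.items()}


# Hypothetical toy pathway: G1 is a hub connected to three spoke genes.
genes = ["G1", "G2", "G3", "G4"]
edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4")]

degrees = gene_degrees(edges, genes)
variances = degree_scaled_prior_variances(edges, genes)
```

In a regression of the trait on expression, these values would sit on the diagonal of the prior covariance of the gene effects, which is why the hub gene `G1` receives the most prior room for a nonzero effect.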