Institute for Cardiogenetics, University of Lübeck, 23562, Lübeck, Germany.
Charité - University Medicine Berlin, Corporate Member of Freie Universität Berlin, Humboldt-Universität zu Berlin, and Berlin Institute of Health, Institute for Dental and Craniofacial Sciences, Department of Periodontology and Synoptic Dentistry, 14197, Berlin, Germany.
Sci Rep. 2020 Nov 24;10(1):20417. doi: 10.1038/s41598-020-75770-7.
Exploring variant-to-gene relationships through quantitative trait loci (QTLs), such as expression QTLs (eQTLs), is a frequently used step in genome-wide association studies (GWAS). However, the wide range of public QTL databases and the lack of batch annotation features complicate a comprehensive annotation of GWAS results. In this work, we introduce the tool "Qtlizer" for annotating lists of human variants with associated changes in gene expression and protein abundance using an integrated database of published QTLs. Features include incorporation of variants in linkage disequilibrium (LD) and reverse search by gene names. Analyzing the database for base pair distances between the best significant eQTLs and their affected genes suggests that the commonly used cis-distance limit of 1,000,000 base pairs might be too restrictive, implying a substantial number of wrongly detected and as yet undetected eQTLs. We also ranked genes by the maximum number of tissue-specific eQTL studies in which the most significant eQTL signal was consistent. For the top 100 genes we observed the strongest enrichment with housekeeping genes (P = 2 × 10) and with the 10% highest expressed genes (P = 0.005) after grouping eQTLs by r² > 0.95, underlining the relevance of LD information in eQTL analyses. Qtlizer can be accessed via https://genehopper.de/qtlizer or by using the respective Bioconductor R-package (https://doi.org/10.18129/B9.bioc.Qtlizer).
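The cis-distance argument can be made concrete with a small Python sketch. It is not Qtlizer code and does not use the Qtlizer API: the `EqtlHit` record, its field names, the helper functions, and the coordinates in the usage lines are all hypothetical. It also assumes that distance is measured to the nearest gene boundary, whereas the study may use a different reference point such as the transcription start site.

```python
from dataclasses import dataclass
from typing import Optional

# The commonly used cis-distance limit named in the abstract.
CIS_WINDOW_BP = 1_000_000


@dataclass(frozen=True)
class EqtlHit:
    """One variant-gene pair. Field names are illustrative, not Qtlizer's schema."""
    variant_id: str
    chrom: str
    variant_pos: int  # base-pair position of the variant
    gene: str
    gene_chrom: str
    gene_start: int   # first base of the gene
    gene_end: int     # last base of the gene


def distance_to_gene(hit: EqtlHit) -> Optional[int]:
    """Distance in base pairs from the variant to the nearest gene boundary.

    Returns 0 if the variant lies inside the gene and None if variant and
    gene are on different chromosomes (assumption: boundary-based distance).
    """
    if hit.chrom != hit.gene_chrom:
        return None
    if hit.gene_start <= hit.variant_pos <= hit.gene_end:
        return 0
    return min(abs(hit.variant_pos - hit.gene_start),
               abs(hit.variant_pos - hit.gene_end))


def classify(hit: EqtlHit, window: int = CIS_WINDOW_BP) -> str:
    """Label a hit relative to the chosen cis window."""
    distance = distance_to_gene(hit)
    if distance is None:
        return "other chromosome"
    return "cis" if distance <= window else "beyond cis window"


# Made-up coordinates, for illustration only.
near = EqtlHit("rsA", "1", 1_500_000, "GENE1", "1", 1_000_000, 1_200_000)
far = EqtlHit("rsB", "1", 3_000_000, "GENE1", "1", 1_000_000, 1_200_000)
print(classify(near), "|", classify(far))
```

Widening `window` in `classify` shows how many same-chromosome pairs a stricter limit would leave out, which is the kind of comparison the distance analysis in the abstract relies on.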