Conditional entropy in variation-adjusted windows detects selection signatures associated with expression quantitative trait loci (eQTLs).

Author information

Handelman Samuel K, Seweryn Michal, Smith Ryan M, Hartmann Katherine, Wang Danxin, Pietrzak Maciej, Johnson Andrew D, Kloczkowski Andrzej, Sadee Wolfgang

Publication information

BMC Genomics. 2015;16 Suppl 8(Suppl 8):S8. doi: 10.1186/1471-2164-16-S8-S8. Epub 2015 Jun 18.

Abstract

BACKGROUND

Over the past 50,000 years, shifts in human-environmental or human-human interactions shaped genetic differences within and among human populations, including variants under positive selection. Shaped by environmental factors, such variants influence the genetics of modern health, disease, and treatment outcome. Because evolutionary processes tend to act on gene regulation, we test whether regulatory variants are under positive selection. We introduce a new approach to enhance detection of genetic markers undergoing positive selection, using conditional entropy to capture recent local selection signals.
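As a rough illustration of the quantity behind this approach, the sketch below computes the textbook conditional entropy H(Y|X) between haplotypes observed in two adjacent windows. The abstract does not give the definition of the adjusted H|H measure, so this is not the authors' statistic: the window pairing, the toy haplotype strings, and the function name `conditional_entropy` are all assumptions made for illustration.

```python
from collections import Counter
from math import log2


def conditional_entropy(pairs):
    """Return H(Y | X) in bits for a list of (x, y) observations.

    Here x and y stand for the haplotypes one chromosome carries in two
    adjacent windows (an assumed setup, not the paper's exact windows).
    """
    n = len(pairs)
    joint = Counter(pairs)                      # counts of (x, y)
    marginal_x = Counter(x for x, _ in pairs)   # counts of x alone
    h = 0.0
    for (x, _y), count in joint.items():
        # p(x, y) * log2(p(x) / p(x, y)), written with raw counts
        h += (count / n) * log2(marginal_x[x] / count)
    return h


# Hypothetical toy data: eight chromosomes, two windows each.
# "linked": the first window fully determines the second.
linked = [("AC", "GT")] * 6 + [("TT", "CA")] * 2
# "shuffled": the first window says nothing about the second.
shuffled = [("AC", "GT"), ("AC", "CA"), ("TT", "GT"), ("TT", "CA")] * 2

print(conditional_entropy(linked))
print(conditional_entropy(shuffled))
```

In this toy setup the fully linked sample has no remaining uncertainty in the second window, while the shuffled sample retains one full bit. Low conditional entropy of this kind is the sort of signal one might expect where a long haplotype has recently risen in frequency.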

RESULTS

We use conditional logistic regression to compare our Adjusted Haplotype Conditional Entropy (H|H) measure of positive selection to existing positive selection measures. H|H and existing measures were applied to published regulatory variants acting in cis (cis-eQTLs), with conditional logistic regression testing whether regulatory variants undergo stronger positive selection than the surrounding gene. These cis-eQTLs were drawn from six independent studies of genotype and RNA expression. The conditional logistic regression shows that, overall, H|H is substantially more powerful than existing positive-selection methods in distinguishing cis-eQTLs from other single nucleotide polymorphisms (SNPs) in the same genes. When broken down by Gene Ontology (GO), H|H predictions are particularly strong in some biological process categories, where regulatory variants are under strong positive selection compared to the bulk of the gene; these categories are distinct from the GO categories under overall positive selection. However, cis-eQTLs in a second group of genes lack positive selection signatures detectable by H|H, consistent with ancient short haplotypes compared to the surrounding gene (for example, in innate immunity GO:0042742); under such other modes of selection, H|H would not be expected to be a strong predictor. These conditional logistic regression models are adjusted for minor allele frequency (MAF); otherwise, ascertainment bias is a major factor in all eQTL data sets. Relationships between Gene Ontology categories, positive selection and eQTL specificity were replicated with H|H in a single larger data set. Our measure, Adjusted Haplotype Conditional Entropy (H|H), was essential in generating all of the results above because it: 1) is a stronger overall predictor for eQTLs than comparable existing approaches, and 2) shows low sequential auto-correlation, overcoming problems with convergence of these conditional logistic regression models.
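To make the "low sequential auto-correlation" property concrete, one simple way to quantify it is the lag-1 autocorrelation of a per-SNP statistic taken in chromosomal order. The sketch below is an assumption about how such a check could be done, not the procedure used in the study; the function name `lag1_autocorrelation` and the toy series are hypothetical.

```python
def lag1_autocorrelation(values):
    """Lag-1 autocorrelation of a sequence ordered by genomic position."""
    n = len(values)
    mean = sum(values) / n
    deviations = [v - mean for v in values]
    denom = sum(d * d for d in deviations)
    if denom == 0:
        raise ValueError("autocorrelation is undefined for a constant series")
    # Sum of products of each deviation with its right-hand neighbour
    num = sum(deviations[i] * deviations[i + 1] for i in range(n - 1))
    return num / denom


# Hypothetical per-SNP scores along a gene.
smooth = [1, 2, 3, 4, 5, 6, 7, 8]             # neighbours resemble each other
alternating = [1, -1, 1, -1, 1, -1, 1, -1]    # neighbours differ maximally

print(lag1_autocorrelation(smooth))
print(lag1_autocorrelation(alternating))
```

A statistic whose neighbouring values are strongly correlated gives nearly redundant predictors for SNPs within the same gene, which is one plausible reason a within-gene conditional model could struggle to converge.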

CONCLUSIONS

Our new method, H|H, provides a consistently more robust signal associated with cis-eQTLs than existing methods. We interpret this to indicate that some cis-eQTLs are under positive selection compared to their surrounding genes. Conditional entropy indicative of a selective sweep is an especially strong predictor of eQTLs for genes in several biological processes of medical interest. Where conditional entropy is a weak or negative predictor of eQTLs, as for innate immune genes, this is consistent with balancing selection acting on such eQTLs over long time periods. Different measures of selection may be needed for variant prioritization under other modes of evolutionary selection.
