Clade Distillation for Genome-wide Association Studies.

Author information

Christ Ryan, Wang Xinxin, Aslett Louis J M, Steinsaltz David, Hall Ira

Affiliations

Department of Genetics, Yale University School of Medicine, New Haven, CT 06510, USA.

Department of Genetics, Washington University School of Medicine, Saint Louis, MO 63110, USA.

Publication information

Genetics. 2025 Aug 7. doi: 10.1093/genetics/iyaf158.

Abstract

Testing inferred haplotype genealogies for association with phenotypes has been a longstanding goal in human genetics given their potential to detect association signals driven by allelic heterogeneity (when multiple causal variants modulate a phenotype) in both coding and noncoding regions. Recent scalable methods for inferring locus-specific genealogical trees along the genome, or representations thereof, have made substantial progress towards this goal; however, the problem of testing these trees for association with phenotypes has remained unsolved due to the growth in the number of clades with increasing sample size. To address this issue, we introduce several practical improvements to the kalis ancestry inference engine, including a general optimal checkpointing algorithm for decoding hidden Markov models, thereby enabling efficient genome-wide analyses. We then propose LOCATER, a powerful new procedure based on the recently proposed Stable Distillation framework, to test local tree representations for trait association. Although LOCATER is demonstrated here in conjunction with kalis, it may be used for testing output from any ancestry inference engine, regardless of whether such engines return discrete tree structures, relatedness matrices, or some combination of the two at each locus. Using simulated quantitative phenotypes, our results indicate that LOCATER achieves substantial power gains over traditional single-marker testing, ARG-Needle, and window-based testing in cases of allelic heterogeneity, while also improving causal region localization. These findings suggest that genealogy-based association testing will be a fruitful approach for gene discovery, especially for signals driven by multiple ultra-rare variants.
