Fast Phenotype Simulation for Genotype Representation Graphs.

Author information

Syam Aditya, Adonizio Chris, Wei Xinzhu

Affiliations

Department of Computational Biology, Cornell University, Ithaca, NY.

Department of Mathematics, Cornell University, Ithaca, NY.

Publication information

bioRxiv. 2025 Aug 20:2025.08.15.670378. doi: 10.1101/2025.08.15.670378.

Abstract

MOTIVATION

The Genotype Representation Graph (GRG) [DeHaas et al., 2025] is a graph representation of whole-genome polymorphisms, designed to encode the variant hard-call information in phased whole genomes. It encodes genotypes as an extremely compact graph that can be traversed efficiently, enabling dynamic programming-style algorithms for applications such as genome-wide association studies that run faster on biobank-scale data than existing alternatives. To facilitate scalable statistical genetics, we present GrgPhenoSim, an extremely fast phenotype simulator for GRGs, suitable for simulating phenotypes on biobank-scale datasets.
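The efficiency claim above rests on visiting each graph node once rather than each genotype once. As a minimal sketch of that idea, the following Python pushes mutation effect sizes down a toy directed acyclic graph in a single top-down pass. It is not the GRG library's API: the function `propagate_effects`, the dictionary-based graph, and the toy values are all hypothetical, and the sketch assumes there is at most one path from any node to any sample, so each effect is counted exactly once.

```python
from collections import defaultdict


def propagate_effects(children, node_effect, order):
    """Push per-node effect sums down a DAG in one top-down pass.

    children:    dict mapping a node to the list of its child nodes
    node_effect: dict mapping a node to the summed effect size of the
                 mutations attached to it (hypothetical toy input)
    order:       all nodes in topological order (parents before children)

    Returns a dict mapping each node to the total effect it inherits
    from itself and all of its ancestors.
    """
    total = defaultdict(float)
    for node, beta in node_effect.items():
        total[node] += beta
    # By the time a node is visited, every parent has already pushed its
    # total into it, so the node's own total is final.
    for node in order:
        for child in children.get(node, []):
            total[child] += total[node]
    return total


# Toy graph: nodes 0-3 are sample haplotypes, nodes 4-6 are internal.
children = {6: [4, 2], 5: [2, 3], 4: [0, 1]}
node_effect = {6: 0.5, 5: -0.2, 4: 0.1, 3: 0.3}
order = [6, 5, 4, 0, 1, 2, 3]

totals = propagate_effects(children, node_effect, order)
samples = [0, 1, 2, 3]
genetic_value = [totals[s] for s in samples]
print(genetic_value)
```

Each edge is touched once, so the cost scales with the size of the graph rather than with the number of samples times the number of variants. Sample 0, for example, sits below nodes 4 and 6 and therefore accumulates their two effects.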

RESULTS

GrgPhenoSim contains all the primary functionalities of a phenotype simulator, uses a standardized output, and supports customized simulations. For sample sizes ranging from thousands to hundreds of thousands, GrgPhenoSim is dozens to hundreds of times faster than the fast ancestral recombination graph-based phenotype simulator of Tagami et al. [2024].
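To make concrete what a phenotype simulator typically computes, the sketch below adds environmental noise to per-sample genetic values so that a target narrow-sense heritability is met in expectation. This is the textbook additive model, assumed here for illustration only; the abstract does not specify GrgPhenoSim's model or interface, and `simulate_phenotypes`, its parameters, and the input values are hypothetical.

```python
import random
import statistics


def simulate_phenotypes(genetic_values, h2, seed=0):
    """Return (phenotypes, noise) under a simple additive model.

    The phenotype is genetic value plus Gaussian noise, with the noise
    variance chosen so that var_g / (var_g + var_e) equals h2 in
    expectation. Names and defaults are illustrative assumptions.
    """
    if not 0.0 < h2 <= 1.0:
        raise ValueError("h2 must be in (0, 1]")
    rng = random.Random(seed)
    var_g = statistics.pvariance(genetic_values)
    # Solve h2 = var_g / (var_g + var_e) for the noise variance.
    var_e = var_g * (1.0 - h2) / h2
    sd_e = var_e ** 0.5
    noise = [rng.gauss(0.0, sd_e) for _ in genetic_values]
    phenotypes = [g + e for g, e in zip(genetic_values, noise)]
    return phenotypes, noise


# Hypothetical genetic values for four samples.
toy_genetic_values = [0.6, 0.6, 0.3, 0.1]
toy_phenotypes, toy_noise = simulate_phenotypes(toy_genetic_values, h2=0.5, seed=1)
print(toy_phenotypes)
```

With `h2=1.0` the noise variance is zero and the phenotypes equal the genetic values. With so few samples the realized heritability will differ noticeably from the target; the match holds only in expectation.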

AVAILABILITY

The GrgPhenoSim library and use-case demonstrations are available at https://github.com/aprilweilab/grg_pheno_sim. The documentation for GrgPhenoSim is hosted at https://grgl.readthedocs.io/en/latest/index.html.


