A reference panel for linkage disequilibrium and genotype imputation using whole-genome sequencing data from 2,680 participants across India.

Author information

Li Zheng, Zhao Wei, Zhou Xiang, Leung Yuk Yee, Schellenberg Gerard D, Wang Li-San, Dey Sharmistha, Lee Jinkook, Smith Jennifer A, Dey Aparajit B, Kardia Sharon L R

Publication information

bioRxiv. 2025 Jul 4:2025.06.30.662450. doi: 10.1101/2025.06.30.662450.

Abstract

India is the most populous country globally, yet genetic studies involving Indian individuals remain limited. The Indian population is composed of many founder groups and has a mixed genetic ancestry, including an ancestral component not observed anywhere outside of India. This presents a unique opportunity to uncover novel disease variants and develop more tailored medical interventions. To facilitate genetic research in India, a crucial first step is to create a foundational resource that serves as a benchmark for future population studies and methods development. To this end, we have constructed the largest and most nationally representative linkage disequilibrium and genotype imputation reference panels in India to date, using high-coverage whole-genome sequencing data of 2,680 Indian participants from the Longitudinal Aging Study in India-Harmonized Diagnostic Assessment of Dementia (LASI-DAD). As an LD reference panel, LASI-DAD includes 69.5 million variants, representing 170% and 213% increases relative to the 1000 Genomes Project (1000G) and TOP-LD panels, respectively. Besides serving as an LD lookup panel, LASI-DAD facilitates various statistical analyses that rely on precise LD estimates. In a polygenic risk score (PRS) analysis, LASI-DAD improved the predictive performance of PRS by 2.1% to 35.1% across traits and studies. As an imputation reference panel, LASI-DAD improved the imputation accuracy by 3% to 101% (mean = 38%) compared to the TOPMed panel (Version R3) and by 3% to 73% (mean = 27%) compared to the Genome Asia Pilot (GAsP) panel across different minor allele frequencies. The LASI-DAD reference panel is publicly available to benefit future studies.
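To make "LD estimate" concrete, here is a minimal sketch of one common pairwise LD measure: the squared Pearson correlation (r²) between two variants' genotype vectors, coded as 0/1/2 alternate-allele counts. This illustrates the general statistic only and is not the authors' pipeline. The function name `ld_r2` and the toy genotypes are hypothetical, and the abstract does not state which LD or accuracy metric the study used.

```python
from statistics import fmean


def ld_r2(g1, g2):
    """Squared Pearson correlation between two genotype vectors.

    Genotypes are assumed to be coded as 0/1/2 alternate-allele counts,
    one entry per individual (a hypothetical encoding for this sketch).
    """
    if len(g1) != len(g2) or not g1:
        raise ValueError("genotype vectors must be non-empty and of equal length")
    m1, m2 = fmean(g1), fmean(g2)
    cov = sum((a - m1) * (b - m2) for a, b in zip(g1, g2))
    var1 = sum((a - m1) ** 2 for a in g1)
    var2 = sum((b - m2) ** 2 for b in g2)
    if var1 == 0 or var2 == 0:
        # A monomorphic variant has no variance, so r^2 is undefined.
        return float("nan")
    return (cov * cov) / (var1 * var2)


# Toy genotypes for six hypothetical individuals (not real data).
variant_a = [0, 1, 2, 1, 0, 2]
variant_b = [2, 1, 0, 1, 2, 0]  # alleles perfectly anti-correlated with variant_a
variant_c = [1, 1, 1, 1, 1, 1]  # monomorphic in this toy sample

print(ld_r2(variant_a, variant_a))
print(ld_r2(variant_a, variant_b))
print(ld_r2(variant_a, variant_c))
```

Because r² squares the correlation, perfectly correlated and perfectly anti-correlated variants both score 1. Estimates of this kind depend on the allele frequencies and haplotype structure of the sample they are computed from, which is why a population-matched panel can matter for downstream analyses such as PRS.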
