A diverse ancestrally-matched reference panel increases genotype imputation accuracy in an underrepresented population.

Author information

Center of Excellence in Arrhythmia Research, Department of Medicine, Faculty of Medicine, Chulalongkorn University, Bangkok, Thailand.

Interdisciplinary Program of Biomedical Sciences, Graduate School, Chulalongkorn University, Bangkok, Thailand.

Publication information

Sci Rep. 2023 Jul 31;13(1):12360. doi: 10.1038/s41598-023-39429-3.

Abstract

Variant imputation, a common practice in genome-wide association studies, relies on reference panels to infer unobserved genotypes. Multiple public reference panels are available, differing in size, sequencing depth, and represented populations. However, limited data exist on how these panels perform when imputing populations that are underrepresented in them. Here, we compare the performance of four public reference panels, the 1000 Genomes Project, the Haplotype Reference Consortium, GenomeAsia 100K, and the recent Trans-Omics for Precision Medicine (TOPMed) program, when used to impute samples from the Thai population. Genotype yields were assessed, and imputation accuracy was examined by comparison with high-depth whole-genome sequencing data from the same samples. Imputation using the TOPMed panel yielded the largest number of variants (~271 million). Despite being the smallest panel, GenomeAsia 100K achieved the best imputation accuracy, with a median genotype concordance rate of 0.97. GenomeAsia 100K also offered the best accuracy for rare variants, although rare variants were imputed less accurately than common variants (30.3% reduction in concordance rates). The high accuracy observed with GenomeAsia 100K is likely attributable to its diverse representation of populations genetically similar to the study cohort, emphasizing the benefits of sequencing populations classically underrepresented in human genomics.
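As a rough illustration of the accuracy metric mentioned in the abstract, the sketch below computes a per-sample genotype concordance rate (the fraction of sites where the imputed genotype matches the sequenced genotype) and then takes the median across samples. The abstract does not give the authors' exact definition or pipeline, so the function names, the encoding of genotypes as alternate-allele counts, the handling of missing calls, and the toy data are all assumptions made for illustration.

```python
from statistics import median


def genotype_concordance(imputed, truth):
    """Fraction of sites where the imputed genotype equals the sequenced one.

    Assumption: genotypes are alternate-allele counts (0, 1, 2) and None
    marks a missing call; sites missing in either list are skipped.
    """
    if len(imputed) != len(truth):
        raise ValueError("genotype lists must cover the same sites")
    compared = [(i, t) for i, t in zip(imputed, truth)
                if i is not None and t is not None]
    if not compared:
        return float("nan")  # no overlapping calls to compare
    matches = sum(1 for i, t in compared if i == t)
    return matches / len(compared)


def median_concordance(imputed_by_sample, truth_by_sample):
    """Median of per-sample concordance rates across all imputed samples."""
    rates = [genotype_concordance(imputed_by_sample[s], truth_by_sample[s])
             for s in imputed_by_sample]
    return median(rates)


# Hypothetical toy data: three samples, four sites each (not from the study).
imputed_toy = {"S1": [0, 1, 2, 1], "S2": [0, 0, 2, None], "S3": [1, 1, 1, 1]}
truth_toy = {"S1": [0, 1, 2, 2], "S2": [0, 1, 2, 2], "S3": [1, 1, 1, 1]}
toy_median = median_concordance(imputed_toy, truth_toy)
```

Restricting the site lists to variants below a chosen minor allele frequency before calling `genotype_concordance` would give a rare-variant version of the same comparison; the frequency cutoff would be a further assumption, since the abstract does not state one.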

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/47a6/10390539/4579d5ebadb3/41598_2023_39429_Fig1_HTML.jpg
