GCAT|Panel, a comprehensive structural variant haplotype map of the Iberian population from high-coverage whole-genome sequencing.

Affiliations

Life Sciences Department, Barcelona Supercomputing Center (BSC), Barcelona 08034, Spain.

Genomes for Life-GCAT lab Group, Institute for Health Science Research Germans Trias i Pujol (IGTP), Badalona 08916, Spain.

Publication information

Nucleic Acids Res. 2022 Mar 21;50(5):2464-2479. doi: 10.1093/nar/gkac076.

Abstract

The combined analysis of haplotype panels with phenotype clinical cohorts is a common approach to explore the genetic architecture of human diseases. However, genetic studies are mainly based on single nucleotide variants (SNVs) and small insertions and deletions (indels). Here, we help fill this gap by generating a dense haplotype map focused on the identification, characterization, and phasing of structural variants (SVs). By integrating multiple variant identification methods and Logistic Regression Models (LRMs), we present a catalogue of 35 431 441 variants, including 89 178 SVs (≥50 bp), 30 325 064 SNVs and 5 017 199 indels, across 785 Illumina high-coverage (30x) whole genomes from the Iberian GCAT Cohort, containing a median of 3.52M SNVs, 606 336 indels and 6393 SVs per individual. The haplotype panel is able to impute up to 14 360 728 SNVs/indels and 23 179 SVs, a 2.7-fold increase for SVs compared with available genetic variation panels. The value of this panel for SV analysis is shown through an imputed rare Alu element located in a new locus associated with mononeuritis of lower limb, a rare neuromuscular disease. This study represents the first deep characterization of genetic variation within the Iberian population and the first operational haplotype panel to systematically include SVs in genome-wide genetic studies.
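The counts quoted in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is only a reader's sanity check, not part of the authors' pipeline. The variable names are illustrative. The "imputable fraction" ratios are derived here on the assumption that the imputable variants are a subset of the catalogue, and they are not figures reported by the study.

```python
# Variant counts quoted in the abstract.
catalogue = {
    "SVs": 89_178,
    "SNVs": 30_325_064,
    "indels": 5_017_199,
}
reported_total = 35_431_441

# The three classes should add up to the reported catalogue size.
total = sum(catalogue.values())
assert total == reported_total

# Share of each variant class in the catalogue, as a percentage.
shares = {name: 100 * count / total for name, count in catalogue.items()}

# Upper bounds on imputable variants quoted in the abstract ("up to ...").
imputable_small = 14_360_728  # SNVs and indels combined
imputable_svs = 23_179

# Derived ratios (an assumption of this sketch, not reported values):
# imputable variants divided by catalogued variants of the same class.
small_fraction = imputable_small / (catalogue["SNVs"] + catalogue["indels"])
sv_fraction = imputable_svs / catalogue["SVs"]

for name, pct in shares.items():
    print(f"{name}: {pct:.2f}% of catalogue")
print(f"imputable SNVs/indels: {small_fraction:.1%} of catalogued")
print(f"imputable SVs: {sv_fraction:.1%} of catalogued")
```

The first assertion confirms that the three per-class counts sum exactly to the reported total of 35 431 441 variants.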

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/66c7/8934637/ebdc641307c1/gkac076fig1.jpg