The sequences of 150,119 genomes in the UK Biobank.

Author information

deCODE genetics/Amgen Inc., Reykjavik, Iceland.

School of Technology, Reykjavik University, Reykjavik, Iceland.

Publication information

Nature. 2022 Jul;607(7920):732-740. doi: 10.1038/s41586-022-04965-x. Epub 2022 Jul 20.

Abstract

Detailed knowledge of how diversity in the sequence of the human genome affects phenotypic diversity depends on a comprehensive and reliable characterization of both sequences and phenotypic variation. Over the past decade, insights into this relationship have been obtained from whole-exome sequencing or whole-genome sequencing of large cohorts with rich phenotypic data. Here we describe the analysis of whole-genome sequencing of 150,119 individuals from the UK Biobank. This constitutes a set of high-quality variants, including 585,040,410 single-nucleotide polymorphisms, representing 7.0% of all possible human single-nucleotide polymorphisms, and 58,707,036 indels. This large set of variants allows us to characterize selection based on sequence variation within a population through a depletion rank score of windows along the genome. Depletion rank analysis shows that coding exons represent a small fraction of regions in the genome subject to strong sequence conservation. We define three cohorts within the UK Biobank: a large British Irish cohort, a smaller African cohort and a South Asian cohort. A haplotype reference panel is provided that allows reliable imputation of most variants carried by three or more sequenced individuals. We identified 895,055 structural variants and 2,536,688 microsatellites, groups of variants typically excluded from large-scale whole-genome sequencing studies. Using this formidable new resource, we provide several examples of trait associations for rare variants with large effects not found previously through studies based on whole-exome sequencing and/or imputation.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4ee6/9329122/6068ad1abbc6/41586_2022_4965_Fig1_HTML.jpg
