Biobank-scale inference of multi-individual identity by descent and gene conversion.

Author affiliations

Department of Biostatistics, University of Washington, Seattle, WA, USA.

Department of Biostatistics, University of Washington, Seattle, WA, USA; Division of Medical Genetics, Department of Medicine, University of Washington, Seattle, WA, USA.

Publication information

Am J Hum Genet. 2024 Apr 4;111(4):691-700. doi: 10.1016/j.ajhg.2024.02.015. Epub 2024 Mar 20.

Abstract

We present a method for efficiently identifying clusters of identical-by-descent haplotypes in biobank-scale sequence data. Our multi-individual approach enables much more computationally efficient inference of identity by descent (IBD) than approaches that infer pairwise IBD segments and provides locus-specific IBD clusters rather than IBD segments. Our method's computation time, memory requirements, and output size scale linearly with the number of individuals in the dataset. We also present a method for using multi-individual IBD to detect alleles changed by gene conversion. Application of our methods to the autosomal sequence data for 125,361 White British individuals in the UK Biobank detects more than 9 million converted alleles. This is 2,900 times more alleles changed by gene conversion than were detected in a previous analysis of familial data. We estimate that more than 250,000 sequenced probands and a much larger number of additional genomes from multi-generational family members would be required to find a similar number of alleles changed by gene conversion using a family-based approach. Our IBD clustering method is implemented in the open-source ibd-cluster software package.
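To make the distinction between pairwise IBD segments and locus-specific IBD clusters concrete, here is a minimal sketch. It is not the algorithm used by ibd-cluster, which avoids enumerating pairwise segments. The function name `clusters_at_locus`, the tuple layout `(hap_a, hap_b, start, end)`, and the toy coordinates are assumptions made for illustration only. The sketch merges haplotypes that share a segment covering a given position, using a union-find structure.

```python
from collections import defaultdict


def clusters_at_locus(segments, n_haplotypes, position):
    """Group haplotypes into IBD clusters at one position (illustrative only).

    segments: iterable of (hap_a, hap_b, start, end) pairwise IBD segments,
    a hypothetical input format with inclusive coordinates.
    """
    parent = list(range(n_haplotypes))

    def find(x):
        # Follow parent links to the root, shortening the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b, start, end in segments:
        # Only segments that cover the position link two haplotypes here.
        if start <= position <= end:
            root_a, root_b = find(a), find(b)
            if root_a != root_b:
                parent[root_b] = root_a

    groups = defaultdict(list)
    for hap in range(n_haplotypes):
        groups[find(hap)].append(hap)
    # Report only clusters with at least two haplotypes, in a stable order.
    return sorted(g for g in groups.values() if len(g) > 1)


# Toy data: 6 haplotypes with made-up segment coordinates.
segments = [
    (0, 1, 100, 500),
    (1, 2, 300, 900),
    (0, 2, 300, 500),
    (3, 4, 50, 250),
]
print(clusters_at_locus(segments, 6, 400))
print(clusters_at_locus(segments, 6, 200))
```

A cluster of k haplotypes at a locus stands in for up to k(k-1)/2 pairwise relationships, which is one way to see why cluster output can stay linear in the number of individuals while pairwise output need not.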
