
Genotype error biases trio-based estimates of haplotype phase accuracy.

Affiliations

Department of Medicine, Division of Medical Genetics, University of Washington, Seattle, WA 98195, USA; Department of Biostatistics, University of Washington, Seattle, WA 98195, USA.

Department of Biostatistics, University of Washington, Seattle, WA 98195, USA.

Publication information

Am J Hum Genet. 2022 Jun 2;109(6):1016-1025. doi: 10.1016/j.ajhg.2022.04.019.

DOI: 10.1016/j.ajhg.2022.04.019
PMID: 35659928
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9247820/
Abstract

Haplotypes can be estimated from unphased genotype data via statistical methods. When parent-offspring trios are available for inferring the true phase from Mendelian inheritance rules, the accuracy of statistical phasing is usually measured by the switch error rate, which is the proportion of pairs of consecutive heterozygotes that are incorrectly phased. We present a method for estimating the genotype error rate from parent-offspring trios and a method for estimating the bias that occurs in the observed switch error rate as a result of genotype error. We apply these methods to 485,301 genotyped UK Biobank samples that include 898 White British trios and to 38,387 sequenced TOPMed samples that include 217 African Caribbean trios and 669 European American trios. We show that genotype error inflates the observed switch error rate and that the relative bias increases with sample size. For the UK Biobank White British trios, the observed switch error rate in the trio offspring is 2.4 times larger than the estimated true switch error rate (1.4 × 10⁻³ vs. 5.8 × 10⁻⁴). We propose an alternate definition of phase error that counts two consecutive switch errors as a single error because back-to-back switch errors arise when a single heterozygote is incorrectly phased with respect to the surrounding heterozygotes. With this definition, we estimate that the average distance between phase errors is 64 megabases in the UK Biobank White British individuals.
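The two error definitions in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes each haplotype is given as the 0/1 allele carried on the first chromosome copy at each heterozygous site, in genomic order, with the trio-derived phase taken as truth. The function names are hypothetical, and the left-to-right pairing of back-to-back switch errors is an assumption about how a run of three or more would be counted, since the abstract does not specify that case.

```python
from typing import List, Sequence


def switch_errors(true_hap: Sequence[int], est_hap: Sequence[int]) -> List[bool]:
    """Flag each pair of consecutive heterozygotes that is phased incorrectly.

    Both inputs hold the allele (0 or 1) on the first haplotype at each
    heterozygous site, in genomic order (an assumed input format).
    """
    if len(true_hap) != len(est_hap):
        raise ValueError("haplotypes must cover the same heterozygous sites")
    # True where the estimated orientation agrees with the true orientation.
    agrees = [t == e for t, e in zip(true_hap, est_hap)]
    # A switch error is a change of orientation between neighbouring sites.
    return [agrees[i] != agrees[i + 1] for i in range(len(agrees) - 1)]


def switch_error_rate(flags: Sequence[bool]) -> float:
    """Proportion of consecutive heterozygote pairs that are switch errors."""
    return sum(flags) / len(flags) if flags else 0.0


def count_phase_errors(flags: Sequence[bool]) -> int:
    """Count phase errors, treating two back-to-back switch errors as one.

    Assumption: pairs are formed greedily from left to right, so a run of
    three switch errors counts as two phase errors.
    """
    count = 0
    i = 0
    while i < len(flags):
        if flags[i]:
            count += 1
            # A second switch error immediately after the first means a
            # single heterozygote was flipped; skip it so it is not recounted.
            if i + 1 < len(flags) and flags[i + 1]:
                i += 2
                continue
        i += 1
    return count


if __name__ == "__main__":
    truth = [0, 0, 0, 0, 0, 0]
    estimate = [0, 0, 1, 0, 0, 0]  # only the third heterozygote is flipped
    flags = switch_errors(truth, estimate)
    print(sum(flags), switch_error_rate(flags), count_phase_errors(flags))
```

In this toy input one mis-phased heterozygote produces two switch errors but only one phase error, which is the situation the alternate definition is meant to capture.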



