A study of the efficiency of pooling in haplotype estimation.

Affiliation

Department of Statistics and Applied Probability, National University of Singapore,117546 Singapore.

Publication information

Bioinformatics. 2010 Oct 15;26(20):2556-63. doi: 10.1093/bioinformatics/btq492. Epub 2010 Aug 27.

DOI:10.1093/bioinformatics/btq492
PMID:20801910
Abstract

MOTIVATION

It has been claimed in the literature that pooling DNA samples is efficient in estimating haplotype frequencies. There is, however, no theoretical justification based on calculation of statistical efficiency. In fact, the limited evidence given so far is based on simulation studies with small numbers of loci. With rapid advance in technology, it is of interest to see if pooling is still efficient when the number of loci increases.

METHODS

Instead of resorting to simulation studies, we make use of asymptotic statistical theory to perform exact calculation of the efficiency of pooling relative to no pooling in the estimation of haplotype frequencies. As an intermediate step, we use the log-linear formulation of the haplotype probabilities and derive the asymptotic variance-covariance matrix of the maximum likelihood estimators of the canonical parameters of the log-linear model.
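As a rough illustration of the log-linear formulation mentioned above, the sketch below writes haplotype probabilities over biallelic loci as log p(h) = b0 + sum_j b_j h_j, a main-effects-only model. With b_j set to the log-odds of the minor allele at locus j, this reproduces the product-of-allele-frequencies probabilities implied by linkage equilibrium. The function names and the 0/1 coding of alleles are assumptions made for this example; the abstract does not state the authors' exact parameterization.

```python
import itertools
import math


def loglinear_haplotype_probs(main_effects):
    """Haplotype probabilities from a main-effects-only log-linear model.

    Assumed coding: each haplotype is a tuple of 0/1 values (1 = minor allele),
    and log p(h) = b0 + sum_j b_j * h_j, with b0 fixed by normalisation.
    """
    haps = list(itertools.product([0, 1], repeat=len(main_effects)))
    unnormalised = [
        math.exp(sum(b * x for b, x in zip(main_effects, h))) for h in haps
    ]
    total = sum(unnormalised)
    return {h: u / total for h, u in zip(haps, unnormalised)}


def equilibrium_haplotype_probs(maf):
    """Haplotype probabilities under linkage equilibrium (independent loci)."""
    haps = itertools.product([0, 1], repeat=len(maf))
    return {
        h: math.prod(q if x else 1.0 - q for q, x in zip(maf, h)) for h in haps
    }


# Illustrative minor allele frequencies (not taken from the paper).
maf = [0.05, 0.2, 0.4]
main_effects = [math.log(q / (1.0 - q)) for q in maf]

from_loglinear = loglinear_haplotype_probs(main_effects)
from_product = equilibrium_haplotype_probs(maf)
max_diff = max(abs(from_loglinear[h] - from_product[h]) for h in from_product)
print("largest discrepancy between the two forms:", max_diff)
```

Interaction terms between loci, omitted here, are what would carry linkage disequilibrium in such a model.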

RESULTS

Based on our calculations under linkage equilibrium, pooling can suffer huge loss in efficiency relative to no pooling when there are more than three independent loci and the alleles are not rare. Pooling works better for rare alleles. In particular, if all the minor allele frequencies are 0.05, pooling maintains an advantage over no pooling until the number of independent loci reaches 6. High linkage disequilibrium effectively reduces the number of independent loci by ruling out certain haplotypes from occurring. Similar calculations of efficiency for the case of no pooling justify the common belief that it is not worthwhile to use molecular methods to resolve the phase ambiguity of individual genotype data.
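To make the idea of an efficiency calculation concrete, here is a minimal sketch under stated assumptions; it is not the authors' R code. It assumes that a pool of n haplotypes is observed only through its per-locus minor-allele counts, computes the Fisher information about the haplotype frequencies from that count distribution, and compares summed asymptotic variances (the trace of the inverse information) per genotyped sample. That efficiency measure, the function names, and the allele frequencies are all choices made for this example; the abstract does not give the paper's exact definition.

```python
import itertools
import math


def equilibrium_freqs(maf):
    """Haplotype frequencies under linkage equilibrium (1 = minor allele)."""
    haps = itertools.product([0, 1], repeat=len(maf))
    return {h: math.prod(q if x else 1.0 - q for q, x in zip(maf, h)) for h in haps}


def count_distribution(freqs, m):
    """Distribution of per-locus minor-allele counts summed over m haplotypes."""
    dist = {(0,) * len(next(iter(freqs))): 1.0}
    for _ in range(m):
        new = {}
        for y, py in dist.items():
            for h, ph in freqs.items():
                key = tuple(a + b for a, b in zip(y, h))
                new[key] = new.get(key, 0.0) + py * ph
        dist = new
    return dist


def pooled_information(freqs, n):
    """Fisher information from one pool of n haplotypes seen as allele counts.

    Parameters are all haplotype frequencies except the last one, which is
    fixed by the sum-to-one constraint.
    """
    haps = list(freqs)
    total = count_distribution(freqs, n)
    prev = count_distribution(freqs, n - 1)

    def deriv(h, y):
        # d P_n(y) / d p_h = n * P_{n-1}(y - h) before applying the constraint.
        return n * prev.get(tuple(a - b for a, b in zip(y, h)), 0.0)

    def score(h, y):
        return deriv(h, y) - deriv(haps[-1], y)

    free = haps[:-1]
    return [
        [sum(score(a, y) * score(b, y) / py for y, py in total.items()) for b in free]
        for a in free
    ]


def phased_information(freqs, n):
    """Fisher information from n fully phased haplotypes (multinomial)."""
    p = list(freqs.values())
    k = len(p) - 1
    return [[n * ((1.0 / p[i] if i == j else 0.0) + 1.0 / p[-1]) for j in range(k)]
            for i in range(k)]


def trace_of_inverse(matrix):
    """Trace of the matrix inverse via Gauss-Jordan elimination."""
    size = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(size)]
           for i, row in enumerate(matrix)]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(size):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return sum(aug[i][size + i] for i in range(size))


# Two loci with illustrative minor allele frequencies (not from the paper).
freqs = equilibrium_freqs([0.3, 0.3])
var_phased = trace_of_inverse(phased_information(freqs, 2))
var_unphased = trace_of_inverse(pooled_information(freqs, 2))      # one individual
var_pool_of_two = trace_of_inverse(pooled_information(freqs, 4))   # two individuals

phase_efficiency = var_phased / var_unphased          # unphased vs. phased data
pooling_efficiency = var_unphased / var_pool_of_two   # pool of two vs. one individual
print("unphased relative to phased:", phase_efficiency)
print("pool of two relative to no pooling:", pooling_efficiency)
```

Under this definition, a pooling efficiency above 1 means that one pooled sample is more informative than one individual genotype, and the enumeration grows quickly with the number of loci and the pool size, so the sketch is only practical for small cases.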

AVAILABILITY

The R codes for the calculation are available at http://www.stat.nus.edu.sg/~staxj/pooling

CONTACT

stakuka@nus.edu.sg.

Similar articles

1
A study of the efficiency of pooling in haplotype estimation.
Bioinformatics. 2010 Oct 15;26(20):2556-63. doi: 10.1093/bioinformatics/btq492. Epub 2010 Aug 27.
2
Estimating haplotype-disease associations with pooled genotype data.
Genet Epidemiol. 2005 Jan;28(1):70-82. doi: 10.1002/gepi.20040.
3
On the use of DNA pooling to estimate haplotype frequencies.
Genet Epidemiol. 2003 Jan;24(1):74-82. doi: 10.1002/gepi.10195.
4
Maximum likelihood estimation of haplotype effects and haplotype-environment interactions in association studies.
Genet Epidemiol. 2005 Dec;29(4):299-312. doi: 10.1002/gepi.20098.
5
Incorporating genotyping uncertainty in haplotype frequency estimation in pedigree studies.
Hum Hered. 2007;64(3):172-81. doi: 10.1159/000102990. Epub 2007 May 25.
6
Haplotype frequency estimation in patient populations: the effect of departures from Hardy-Weinberg proportions and collapsing over a locus in the HLA region.
Genet Epidemiol. 2002 Feb;22(2):186-95. doi: 10.1002/gepi.0163.
7
Estimating haplotype relative risks in complex disease from unphased SNPs data in families using a likelihood adjusted for ascertainment.
Genet Epidemiol. 2006 Dec;30(8):666-76. doi: 10.1002/gepi.20178.
8
Maximum-likelihood estimation of haplotype frequencies in nuclear families.
Genet Epidemiol. 2004 Jul;27(1):21-32. doi: 10.1002/gepi.10323.
9
Efficiency of DNA pooling to estimate joint allele frequencies and measure linkage disequilibrium.
Genet Epidemiol. 2002 Jan;22(1):94-102. doi: 10.1002/gepi.1046.
10
Power of direct vs. indirect haplotyping in association studies.
Genet Epidemiol. 2004 Feb;26(2):116-24. doi: 10.1002/gepi.10300.

Cited by

1
An EM algorithm based on an internal list for estimating haplotype distributions of rare variants from pooled genotype data.
BMC Genet. 2013 Sep 13;14:82. doi: 10.1186/1471-2156-14-82.
2
Maximum-parsimony haplotype frequencies inference based on a joint constrained sparse representation of pooled DNA.
BMC Bioinformatics. 2013 Sep 8;14:270. doi: 10.1186/1471-2105-14-270.
3
Cost-effective genome-wide estimation of allele frequencies from pooled DNA in Atlantic salmon (Salmo salar L.).
BMC Genomics. 2013 Jan 16;14:12. doi: 10.1186/1471-2164-14-12.
4
Fast and accurate haplotype frequency estimation for large haplotype vectors from pooled DNA data.
BMC Genet. 2012 Oct 30;13:94. doi: 10.1186/1471-2156-13-94.