Cartwright Reed A, Hussin Julie, Keebler Jonathan E M, Stone Eric A, Awadalla Philip
Arizona State University, AZ, USA.
Stat Appl Genet Mol Biol. 2012 Jan 6;11(2). doi: 10.2202/1544-6115.1713.
Recent advances in high-throughput DNA sequencing technologies and associated statistical analyses have enabled in-depth analysis of whole-genome sequences. As this technology is applied to a growing number of individual human genomes, entire families are now being sequenced. Information contained within the pedigree of a sequenced family can be leveraged when inferring the donors' genotypes. The presence of a de novo mutation within the pedigree is indicated by a violation of Mendelian inheritance laws. Here, we present a method for probabilistically inferring genotypes across a pedigree using high-throughput sequencing data and producing the posterior probability of de novo mutation at each genomic site examined. This framework can be used to disentangle the effects of germline and somatic mutational processes and to simultaneously estimate the effect of sequencing error and the initial genetic variation in the population from which the founders of the pedigree arise. This approach is examined in detail through simulations and areas for method improvement are noted. By applying this method to data from members of a well-defined nuclear family with accurate pedigree information, the stage is set to make the most direct estimates of the human mutation rate to date.
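To make the idea of a per-site posterior probability of de novo mutation concrete, the following is a minimal sketch for a single biallelic site in a mother-father-child trio. It is not the authors' model: the binomial read model, the Hardy-Weinberg founder prior, the single symmetric mutation probability `mu`, and every function name and default value here are assumptions chosen for illustration. Genotypes are coded as the number of alternate alleles (0, 1, or 2).

```python
from math import comb


def read_likelihood(alt, depth, genotype, error):
    """P(alt reads out of depth | genotype), binomial with symmetric error (assumed)."""
    p_alt = (genotype / 2) * (1 - error) + (1 - genotype / 2) * error
    return comb(depth, alt) * p_alt ** alt * (1 - p_alt) ** (depth - alt)


def founder_prior(genotype, alt_freq):
    """Hardy-Weinberg genotype prior for a pedigree founder (assumed)."""
    ref_freq = 1 - alt_freq
    return (ref_freq ** 2, 2 * ref_freq * alt_freq, alt_freq ** 2)[genotype]


def child_given_parents(child, mother, father, flip):
    """P(child genotype | parents), where each transmitted allele flips with prob. flip."""
    p_m = mother / 2  # chance the mother transmits the alternate allele
    p_f = father / 2  # chance the father transmits the alternate allele
    p_m = p_m * (1 - flip) + (1 - p_m) * flip
    p_f = p_f * (1 - flip) + (1 - p_f) * flip
    return (
        (1 - p_m) * (1 - p_f),
        p_m * (1 - p_f) + (1 - p_m) * p_f,
        p_m * p_f,
    )[child]


def de_novo_posterior(reads, alt_freq=0.001, mu=1e-8, error=0.005):
    """Posterior probability that at least one transmitted allele mutated.

    reads maps "mother", "father", "child" to (alt_reads, depth).
    The default parameter values are illustrative, not estimates.
    """
    total = 0.0        # P(data), summed over all genotype configurations
    no_mutation = 0.0  # P(data and neither transmission mutated)
    for m in range(3):
        for f in range(3):
            parents = (
                founder_prior(m, alt_freq)
                * founder_prior(f, alt_freq)
                * read_likelihood(*reads["mother"], m, error)
                * read_likelihood(*reads["father"], f, error)
            )
            for c in range(3):
                child_lik = read_likelihood(*reads["child"], c, error)
                total += parents * child_given_parents(c, m, f, mu) * child_lik
                no_mutation += (
                    parents
                    * (1 - mu) ** 2
                    * child_given_parents(c, m, f, 0.0)
                    * child_lik
                )
    # Clamp to guard against tiny negative values from floating-point rounding.
    return max(0.0, 1.0 - no_mutation / total)


# Parents look homozygous reference; the child looks heterozygous.
violation = {"mother": (0, 30), "father": (0, 30), "child": (15, 30)}
# The mother looks heterozygous, so the child's genotype is Mendelian.
mendelian = {"mother": (15, 30), "father": (0, 30), "child": (15, 30)}

p_violation = de_novo_posterior(violation)
p_mendelian = de_novo_posterior(mendelian)
print(f"apparent Mendelian violation: {p_violation:.6f}")
print(f"Mendelian-consistent site:    {p_mendelian:.2e}")
```

In this sketch the posterior is one minus the share of the data likelihood explained without any mutation, so alternative explanations such as an under-sampled heterozygous parent or sequencing error in the child are weighed against a genuine de novo event rather than ruled out by a hard Mendelian-violation filter.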