Zhu Yuan O, Sherlock Gavin, Petrov Dmitri A
Department of Genetics, Stanford University, Stanford, CA, United States of America.
Department of Biology, Stanford University, Stanford, CA, United States of America.
PLoS Genet. 2017 Jan 3;13(1):e1006455. doi: 10.1371/journal.pgen.1006455. eCollection 2017 Jan.
The characterization of mutational spectra is usually carried out in one of three ways: by direct observation through mutation accumulation (MA) experiments, through parent-offspring sequencing, or by indirect inference from sequence data. Direct observations of spontaneous mutations with MA experiments are limited, given (i) the rarity of spontaneous mutations, (ii) applicability only to laboratory model species with short generation times, and (iii) the possibility that mutational spectra under lab conditions might differ from those observed in nature. Trio sequencing is an elegant solution, but it is not applicable in all organisms. Indirect inference, usually from divergence data, faces no such technical limitations, but relies upon critical assumptions regarding the strength of natural selection that are likely to be violated. Ideally, new mutational events would be directly observed before the biased filter of selection, and without the technical limitations common to lab experiments. One approach is to identify very young mutations from population sequencing data. Here we do so by leveraging two characteristics common to all new mutations: they are necessarily rare in the population, and they are absent from the genomes of immediate relatives. From 132 clinical yeast strains, we were able to identify 1,425 putatively new mutations and show that they exhibit extremely weak signatures of selection and display a mutational spectrum similar to that identified by a large-scale MA experiment. We verify that population sequencing data are a potential wealth of information for inferring mutational spectra, and should be considered for analysis where MA experiments are infeasible or especially tedious.
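The two criteria named in the abstract (rarity in the population, absence from immediate relatives) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `candidate_new_mutations`, the `max_carriers` threshold, the variant identifiers, and the toy `relatives` mapping are all hypothetical, and a real analysis would also need variant calling, quality filtering, and a principled definition of relatedness.

```python
from collections import Counter

def candidate_new_mutations(genotypes, relatives, max_carriers=2):
    """Return {strain: variants} that are rare and absent from the strain's relatives.

    genotypes: dict mapping strain name -> set of variant identifiers (hypothetical format)
    relatives: dict mapping strain name -> list of closely related strains (assumed known)
    max_carriers: illustrative rarity threshold, not a value taken from the paper
    """
    # Count how many strains carry each variant across the whole population.
    carrier_counts = Counter(v for variants in genotypes.values() for v in variants)

    candidates = {}
    for strain, variants in genotypes.items():
        # Collect every variant seen in this strain's close relatives.
        in_relatives = set()
        for rel in relatives.get(strain, ()):
            in_relatives |= genotypes.get(rel, set())

        # Keep variants that are rare overall and not shared with relatives.
        candidates[strain] = {
            v for v in variants
            if carrier_counts[v] <= max_carriers and v not in in_relatives
        }
    return candidates

# Toy data: four strains forming two closely related pairs (A-B and C-D).
genotypes = {
    "A": {"chr1:100:C>T", "chr2:50:G>A", "chr3:7:A>G"},
    "B": {"chr1:100:C>T", "chr2:50:G>A"},
    "C": {"chr2:50:G>A", "chr3:7:A>G"},
    "D": {"chr2:50:G>A"},
}
relatives = {"A": ["B"], "B": ["A"], "C": ["D"], "D": ["C"]}

result = candidate_new_mutations(genotypes, relatives)
```

In this toy example, the variant shared by the related pair A and B is excluded because a relative carries it, and the variant present in all four strains is excluded because it is common. Only the variant that is rare and missing from each carrier's relative is retained.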