Department of Genome Sciences, University of Washington, Seattle, WA, USA.
Institute for Human Genetics, University of California, San Francisco, CA, USA.
Mol Biol Evol. 2023 Oct 4;40(10). doi: 10.1093/molbev/msad213.
Although evolutionary biologists have long theorized that variation in DNA repair efficacy might explain some of the diversity of lifespan and cancer incidence across species, we have little data on the variability of normal germline mutagenesis outside of humans. Here, we shed light on the spectrum and etiology of mutagenesis across mammals by quantifying mutational sequence context biases using polymorphism data from thirteen species of mice, apes, bears, wolves, and cetaceans. After normalizing the mutation spectrum for reference genome accessibility and k-mer content, we use the Mantel test to deduce that mutation spectrum divergence is highly correlated with genetic divergence between species, whereas life history traits like reproductive age are weaker predictors of mutation spectrum divergence. Potential bioinformatic confounders are only weakly related to a small set of mutation spectrum features. We find that clock-like mutational signatures previously inferred from human cancers cannot explain the phylogenetic signal exhibited by the mammalian mutation spectrum, despite the ability of these signatures to fit each species' 3-mer spectrum with high cosine similarity. In contrast, parental aging signatures inferred from human de novo mutation data appear to explain much of the 1-mer spectrum's phylogenetic signal in combination with a novel mutational signature. We posit that future models purporting to explain the etiology of mammalian mutagenesis need to capture the fact that more closely related species have more similar mutation spectra; a model that fits each marginal spectrum with high cosine similarity is not guaranteed to capture this hierarchy of mutation spectrum variation among species.
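The Mantel test mentioned in the abstract can be illustrated with a minimal sketch. It correlates the off-diagonal entries of two distance matrices and assesses significance by permuting species labels in one of them. Everything below is an assumption made for illustration, not the authors' pipeline: the function names `cosine_distance_matrix` and `mantel_test`, the use of cosine distance as the measure of spectrum divergence, the Pearson statistic, the one-sided p-value, and the toy data.

```python
import numpy as np


def cosine_distance_matrix(spectra):
    """Pairwise cosine distances between rows of a (species x mutation type) array."""
    spectra = np.asarray(spectra, dtype=float)
    unit = spectra / np.linalg.norm(spectra, axis=1, keepdims=True)
    dist = 1.0 - np.clip(unit @ unit.T, -1.0, 1.0)
    np.fill_diagonal(dist, 0.0)
    return dist


def mantel_test(dist_a, dist_b, n_perm=999, seed=0):
    """Permutation Mantel test: Pearson r between two distance matrices.

    Returns (r_observed, one_sided_p_value). Rows and columns of dist_a are
    permuted together so that each permutation is a relabeling of species.
    """
    dist_a = np.asarray(dist_a, dtype=float)
    dist_b = np.asarray(dist_b, dtype=float)
    n = dist_a.shape[0]
    upper = np.triu_indices(n, k=1)  # each species pair counted once
    b_flat = dist_b[upper]
    r_obs = np.corrcoef(dist_a[upper], b_flat)[0, 1]

    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_perm):
        p = rng.permutation(n)
        permuted = dist_a[np.ix_(p, p)]
        if np.corrcoef(permuted[upper], b_flat)[0, 1] >= r_obs:
            hits += 1
    # Count the observed arrangement as one of the permutations.
    return r_obs, (hits + 1) / (n_perm + 1)


# Hypothetical toy data: six "species" placed on a line, so that genetic
# distance is the absolute difference of their positions, and each spectrum
# drifts away from a shared baseline in proportion to position.
positions = np.array([0.0, 1.0, 2.0, 5.0, 6.0, 7.0])
genetic_dist = np.abs(positions[:, None] - positions[None, :])
baseline = np.array([0.3, 0.2, 0.2, 0.1, 0.1, 0.1])
drift = np.array([0.02, -0.01, -0.01, 0.0, 0.0, 0.0])
spectra = baseline + positions[:, None] * drift

spectrum_dist = cosine_distance_matrix(spectra)
r, p_value = mantel_test(spectrum_dist, genetic_dist)
print(f"Mantel r = {r:.3f}, permutation p = {p_value:.3f}")
```

Permuting rows and columns jointly, rather than shuffling individual matrix entries, preserves the dependence among the pairwise distances that share a species, which is the reason a Mantel test is used in place of an ordinary correlation test on the pairs.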