Chitra Uthsav, Arnold Brian J, Raphael Benjamin J
bioRxiv. 2024 Jul 19:2024.07.17.603976. doi: 10.1101/2024.07.17.603976.
Epistasis, or interactions in which alleles at one locus modify the fitness effects of alleles at other loci, plays a fundamental role in genetics, protein evolution, and many other areas of biology. Epistasis is typically quantified by computing the deviation from the expected fitness under an additive or multiplicative model using one of several formulae. However, these formulae are not all equivalent. Importantly, one widely used formula, which we call the chimeric formula, measures deviations from a multiplicative fitness model on an additive scale, thus mixing two measurement scales. We show that for pairwise interactions, the chimeric formula yields a different magnitude, but the same sign (synergistic vs. antagonistic), of epistasis compared to the multiplicative formula, which measures both fitness and deviations on a multiplicative scale. However, for higher-order interactions, we show that the chimeric formula can differ in both magnitude and sign from the multiplicative formula, thus confusing negative epistatic interactions with positive interactions, and vice versa. We resolve these inconsistencies by deriving fundamental connections between the different epistasis formulae and the parameters of the multivariate Bernoulli distribution. Our results demonstrate that the additive and multiplicative epistasis formulae are more mathematically sound than the chimeric formula. Moreover, we demonstrate that the mathematical issues with the chimeric epistasis formula lead to markedly different biological interpretations of real data. Analyzing multi-gene knockout data in yeast, multi-way drug interactions in E. coli, and deep mutational scanning (DMS) of several proteins, we find that 10-60% of higher-order interactions have a change in sign with the multiplicative or additive epistasis formula. These sign changes result in qualitatively different findings on functional divergence in the yeast genome, synergistic vs. antagonistic drug interactions, and epistasis between protein mutations.
In particular, in the yeast data, the more appropriate multiplicative formula identifies nearly 500 additional negative three-way interactions, thus extending the trigenic interaction network by 25%.
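The contrast between the chimeric and multiplicative formulae can be illustrated with a small numerical sketch. The abstract does not spell out the formulae, so the forms below are assumptions based on how these measures are commonly written: fitness values are taken to be normalized so that the wild type has fitness 1, the multiplicative measure is reported on a log scale (so that zero means no epistasis), and all function names and fitness values are hypothetical, chosen only for illustration.

```python
import math

# Assumption: wild-type fitness is normalized to 1, so it drops out of every formula.

def pairwise_chimeric(f1, f2, f12):
    """Deviation from the multiplicative expectation f1*f2, measured additively."""
    return f12 - f1 * f2

def pairwise_multiplicative(f1, f2, f12):
    """Log of the ratio between observed and multiplicative-expected fitness."""
    return math.log(f12 / (f1 * f2))

def threeway_chimeric(f1, f2, f3, f12, f13, f23, f123):
    """Three-way chimeric form: subtract the multiplicative expectation and the
    pairwise chimeric terms, each scaled by the remaining single-mutant fitness."""
    e12 = pairwise_chimeric(f1, f2, f12)
    e13 = pairwise_chimeric(f1, f3, f13)
    e23 = pairwise_chimeric(f2, f3, f23)
    return f123 - f1 * f2 * f3 - e12 * f3 - e13 * f2 - e23 * f1

def threeway_multiplicative(f1, f2, f3, f12, f13, f23, f123):
    """Three-way interaction on the log scale (inclusion-exclusion of log fitness)."""
    return math.log(f123 * f1 * f2 * f3 / (f12 * f13 * f23))

# Pairwise case: the two measures differ in magnitude but share a sign.
pair_c = pairwise_chimeric(0.8, 0.7, 0.4)        # negative
pair_m = pairwise_multiplicative(0.8, 0.7, 0.4)  # negative

# Hypothetical three-way case in which the two measures disagree in sign.
fitness = dict(f1=0.5, f2=0.5, f3=0.5, f12=0.5, f13=0.5, f23=0.5, f123=0.6)
tri_c = threeway_chimeric(**fitness)        # positive
tri_m = threeway_multiplicative(**fitness)  # negative

print(f"pairwise:  chimeric={pair_c:+.3f}  multiplicative={pair_m:+.3f}")
print(f"three-way: chimeric={tri_c:+.3f}  multiplicative={tri_m:+.3f}")
```

In the pairwise case both values are negative, consistent with the claim that the sign agrees for pairwise interactions. In the three-way case the chimeric value is positive while the multiplicative value is negative, which is the kind of sign flip the abstract describes for higher-order interactions.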