Biomathematics Research Group, Department of Mathematics, University of Turku, Turku, Finland.
PLoS One. 2010 Jul 15;5(7):e11611. doi: 10.1371/journal.pone.0011611.
Recent technological developments in genetic screening approaches have offered the means to start exploring quantitative genotype-phenotype relationships on a large scale. What remains unclear is the extent to which quantitative genetic interaction datasets can distinguish the broad spectrum of interaction classes, as compared to existing information on mutation pairs associated with both positive and negative interactions, and whether the scoring of varying degrees of such epistatic effects could be improved by computational means. To address these questions, we introduce here a computational approach for improving the quantitative discrimination power encoded in genetic interaction screening data. Our matrix approximation model decomposes the original double-mutant fitness matrix into separate components, representing variability across the array and query mutants, which can be used to estimate and correct for the respective single-mutant fitness effects. When applied to three large-scale quantitative interaction datasets in yeast, the approach improved the accuracy of scoring various interaction classes beyond that obtained with the original fitness data, especially in the synthetic genetic array (SGA) and genetic interaction mapping (GIM) datasets. In addition to the known interaction pairs used in the evaluation of the computational approach, a number of novel interaction pairs were also predicted, along with underlying biological mechanisms, which remained undetected by the original datasets. The optimal choice of scoring function was found to depend heavily on the screening approach and on the interaction class under analysis. Moreover, a simple preprocessing of the fitness matrix could further enhance the discrimination power of the epistatic miniarray profiling (E-MAP) dataset. These systematic evaluation results provide in-depth information on the optimal analysis of future large-scale screening experiments. In general, the modeling framework, enabling accurate identification and classification of genetic interactions, provides a solid basis for completing and mining the genetic interaction networks in yeast and other organisms.
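The abstract does not spell out the form of the matrix approximation, so the following is only an illustrative sketch of the general idea, not the authors' model. It assumes a multiplicative null model, in which the expected double-mutant fitness is the product of a query effect and an array effect, and it swaps in a plain rank-one singular value decomposition as the approximation. The function name `rank_one_interaction_scores` and the toy numbers are hypothetical.

```python
import numpy as np


def rank_one_interaction_scores(fitness):
    """Score interactions as deviations from a rank-one fit (illustrative only).

    fitness: (n_query, n_array) matrix of double-mutant fitness values.
    Returns (query_effects, array_effects, scores), where
    scores = fitness - outer(query_effects, array_effects).
    """
    fitness = np.asarray(fitness, dtype=float)
    u, s, vt = np.linalg.svd(fitness, full_matrices=False)

    # The leading singular triplet is the best rank-one approximation
    # in the least-squares sense (Eckart-Young theorem).
    query_effects = u[:, 0] * np.sqrt(s[0])
    array_effects = vt[0, :] * np.sqrt(s[0])

    # Singular vectors are defined only up to a joint sign flip;
    # fitness is non-negative, so report the positive orientation.
    if query_effects.sum() < 0:
        query_effects, array_effects = -query_effects, -array_effects

    # Assumption: the two effect vectors are identified only up to a
    # scalar factor, so they are proxies for single-mutant fitness,
    # not calibrated estimates of it.
    expected = np.outer(query_effects, array_effects)
    scores = fitness - expected  # negative: aggravating, positive: alleviating
    return query_effects, array_effects, scores


# Toy data: purely multiplicative fitness plus one aggravating pair.
query = np.array([1.0, 0.8, 0.6])
array = np.array([1.0, 0.9, 0.7, 0.5])
toy = np.outer(query, array)
toy[1, 2] -= 0.3  # hypothetical negative interaction

_, _, toy_scores = rank_one_interaction_scores(toy)
strongest_negative = np.unravel_index(np.argmin(toy_scores), toy_scores.shape)
```

In this toy setting the most negative score falls on the perturbed pair, but part of the perturbation is absorbed into the fitted effects, so the score is smaller in magnitude than the injected deviation. A robust or iterative fit would reduce that leakage; whether the published model does so is not stated in the abstract.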