Ness Robert O, Sachs Karen, Mallick Parag, Vitek Olga
1 Department of Statistics, Purdue University, West Lafayette, Indiana.
2 Department of Immunology, School of Medicine, Stanford University, Palo Alto, California.
J Comput Biol. 2018 Jul;25(7):709-725. doi: 10.1089/cmb.2017.0247. Epub 2018 Jun 21.
Machine learning methods for learning network structure are applied to quantitative proteomics experiments to reverse-engineer intracellular signal transduction networks. They provide insight into the rewiring of signaling in the context of a disease or a phenotype. To learn the causal patterns of influence between proteins in the network, the methods require experiments with targeted interventions that fix the activity of specific proteins. However, such interventions are costly and add experimental complexity. We describe an active learning strategy for selecting optimal interventions. Our approach takes pathway databases and historic data sets as inputs, expresses them in the form of prior probability distributions on network structures, and selects the interventions that maximize the expected contribution to structure learning. Evaluations on simulated and real data show that the strategy reduces the detection error of validated edges compared with an unguided choice of interventions and avoids redundant interventions, thereby increasing the effectiveness of the experiment.
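The selection step can be illustrated with a toy sketch. It is not the authors' algorithm: it assumes a small, explicitly enumerated set of candidate network structures with hypothetical prior weights, and an idealized intervention on a protein that reveals exactly which proteins lie downstream of it. Under those assumptions, the expected contribution of an intervention is the entropy of its outcome under the prior. The names `descendants`, `expected_information_gain`, and `select_intervention` are illustrative only.

```python
import math
from collections import defaultdict


def descendants(dag, node):
    """Return the nodes reachable from `node`; `dag` maps node -> set of children."""
    seen, stack = set(), [node]
    while stack:
        for child in dag.get(stack.pop(), ()):
            if child not in seen:
                seen.add(child)
                stack.append(child)
    return frozenset(seen)


def expected_information_gain(candidates, target):
    """Entropy (in bits) of the idealized intervention outcome under the prior.

    Assumption: intervening on `target` reveals its descendant set exactly,
    so the outcome is a deterministic function of the true structure and the
    information gained about the structure equals the outcome entropy.
    """
    outcome_prob = defaultdict(float)
    for dag, prior in candidates:
        outcome_prob[descendants(dag, target)] += prior
    return -sum(p * math.log2(p) for p in outcome_prob.values() if p > 0)


def select_intervention(candidates, targets):
    """Pick the target whose intervention is expected to be most informative."""
    return max(targets, key=lambda t: expected_information_gain(candidates, t))


# Hypothetical prior over three candidate structures on proteins A, B, C.
candidates = [
    ({"A": {"B"}, "B": {"C"}}, 0.5),  # A -> B -> C
    ({"B": {"A"}, "A": {"C"}}, 0.3),  # B -> A -> C
    ({"A": {"B", "C"}}, 0.2),         # A -> B, A -> C
]
gains = {t: expected_information_gain(candidates, t) for t in "ABC"}
best = select_intervention(candidates, "ABC")
print(best, gains)
```

In this toy prior, intervening on C has zero expected gain because every candidate structure predicts the same outcome, which is the sense in which a guided choice avoids redundant interventions. Intervening on B separates all three candidates and is therefore selected.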