Chen Weiqi, Liu Jing, He Shan
School of Computer Science, University of Birmingham, Edgbaston, Birmingham, B15 2TT, UK.
Key Laboratory of Intelligent Perception and Image Understanding of Ministry of Education, Xidian University, Xi'an, Shaanxi, 710071, People's Republic of China.
BMC Syst Biol. 2017 Mar 14;11(Suppl 2):8. doi: 10.1186/s12918-017-0388-2.
An active module, defined as a region of a biological network that shows striking changes in molecular activity or phenotypic signatures, is important for revealing dynamic and process-specific information correlated with cellular or disease states.
We propose a prior-information-guided active module identification approach that detects modules that are both active and enriched for prior knowledge. We formulate active module identification as a multi-objective optimisation problem with two conflicting objectives: maximising the coverage of known biological pathways and maximising the activity of the module. The network is constructed from a protein-protein interaction database. A beta-uniform-mixture model is used to estimate the distribution of p-values and to generate activity scores from microarray data. A multi-objective evolutionary algorithm is used to search for Pareto-optimal solutions. We also incorporate a novel constraint based on algebraic connectivity to ensure that the identified active modules are connected.
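The connectedness constraint rests on a standard fact: an undirected graph is connected exactly when the second-smallest eigenvalue of its Laplacian (the algebraic connectivity) is positive. The sketch below only illustrates that check and is not the authors' implementation. The function names, the edge-list input format, and the choice of shifted power iteration are assumptions made here to keep the example dependency-free; for real networks a dense or sparse eigensolver would be the usual choice, since power iteration converges slowly on large graphs.

```python
import random


def algebraic_connectivity(n, edges, iters=2000, seed=0):
    """Estimate the second-smallest Laplacian eigenvalue of a simple graph.

    n     : number of nodes, labelled 0..n-1 (hypothetical input format)
    edges : iterable of undirected (u, v) pairs with no duplicates
    """
    if n < 2:
        return 0.0
    adj = [[] for _ in range(n)]
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    deg = [len(neighbours) for neighbours in adj]
    # Laplacian eigenvalues are at most 2 * max degree, so c*I - L is
    # positive definite and its top eigenvector orthogonal to the all-ones
    # vector corresponds to the algebraic connectivity.
    c = 2.0 * max(deg) + 1.0

    def laplacian_times(x):
        return [deg[i] * x[i] - sum(x[j] for j in adj[i]) for i in range(n)]

    def project_and_normalise(x):
        # Remove the all-ones component (eigenvalue 0), then scale to unit length.
        mean = sum(x) / n
        x = [xi - mean for xi in x]
        norm = sum(xi * xi for xi in x) ** 0.5
        return [xi / norm for xi in x] if norm > 0 else x

    rng = random.Random(seed)
    x = project_and_normalise([rng.random() for _ in range(n)])
    for _ in range(iters):
        lx = laplacian_times(x)
        x = project_and_normalise([c * xi - lxi for xi, lxi in zip(x, lx)])
    lx = laplacian_times(x)
    # Rayleigh quotient of the Laplacian at the converged unit vector.
    return sum(xi * lxi for xi, lxi in zip(x, lx))


def is_connected_module(n, edges, tol=1e-8):
    """Treat a candidate module as connected if its algebraic connectivity exceeds tol."""
    return algebraic_connectivity(n, edges) > tol
```

For example, a three-node path `[(0, 1), (1, 2)]` passes the check, while two separate edges `[(0, 1), (2, 3)]` on four nodes do not. Inside an evolutionary search, a check of this kind could be used to reject or penalise disconnected candidates.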
Applying the proposed algorithm to a small yeast molecular network shows that it can identify modules with high activity and with more cross-talk nodes between related functional groups. The Pareto solutions generated by the algorithm offer different trade-offs between prior knowledge and novel information from the data. The approach is then applied to microarray data from diclofenac-treated yeast cells to build a network and identify modules that elucidate the molecular mechanisms of diclofenac toxicity and resistance. Gene ontology analysis is applied to the identified modules for biological interpretation.
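The trade-off between the two objectives can be made concrete with a small non-dominated filter. This is a minimal sketch, assuming each candidate module is summarised by a hypothetical `(pathway_coverage, activity)` pair with both values to be maximised. The numbers are invented for illustration and are not results from the paper, and the filter is a generic Pareto check rather than the specific evolutionary algorithm used by the authors.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly
    better on at least one (all objectives are maximised)."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(candidates):
    """Return the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# Hypothetical (pathway_coverage, activity) scores for five candidate modules.
modules = [(0.9, 1.0), (0.5, 3.0), (0.4, 2.5), (0.2, 4.0), (0.9, 0.5)]
front = pareto_front(modules)
print(front)  # [(0.9, 1.0), (0.5, 3.0), (0.2, 4.0)]
```

Each surviving pair is a different compromise: the first leans on prior pathway knowledge, the last is driven mostly by the expression data, and the middle one balances the two.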
Integrating knowledge of functional groups into active module identification is effective and provides flexible control over the balance between a purely data-driven method and guidance by prior information.