Abbas Mustafa N, Broneske David, Saake Gunter
Databases and Software Engineering, Otto-von-Guericke-University, Magdeburg, Germany.
German Centre for Higher Education Research and Science Studies, Hannover, Germany.
Sci Rep. 2025 May 15;15(1):16855. doi: 10.1038/s41598-025-01667-y.
Detecting protein complexes is crucial in computational biology for understanding cellular mechanisms and facilitating drug discovery. Evolutionary algorithms (EAs) have proven effective at uncovering protein complexes in protein-protein interaction (PPI) networks. However, their integration with functional insights from gene ontology (GO) annotations remains underexplored. This paper makes two primary contributions. First, it proposes a novel multi-objective optimization model for detecting protein complexes, framing the task as a problem with inherently conflicting objectives based on biological data. Second, it introduces a gene ontology-based mutation operator, termed the Functional Similarity-Based Protein Translocation Operator ([Formula: see text]). This operator enhances collaboration between the canonical model and the GO-informed mutation strategy, thereby improving the algorithm's performance. To the best of our knowledge, this is the first effort to incorporate the biological characteristics of PPIs into both the problem formulation and the design of intricate perturbation strategies. We assess the effectiveness of the proposed multi-objective evolutionary algorithm through experiments on two widely recognized PPI networks and two standard complex datasets provided by the Munich Information Center for Protein Sequences (MIPS). To further assess the robustness of our algorithm, we create artificial networks by introducing different noise levels into the original Saccharomyces cerevisiae (yeast) PPI networks. This allows us to evaluate how perturbations in protein interactions affect the algorithm's performance compared to other approaches. The experimental results show that our algorithm outperforms several state-of-the-art methods in accurately identifying protein complexes. Moreover, the findings emphasize the substantial advantages of incorporating our heuristic perturbation operator, which significantly improves the quality of the detected complexes over other evolutionary algorithm-based methods.
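The robustness experiment mentioned above relies on artificially perturbed PPI networks. The abstract does not say how the noise is injected, so the following is only a minimal sketch under one common assumption: a noise level p means that a fraction p of the true interactions is deleted and the same number of spurious interactions is added. The function name `add_noise` and its parameters are hypothetical and are not taken from the paper.

```python
import random


def add_noise(edges, proteins, noise_level, seed=0):
    """Return a perturbed copy of an undirected PPI edge set.

    Assumption (not from the paper): noise deletes a fraction of the true
    interactions and adds the same number of random false interactions.
    """
    rng = random.Random(seed)
    original = {frozenset(e) for e in edges}
    nodes = sorted(set(proteins))
    k = round(noise_level * len(original))

    # Guard against asking for more spurious edges than the graph can hold.
    max_new = len(nodes) * (len(nodes) - 1) // 2 - len(original)
    if k > max_new:
        raise ValueError("not enough non-interacting pairs to add as noise")

    # Delete k randomly chosen true interactions (sorted for reproducibility).
    removed = set(rng.sample(sorted(original, key=sorted), k))
    noisy = original - removed

    # Add k random interactions that were absent from the original network.
    added = 0
    while added < k:
        u, v = rng.sample(nodes, 2)  # two distinct proteins, so no self-loops
        pair = frozenset((u, v))
        if pair not in original and pair not in noisy:
            noisy.add(pair)
            added += 1
    return noisy


# Toy example with hypothetical protein identifiers.
toy_edges = [("A", "B"), ("B", "C"), ("C", "D"), ("D", "E")]
toy_noisy = add_noise(toy_edges, "ABCDE", noise_level=0.5)
```

Running a complex-detection method on several such networks at increasing noise levels, and comparing the detected complexes against the reference set, gives a rough picture of how sensitive the method is to missing and false interactions.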