Liu Guangming, Chai Bianfang, Yang Kuo, Yu Jian, Zhou Xuezhong
Beijing Key Lab of Traffic Data Analysis and Mining, Beijing Jiaotong University, No. 3 Shangyuancun Haidian District, Beijing, People's Republic of China.
Department of Information Engineering, Hebei GEO University, Shijiazhuang, People's Republic of China.
IET Syst Biol. 2018 Apr;12(2):45-54. doi: 10.1049/iet-syb.2017.0084.
High-throughput experimental techniques have generated a large amount of protein-protein interaction (PPI) data. Uncovering functional modules from PPI networks helps us better understand the underlying mechanisms of cellular functions. Numerous computational algorithms have been designed over the past decades to identify functional modules automatically. However, most community detection methods (non-overlapping or overlapping) are unsupervised models, which cannot incorporate well-known protein complexes as prior knowledge. The authors propose a novel semi-supervised model named pairwise constrained non-negative matrix tri-factorisation (PCNMTF), which takes full advantage of well-known protein complexes to find overlapping functional modules in PPI networks by learning a protein-module indicator matrix and a module correlation matrix simultaneously. Based on non-negative matrix tri-factorisation, PCNMTF deterministically models and learns the mixed module memberships of each protein while taking the correlation among modules into account. Experimental results on both synthetic and real-world biological networks demonstrate that PCNMTF identifies more precise functional modules than state-of-the-art methods.
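The abstract does not state the PCNMTF objective or its update rules, so the following is only a minimal sketch of the general idea, under stated assumptions. It assumes a symmetric tri-factorisation A ≈ H S Hᵀ of the PPI adjacency matrix, where H plays the role of the protein-module indicator matrix and S the module correlation matrix, and it encodes known complexes as must-link pairs through a graph-Laplacian penalty λ·tr(Hᵀ(D − M)H). The penalty form, the square-root-damped multiplicative updates, the function names `pcnmtf_sketch` and `overlapping_modules`, and the parameters `lam` and `threshold` are all illustrative choices, not taken from the paper.

```python
import numpy as np

def pcnmtf_sketch(A, M, k, lam=1.0, n_iter=200, eps=1e-9, seed=0):
    """Heuristic multiplicative updates for
    ||A - H S H^T||_F^2 + lam * tr(H^T (D - M) H), with H, S >= 0.

    A: symmetric non-negative adjacency matrix (n x n)
    M: symmetric non-negative must-link matrix (n x n), assumed to be
       derived from known protein complexes
    k: number of modules
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    H = rng.random((n, k)) + eps          # protein-module memberships
    S = rng.random((k, k))
    S = (S + S.T) / 2 + eps               # symmetric module correlations
    D = np.diag(M.sum(axis=1))            # degree matrix of must-link graph
    for _ in range(n_iter):
        HS = H @ S
        # Positive and negative parts of the gradient with respect to H
        num_H = 2 * (A @ HS) + lam * (M @ H)
        den_H = 2 * (HS @ (H.T @ HS)) + lam * (D @ H) + eps
        H *= np.sqrt(num_H / den_H)
        HtH = H.T @ H
        num_S = H.T @ A @ H
        den_S = HtH @ S @ HtH + eps
        S *= np.sqrt(num_S / den_S)
        S = (S + S.T) / 2                 # remove floating-point asymmetry
    return H, S

def overlapping_modules(H, threshold=0.3):
    """Assign a protein to every module whose normalised weight is at
    least `threshold`, so one protein may belong to several modules."""
    P = H / (H.sum(axis=1, keepdims=True) + 1e-12)
    return [{int(i) for i in np.flatnonzero(P[:, c] >= threshold)}
            for c in range(H.shape[1])]

# Toy network: two triangles {0, 1, 2} and {2, 3, 4} sharing protein 2.
A = np.zeros((5, 5))
for i, j in [(0, 1), (0, 2), (1, 2), (2, 3), (2, 4), (3, 4)]:
    A[i, j] = A[j, i] = 1.0

# Hypothetical prior knowledge: (0, 1) and (3, 4) are known to co-occur.
M = np.zeros((5, 5))
for i, j in [(0, 1), (3, 4)]:
    M[i, j] = M[j, i] = 1.0

H, S = pcnmtf_sketch(A, M, k=2)
modules = overlapping_modules(H)
```

Thresholding the row-normalised H is one simple way to read overlapping modules out of the mixed memberships; the threshold value is arbitrary here and would need tuning on real data.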