Yu Guoxian, Zhu Hailong, Domeniconi Carlotta, Guo Maozu
BMC Syst Biol. 2015;9 Suppl 1(Suppl 1):S3. doi: 10.1186/1752-0509-9-S1-S3. Epub 2015 Jan 21.
High-throughput techniques produce multiple functional association networks, and integrating these networks can improve the accuracy of protein function prediction. Many algorithms generate a composite network as a weighted sum of the individual networks, where the weight assigned to each network reflects how much it benefits the inference of protein functional annotations. A classifier is then trained on the composite network to predict protein functions. However, because these techniques treat the optimization of the composite network and the prediction task as separate objectives, the resulting composite network is not necessarily optimal for the subsequent protein function prediction.
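To make the weighted-sum construction concrete, the following minimal sketch combines two toy networks into a composite one. The function name `composite_network`, the 3-protein adjacency matrices, and the weights 0.25 and 0.75 are illustrative assumptions, not values or code from the paper.

```python
def composite_network(networks, weights):
    """Return the weighted sum of equally sized square adjacency matrices.

    `networks` is a list of n x n matrices (lists of lists) and `weights`
    holds one non-negative weight per network. Hypothetical helper, used
    here only to illustrate the weighted-sum idea.
    """
    if len(networks) != len(weights):
        raise ValueError("need exactly one weight per network")
    n = len(networks[0])
    composite = [[0.0] * n for _ in range(n)]
    for matrix, weight in zip(networks, weights):
        for i in range(n):
            for j in range(n):
                composite[i][j] += weight * matrix[i][j]
    return composite


# Toy data (assumed): two association networks over the same 3 proteins.
coexpression = [[0, 1, 0],
                [1, 0, 0],
                [0, 0, 0]]
interaction = [[0, 0, 1],
               [0, 0, 1],
               [1, 1, 0]]

# Assumed weights; in practice they would be learned, not fixed by hand.
combined = composite_network([coexpression, interaction], [0.25, 0.75])
```

Here the edge between proteins 0 and 1 comes only from the first network and is scaled by its weight, while edges involving protein 2 come only from the second.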
We address this issue by modeling the optimization of the composite network and the prediction problem within a unified objective function. In particular, we use a kernel target alignment technique and the loss function of a network-based classifier to jointly adjust the weights assigned to the individual networks. Using multiple networks from four example species (yeast, human, mouse, and fly) annotated with thousands (or hundreds) of GO terms, we show that the proposed method, called MNet, outperforms related techniques with respect to different evaluation criteria.
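As a rough illustration of kernel target alignment, the sketch below computes the standard alignment score, the Frobenius inner product of a kernel and a target matrix divided by the product of their Frobenius norms. It assumes a single function with labels in {+1, -1} and a target matrix built as the outer product of the label vector; the exact formulation inside MNet's unified objective may differ, and all names and toy matrices here are hypothetical.

```python
import math


def frobenius_inner(a, b):
    """Sum of element-wise products of two equally sized matrices."""
    return sum(x * y for row_a, row_b in zip(a, b) for x, y in zip(row_a, row_b))


def target_kernel(labels):
    """Outer product of a +1/-1 label vector: 1 where labels agree, -1 otherwise."""
    return [[yi * yj for yj in labels] for yi in labels]


def alignment(kernel, target):
    """Standard kernel target alignment; returns 0.0 if either matrix is all zeros."""
    denom = math.sqrt(frobenius_inner(kernel, kernel) * frobenius_inner(target, target))
    return frobenius_inner(kernel, target) / denom if denom > 0 else 0.0


# Toy data (assumed): proteins 0 and 1 share a function, protein 2 does not.
labels = [1, 1, -1]
target = target_kernel(labels)

# network_a links the two proteins that share the function;
# network_b links proteins with different labels.
network_a = [[1, 1, 0],
             [1, 1, 0],
             [0, 0, 1]]
network_b = [[1, 0, 1],
             [0, 1, 0],
             [1, 0, 1]]

score_a = alignment(network_a, target)
score_b = alignment(network_b, target)
```

A network whose edges agree with the known annotations scores higher, which is the intuition behind giving it a larger weight in the composite network.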
MNet can effectively integrate multiple networks for protein function prediction and is robust to the choice of input parameters. Supplementary data are available at https://sites.google.com/site/guoxian85/home/mnet. The Matlab code of MNet is available upon request.