Yuan Mingzhi, Shen Ao, Ma Yingfan, Du Jie, An Bohan, Wang Manning
Digital Medical Research Center, School of Basic Medical Sciences, Fudan University, 131 Dong'an Road, 200032 Shanghai, China.
Shanghai Key Laboratory of Medical Image Computing and Computer Assisted Intervention, Fudan University, 131 Dong'an Road, 200032 Shanghai, China.
Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae695.
Proteins can be represented in several data forms, including sequence, structure, and surface, each with its own advantages and limitations, so fusing their complementary information is promising. In this work, we propose ProteinF3S, a framework for enzyme function prediction that fuses complementary information across protein sequence, structure, and surface. For more effective fusion, we propose a multi-scale bidirectional fusion strategy between protein structure and surface, in which the hierarchical features of a surface encoder and a structure encoder interact bidirectionally, yielding more distinctive features. We then fuse further by concatenating the sequence features with the features that carry structure and surface information, to further improve performance. To validate our method, we conduct extensive experiments on tasks including enzyme reaction classification and enzyme commission number prediction. Our method achieves new state-of-the-art performance and shows that fusing different forms of data is effective for enzyme function prediction.
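The fusion pipeline described in the abstract can be illustrated with a toy sketch. This is not the ProteinF3S implementation: the function names (`mean_pool`, `add_context`, `downsample`, `fuse`), the additive exchange rule, the pairwise-average downsampling, and the feature sizes are all assumptions chosen to keep the example small. It only shows the overall shape of the idea: at each scale the structure and surface branches each receive information from the other, and the sequence features are concatenated at the end.

```python
# Toy sketch of multi-scale bidirectional fusion followed by concatenation.
# Every name and update rule below is an illustrative assumption, not the
# method used in ProteinF3S, whose encoders and exchange are learned.

def mean_pool(points):
    """Average a list of equal-length feature vectors into one vector."""
    n = len(points)
    return [sum(column) / n for column in zip(*points)]

def add_context(points, context):
    """Add a context vector to every feature vector (stand-in for a learned exchange)."""
    return [[x + c for x, c in zip(point, context)] for point in points]

def downsample(points):
    """Average neighbouring pairs to get a coarser scale (stand-in for an encoder stage)."""
    return [mean_pool(points[i:i + 2]) for i in range(0, len(points), 2)]

def fuse(structure, surface, sequence_embedding, num_scales=2):
    """Exchange information between branches at each scale, then concatenate."""
    for _ in range(num_scales):
        # Summaries are taken before either branch is updated, so the
        # exchange is symmetric (structure -> surface and surface -> structure).
        structure_summary = mean_pool(structure)
        surface_summary = mean_pool(surface)
        structure = add_context(structure, surface_summary)
        surface = add_context(surface, structure_summary)
        # Move both branches to the next, coarser scale.
        structure = downsample(structure)
        surface = downsample(surface)
    # Late fusion: sequence features joined with the pooled fused features.
    return sequence_embedding + mean_pool(structure) + mean_pool(surface)

# Hypothetical inputs: 2 residues and 4 surface points, each with 2 features,
# plus a 1-dimensional sequence embedding.
demo_structure = [[1.0, 0.0], [3.0, 0.0]]
demo_surface = [[0.0, 2.0], [0.0, 2.0], [0.0, 2.0], [0.0, 2.0]]
demo_sequence = [0.5]
fused = fuse(demo_structure, demo_surface, demo_sequence)
print(fused)
```

In a real model the additive exchange would be replaced by learned layers and the fused vector would feed a classifier over enzyme classes; those parts are omitted here because the abstract does not specify them.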