Lu Xiaohua, Xie Liangxu, Xu Lei, Mao Rongzhi, Xu Xiaojun, Chang Shan
Institute of Bioinformatics and Medical Engineering, Jiangsu University of Technology, Changzhou 213001, China.
Comput Struct Biotechnol J. 2024 Apr 12;23:1666-1679. doi: 10.1016/j.csbj.2024.04.030. eCollection 2024 Dec.
Accurately predicting molecular properties is a challenging but essential task in drug discovery. Recently, many mono-modal deep learning methods have been successfully applied to molecular property prediction. However, mono-modal learning is inherently limited because it relies on a single molecular representation, which restricts a comprehensive understanding of drug molecules. To overcome this limitation, we propose a multimodal fused deep learning (MMFDL) model that leverages information from different molecular representations. Specifically, we construct a triple-modal learning model that employs a Transformer-Encoder, a Bidirectional Gated Recurrent Unit (BiGRU), and a graph convolutional network (GCN) to process three modalities of information from chemical language and molecular graphs: SMILES-encoded vectors, ECFP fingerprints, and molecular graphs, respectively. We evaluate the proposed triple-modal model using five fusion approaches on six molecular datasets: Delaney, Llinas2020, Lipophilicity, SAMPL, BACE, and pKa from DataWarrior. The results show that the MMFDL model achieves the highest Pearson coefficients and a stable distribution of Pearson coefficients in the random splitting test, outperforming mono-modal models in accuracy and reliability. Furthermore, we validate the generalization ability of our model in predicting binding constants for protein-ligand complexes and assess its resilience to noise. Through analysis of feature distributions in chemical space and the contribution assigned to each modality's model, we demonstrate that the MMFDL model can acquire complementary information when appropriate models and suitable fusion approaches are used. By leveraging diverse sources of bioinformatics information, multimodal deep learning models hold promise for drug discovery.
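The evaluation idea in the abstract (combine the outputs of several mono-modal models, then score the result with the Pearson coefficient) can be illustrated with a minimal Python sketch. The abstract does not specify the five fusion approaches, so the weighted-average late fusion below is an assumed, generic scheme and not the authors' method. The function names `pearson` and `fuse_predictions`, the weights, and all numbers are hypothetical toy values chosen only for illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("Pearson coefficient is undefined for constant input")
    return cov / sqrt(var_x * var_y)


def fuse_predictions(per_modality, weights):
    """Late fusion (assumed scheme): weighted average of per-modality predictions.

    per_modality: one prediction list per modality, all of the same length.
    weights: one non-negative weight per modality; normalized to sum to 1.
    """
    if len(per_modality) != len(weights):
        raise ValueError("need exactly one weight per modality")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    norm = [w / total for w in weights]
    n = len(per_modality[0])
    if any(len(preds) != n for preds in per_modality):
        raise ValueError("all modalities must predict the same molecules")
    return [
        sum(w * preds[i] for w, preds in zip(norm, per_modality))
        for i in range(n)
    ]


if __name__ == "__main__":
    # Hypothetical toy values, not data from the paper.
    y_true = [-2.1, -0.5, -3.3, -1.0]
    smiles_pred = [-1.8, -0.9, -3.0, -1.4]       # stand-in for the SMILES branch
    fingerprint_pred = [-2.4, -0.2, -2.9, -0.8]  # stand-in for the ECFP branch
    graph_pred = [-2.0, -0.6, -3.6, -1.1]        # stand-in for the graph branch

    fused = fuse_predictions(
        [smiles_pred, fingerprint_pred, graph_pred],
        weights=[1.0, 1.0, 2.0],
    )
    print("fused predictions:", fused)
    print("Pearson r (fused vs. true):", pearson(fused, y_true))
```

In a learned fusion scheme the weights would be fitted on validation data rather than fixed by hand; inspecting them is one simple way to read off how much each modality contributes to the final prediction.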