Li Chuan, Liu Jiangting, Chen Jianfang, Yuan Yuan, Yu Jin, Gou Qiaolin, Guo Yanzhi, Pu Xuemei
College of Computer Science, Sichuan University, Chengdu 610064, China.
College of Chemistry, Sichuan University, Chengdu 610064, China.
J Chem Inf Model. 2022 Mar 28;62(6):1399-1410. doi: 10.1021/acs.jcim.2c00085. Epub 2022 Mar 8.
Molecular dynamics (MD) simulations have contributed greatly to revealing the structural and functional mechanisms of many biomolecular systems. However, identifying functional states and important residues in the vast conformational space generated by MD remains challenging, so intelligent navigation of that space is highly desirable. Although deep learning has shown clear advantages in analyzing MD trajectories, its black-box nature limits its application. To address this problem, we explore an interpretable convolutional neural network (CNN)-based deep learning framework, named ICNNMD, that automatically identifies diverse active states from MD trajectories of G-protein-coupled receptors (GPCRs). To avoid information loss when representing conformational structures, we introduce a pixel representation; a CNN module then extracts features efficiently, and a fully connected neural network performs the classification. More importantly, we design a local interpretable model-agnostic explanation (LIME) interpreter that explains each classification result by locally approximating the model with a linear model, so that the important residues underlying distinct active states can be identified quickly. Our model achieves a classification accuracy above 99% on three important GPCR systems with diverse active states. Notably, it successfully identifies several residues that are important in regulating different biased activities, which helps to elucidate the diverse activation mechanisms of GPCRs. The model can also serve as a general tool for analyzing MD trajectories of other biomolecular systems. All source code is freely available at https://github.com/Jane-Liu97/ICNNMD to aid MD studies.
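The interpretation step described in the abstract (local approximation of the classifier with a linear model) can be illustrated with a minimal sketch. This is not the ICNNMD code; see the repository for the authors' implementation. Everything named here is an assumption made for illustration: the function `lime_residue_importance`, the stand-in `toy_classifier` (used in place of the trained CNN), the choice to perturb a frame by zeroing out whole residues, and the exponential proximity kernel with its width.

```python
import numpy as np


def lime_residue_importance(predict_proba, frame, n_samples=2000,
                            kernel_width=0.25, seed=0):
    """LIME-style residue importances for one MD frame (illustrative sketch).

    predict_proba: callable mapping a batch of shape (n, n_residues, 3)
                   to the probability of the class of interest, shape (n,).
    frame:         array of shape (n_residues, 3) holding one conformation.
    Returns one linear coefficient per residue.
    """
    rng = np.random.default_rng(seed)
    n_res = frame.shape[0]

    # Binary masks: 1 keeps a residue, 0 replaces it with a zero baseline
    # (the baseline is an assumption of this sketch).
    masks = rng.integers(0, 2, size=(n_samples, n_res))
    masks[0] = 1  # always include the unperturbed frame

    # Build the perturbed frames and query the black-box classifier.
    perturbed = masks[:, :, None] * frame[None, :, :]
    y = predict_proba(perturbed)

    # Proximity weights: samples with fewer masked residues count more.
    dist = 1.0 - masks.mean(axis=1)
    weights = np.exp(-(dist ** 2) / kernel_width ** 2)

    # Weighted least squares fit of y ~ intercept + masks.
    X = np.hstack([np.ones((n_samples, 1)), masks])
    sw = np.sqrt(weights)
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef[1:]  # drop the intercept


def toy_classifier(batch):
    """Hypothetical stand-in for a trained CNN: depends only on residue 3."""
    return 1.0 / (1.0 + np.exp(-(4.0 * batch[:, 3, 0] - 2.0)))


frame = np.ones((8, 3))  # 8 residues, dummy coordinates
importance = lime_residue_importance(toy_classifier, frame)
top_residue = int(np.argmax(np.abs(importance)))
print("most influential residue index:", top_residue)
```

Because the surrogate is linear in the residue masks, each coefficient can be read as the approximate change in predicted class probability when that residue is kept rather than masked, which is how a ranking of residues can be read off a black-box classifier.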