Zhou Xiaogen, Li Yang, Zhang Chengxin, Zheng Wei, Zhang Guijun, Zhang Yang
Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI 48109, USA.
College of Information Engineering, Zhejiang University of Technology, Hangzhou 310023, China.
bioRxiv. 2020 Oct 16:2020.10.15.340455. doi: 10.1101/2020.10.15.340455.
Progress in cryo-electron microscopy (cryo-EM) has made structure determination of large proteins feasible. However, the solution rate for multi-domain proteins remains low because inter-domain orientations are difficult to model. We developed DEMO-EM, an automatic method that assembles multi-domain structures from cryo-EM maps through a progressive structural refinement procedure combining rigid-body domain fitting and flexible assembly simulations with inter-domain distance profiles predicted by deep neural networks. The method was tested on a large-scale benchmark set of proteins containing up to twelve continuous and discontinuous domains with medium-to-low-resolution density maps, where DEMO-EM produced models with correct inter-domain orientations (TM-score > 0.5) for 98% of cases and significantly outperformed state-of-the-art methods. DEMO-EM was applied to the SARS-CoV-2 genome and generated models with an average TM-score/RMSD of 0.97/1.4 Å to the deposited structures. These results demonstrate an efficient pipeline that enables automated and reliable large-scale multi-domain protein structure modeling with atomic-level accuracy from cryo-EM maps.
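The two accuracy measures quoted in the abstract (TM-score and RMSD) can be illustrated with a minimal sketch. It uses the standard TM-score definition, TM = (1/L) Σ 1/(1 + (d_i/d0)^2) with d0 = 1.24 (L − 15)^(1/3) − 1.8 Å, and is not taken from the DEMO-EM code. The function names, the 0.5 Å floor on d0 for short chains, and the assumption that both structures are given as equal-length lists of already superposed Cα coordinates are all illustrative. The real TM-score additionally maximizes over superpositions, which this sketch does not do.

```python
import math

def d0(length, floor=0.5):
    """Length-dependent distance scale of the TM-score, in angstroms.
    The floor for short chains is an assumed convention, not from the abstract."""
    if length <= 15:
        return floor
    return max(floor, 1.24 * (length - 15) ** (1.0 / 3.0) - 1.8)

def pair_distances(model, native):
    """Distances between corresponding atoms of two pre-superposed structures."""
    if len(model) != len(native):
        raise ValueError("structures must have the same number of atoms")
    return [math.dist(a, b) for a, b in zip(model, native)]

def tm_score_fixed(model, native):
    """TM-score term sum for the given superposition (no search over rotations)."""
    scale = d0(len(native))
    dists = pair_distances(model, native)
    return sum(1.0 / (1.0 + (d / scale) ** 2) for d in dists) / len(native)

def rmsd(model, native):
    """Root-mean-square deviation for the given superposition."""
    dists = pair_distances(model, native)
    return math.sqrt(sum(d * d for d in dists) / len(dists))

# Hypothetical toy data: a straight 100-residue chain and a copy shifted by d0.
native = [(3.8 * i, 0.0, 0.0) for i in range(100)]
shifted = [(x, y + d0(100), z) for x, y, z in native]
print(tm_score_fixed(native, native), rmsd(native, native))
print(tm_score_fixed(shifted, native), rmsd(shifted, native))
```

In the toy case every residue is displaced by exactly d0, so each term contributes one half and the score sits at the 0.5 threshold that the abstract uses to separate correct from incorrect inter-domain orientations.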