Farrell Daniel P, Anishchenko Ivan, Shakeel Shabih, Lauko Anna, Passmore Lori A, Baker David, DiMaio Frank
Department of Biochemistry, University of Washington, Seattle, WA 98105, USA.
Institute for Protein Design, University of Washington, Seattle, WA 98105, USA.
IUCrJ. 2020 Aug 20;7(Pt 5):881-892. doi: 10.1107/S2052252520009306. eCollection 2020 Sep 1.
Cryo-electron microscopy of protein complexes often yields moderate-resolution maps (4-8 Å), with visible secondary-structure elements but poorly resolved loops, making model building challenging. In the absence of high-resolution structures of homologues, only coarse-grained structural features are typically inferred from these maps, and it is often impossible to assign specific regions of density to individual protein subunits. This paper describes a new method for overcoming these difficulties that integrates predicted residue distance distributions from a deep-learned convolutional neural network, computational protein folding, and automated EM-map-guided complex assembly. We apply this method to a 4.6 Å resolution cryoEM map of the Fanconi Anemia core complex (FAcc), an E3 ubiquitin ligase required for DNA interstrand crosslink repair, which was previously challenging to interpret as it comprises 6557 residues, only 1897 of which are covered by homology models. In the published model built from this map, only 387 residues could be assigned to specific subunits with confidence. By building and placing into density 42 deep-learning-guided models containing 4795 residues not included in the previously published structure, we are able to determine an almost-complete atomic model of FAcc, in which 5182 of the 6557 residues were placed. The resulting model is consistent with previously published biochemical data, and facilitates interpretation of disease-related mutational data. We anticipate that our approach will be broadly useful for cryoEM structure determination of large complexes containing many subunits for which there are no homologues of known structure.
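The residue counts quoted in the abstract can be turned into coverage fractions with a few lines of arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical, and it assumes (because the quoted numbers add up exactly) that the final count of placed residues is the sum of the previously assigned residues and the newly placed ones.

```python
# Residue counts as quoted in the abstract.
TOTAL_RESIDUES = 6557       # residues in the FAcc complex
HOMOLOGY_COVERED = 1897     # residues covered by homology models
PREVIOUSLY_ASSIGNED = 387   # confidently assigned in the published model
NEWLY_PLACED = 4795         # residues in the 42 deep-learning-guided models


def coverage(placed: int, total: int = TOTAL_RESIDUES) -> float:
    """Return the fraction of the complex accounted for by `placed` residues."""
    return placed / total


# Assumption: the final model combines the earlier assignments with the
# newly placed residues; 387 + 4795 = 5182 matches the quoted total.
final_placed = PREVIOUSLY_ASSIGNED + NEWLY_PLACED

if __name__ == "__main__":
    rows = [
        ("Covered by homology models", HOMOLOGY_COVERED),
        ("Assigned in published model", PREVIOUSLY_ASSIGNED),
        ("Placed in the new model", final_placed),
    ]
    for label, count in rows:
        print(f"{label}: {count} of {TOTAL_RESIDUES} ({coverage(count):.1%})")
```

Read this way, the new model accounts for roughly four fifths of the complex, compared with well under a tenth that could be confidently assigned before.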