Protein complex structure modeling by cross-modal alignment between cryo-EM maps and protein sequences.

Affiliations

School of Computer Science and Engineering, Sun Yat-sen University, Guangzhou, China.

Sun Yat-sen Memorial Hospital, Sun Yat-sen University, Guangzhou, China.

Publication information

Nat Commun. 2024 Oct 11;15(1):8808. doi: 10.1038/s41467-024-53116-5.

Abstract

Cryo-electron microscopy (cryo-EM) is widely used for protein structure determination. Current automatic methods for modeling protein complexes from cryo-EM maps mostly rely on prior chain separation. However, chain separation without sequence guidance often suffers from errors caused by cross-chain interactions or noise densities, and these errors accumulate and mislead subsequent steps. Here, we present EModelX, a fully automated cryo-EM protein complex structure modeling method that achieves sequence-guided modeling through cross-modal alignment between cryo-EM maps and protein sequences. EModelX first employs multi-task deep learning to predict Cα atoms, backbone atoms, and amino acid types from cryo-EM maps; these predictions are subsequently used to sample Cα traces with amino acid profiles. The profiles are then aligned with protein sequences to obtain initial structural models, which yielded an average RMSD of 1.17 Å on our test set, approaching atomic-level precision in recovering PDB-deposited structures. After unmodeled gaps were filled through sequence-guided Cα threading, the final models achieved an average TM-score of 0.808, outperforming the state-of-the-art method. Combining EModelX with AlphaFold further improves the average TM-score to 0.911. Analyses comparing selected EModelX-built models with PDB structures highlight its potential to improve PDB structures. EModelX is accessible at https://bio-web1.nscc-gz.cn/app/EModelX.
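
To make the idea of aligning amino acid profiles with a protein sequence more concrete, the following is a minimal illustrative sketch and not EModelX's actual algorithm. It assumes each traced Cα position carries a probability distribution over the 20 amino acid types, and it slides that profile along a sequence without gaps, scoring each offset by summed log-probability. The names `best_ungapped_alignment` and `peaked_profile`, the probability floor, and the toy sequence are all hypothetical choices made for this example.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def best_ungapped_alignment(profile, sequence):
    """Slide a trace's amino acid profile along a sequence.

    profile:  list of dicts, one per traced C-alpha position, mapping
              one-letter amino acid codes to predicted probabilities.
    sequence: chain sequence as a string of one-letter codes.
    Returns (offset, score) for the best-scoring ungapped placement.
    """
    n, m = len(profile), len(sequence)
    if n == 0 or n > m:
        raise ValueError("profile must be non-empty and no longer than the sequence")
    eps = 1e-6  # assumed floor so a zero probability does not give log(0)
    best_offset, best_score = 0, -math.inf
    for offset in range(m - n + 1):
        # Log-likelihood of the sequence window under the per-position profile.
        score = sum(
            math.log(max(profile[i].get(sequence[offset + i], 0.0), eps))
            for i in range(n)
        )
        if score > best_score:
            best_offset, best_score = offset, score
    return best_offset, best_score


def peaked_profile(residue, confidence=0.7):
    """Toy stand-in for a network prediction: most mass on one residue type."""
    rest = (1.0 - confidence) / (len(AMINO_ACIDS) - 1)
    return {aa: (confidence if aa == residue else rest) for aa in AMINO_ACIDS}


# Toy data (made up for illustration): a 5-residue trace whose predicted
# types match the sequence fragment "AKQRQ".
sequence = "MKTAYIAKQRQISFVK"
profile = [peaked_profile(aa) for aa in "AKQRQ"]
offset, score = best_ungapped_alignment(profile, sequence)
print(offset, round(score, 3))
```

A real pipeline would also need to handle gaps, multiple chains, and competing traces; this sketch only shows how a per-position amino acid profile can be scored against candidate sequence positions.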

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/f1fc/11470027/d10ea637d954/41467_2024_53116_Fig1_HTML.jpg
