Department of Electrical Engineering and Computer Science, University of Missouri, Columbia, MO 65211, United States.
NextGen Precision Health, University of Missouri, Columbia, MO 65211, United States.
Bioinformatics. 2024 Mar 4;40(3). doi: 10.1093/bioinformatics/btae109.
Cryo-electron microscopy (cryo-EM) is a powerful technique for determining the structures of large protein complexes. Picking single protein particles from cryo-EM micrographs (images) is a crucial step in reconstructing protein structures. However, the widely used template-based particle picking process requires some manual picking and is labor-intensive and time-consuming. Although machine learning and artificial intelligence (AI) can potentially automate particle picking, current AI methods pick particles with low precision or low recall. Erroneously picked particles can severely reduce the quality of the reconstructed protein structures, especially for micrographs with a low signal-to-noise ratio.
To address these shortcomings, we devised CryoTransformer, which combines transformers, residual networks, and image processing techniques to accurately pick protein particles from cryo-EM micrographs. CryoTransformer was trained and tested on CryoPPP, the largest labeled cryo-EM protein particle dataset. It outperforms the current state-of-the-art machine learning methods for particle picking in terms of both the resolution of the 3D density maps reconstructed from the picked particles and the F1-score, and it is poised to facilitate the automation of cryo-EM protein particle picking.
The source code and data for CryoTransformer are openly available at: https://github.com/jianlin-cheng/CryoTransformer.
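For readers unfamiliar with the precision, recall, and F1-score used to compare particle pickers, the following is a minimal sketch of how such scores can be computed from picked and labeled particle centers. It is not taken from the CryoTransformer code: the function names, the greedy nearest-center matching, the distance threshold `max_dist`, and the coordinates are all illustrative assumptions, and the paper's own evaluation may match particles differently.

```python
import math


def count_true_positives(picked, truth, max_dist):
    """Greedily match each picked center to the nearest unused labeled center.

    A pick counts as a true positive if an unused labeled center lies within
    max_dist pixels of it (assumed matching rule, for illustration only).
    """
    unused = list(truth)
    true_pos = 0
    for px, py in picked:
        best_i, best_d = None, max_dist
        for i, (tx, ty) in enumerate(unused):
            d = math.hypot(px - tx, py - ty)
            if d <= best_d:
                best_i, best_d = i, d
        if best_i is not None:
            unused.pop(best_i)  # each labeled particle can be matched once
            true_pos += 1
    return true_pos


def precision_recall_f1(picked, truth, max_dist):
    """Return (precision, recall, F1) for one micrograph."""
    tp = count_true_positives(picked, truth, max_dist)
    precision = tp / len(picked) if picked else 0.0
    recall = tp / len(truth) if truth else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    # Hypothetical particle centers in pixel coordinates.
    truth = [(10, 10), (50, 50), (90, 90), (130, 130)]
    picked = [(11, 9), (52, 49), (200, 200)]
    p, r, f1 = precision_recall_f1(picked, truth, max_dist=5.0)
    print(f"precision={p:.3f} recall={r:.3f} F1={f1:.3f}")
```

In this toy case two of the three picks fall near labeled particles and two of the four labeled particles are found, so a picker with many false picks lowers precision while one that misses particles lowers recall; the F1-score is the harmonic mean of the two.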