Yokoyama Yuichi, Terada Tohru, Shimizu Kentaro, Nishikawa Kouki, Kozai Daisuke, Shimada Atsuhiro, Mizoguchi Akira, Fujiyoshi Yoshinori, Tani Kazutoshi
Graduate School of Interdisciplinary Information Studies, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan.
Interfaculty Initiative in Information Studies, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo, 113-0033, Japan.
Biophys Rev. 2020 Apr;12(2):349-354. doi: 10.1007/s12551-020-00669-6. Epub 2020 Mar 11.
Recent advances in cryo-electron microscopy (cryo-EM) have enabled protein structure determination at atomic resolution. Cryo-EM specimens are prepared by rapidly freezing a protein solution on a metal grid coated with a holey carbon film, which forms an ice film over each hole. The thickness of this ice film is critical for high-resolution structure determination: ice that is too thick degrades the contrast of the protein image, while ice that is too thin excludes the protein from the hole or denatures it. Trained researchers therefore need to manually select "good" regions with appropriate ice thickness for imaging. To reduce the time spent on this task, we developed a deep learning program consisting of a "detector" and a "classifier" that identifies good regions in low-magnification EM images. In our method, the detector locates the holes in a low-magnification EM image, and the classifier labels the ice image on each hole as either good or bad. The detector found more than 95% of the holes regardless of sample type. The classifier was trained separately for each sample type because the appropriate ice thickness varies between sample types. The classifier accuracies were 93.8% for a soluble protein sample (β-galactosidase) and 95.3% for a membrane protein sample (bovine heart cytochrome c oxidase). In addition, we found that a training data set of approximately 2100 hole images from 300 low-magnification EM images was sufficient to reach an accuracy above 90%. We expect that our method will greatly improve the throughput of the cryo-EM data collection step.
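The two-stage procedure described in the abstract (detect the holes, then classify the ice on each hole) can be illustrated with a minimal Python sketch. This is not the authors' program: `toy_detector` and `toy_classifier` are hypothetical stand-ins for the trained deep learning models, the fixed hole grid and the intensity thresholds are assumptions made only for illustration, and mean pixel intensity is used here merely as a crude proxy for ice thickness.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Image = List[List[float]]        # grayscale image as rows of pixel values
Box = Tuple[int, int, int, int]  # (row, col, height, width) of one hole


@dataclass
class HoleResult:
    box: Box
    label: str  # "good" or "bad"


def crop(image: Image, box: Box) -> Image:
    """Cut the patch covering one detected hole out of the full image."""
    r, c, h, w = box
    return [row[c:c + w] for row in image[r:r + h]]


def select_good_holes(
    image: Image,
    detector: Callable[[Image], List[Box]],
    classifier: Callable[[Image], str],
) -> List[HoleResult]:
    """Stage 1: detect holes. Stage 2: classify the ice patch on each hole."""
    return [HoleResult(box, classifier(crop(image, box))) for box in detector(image)]


def accuracy(predicted: Sequence[str], expected: Sequence[str]) -> float:
    """Fraction of holes whose predicted label matches the reference label."""
    correct = sum(p == e for p, e in zip(predicted, expected))
    return correct / len(expected)


def toy_detector(image: Image) -> List[Box]:
    # Hypothetical stand-in: assumes holes lie on a fixed 2x2 grid of 2x2 patches.
    return [(0, 0, 2, 2), (0, 2, 2, 2), (2, 0, 2, 2), (2, 2, 2, 2)]


def toy_classifier(patch: Image, low: float = 0.3, high: float = 0.7) -> str:
    # Hypothetical stand-in: "good" if mean intensity falls in an assumed band.
    pixels = [p for row in patch for p in row]
    mean = sum(pixels) / len(pixels)
    return "good" if low <= mean <= high else "bad"


toy_image = [
    [0.5, 0.5, 0.9, 0.9],
    [0.5, 0.5, 0.9, 0.9],
    [0.1, 0.1, 0.4, 0.6],
    [0.1, 0.1, 0.6, 0.4],
]
results = select_good_holes(toy_image, toy_detector, toy_classifier)
labels = [r.label for r in results]
print(labels)
```

Because the detector and classifier are passed in as separate callables, a classifier trained for a different sample type could be swapped in without changing the detection step, which mirrors the abstract's point that the classifier is sample-specific while the detector is not.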