Chen Bo, Shen Bingxin, Frank Joachim
Department of Biology, Columbia University, New York, NY 10027, USA.
Howard Hughes Medical Institute, Department of Biochemistry and Molecular Biophysics, Columbia University, New York, NY 10032, USA.
J Struct Biol. 2014 Dec;188(3):267-73. doi: 10.1016/j.jsb.2014.10.006. Epub 2014 Oct 30.
Recently developed classification methods make it possible to resolve multiple biological structures from cryo-EM data collected on heterogeneous biological samples. However, it remains unclear how to base classification decisions on the statistics of the cryo-EM data so as to reduce subjectivity in the process. Here, we propose a quantitative analysis that determines the iteration of convergence and the number of distinguishable classes from the statistics of the single particles in an iterative classification scheme. We start the classification with more classes than anticipated from prior knowledge, and then combine the classes that yield similar reconstructions. Such classes can be identified from the particles that migrate between classes (jumpers) during consecutive iterations after the iteration of convergence. We therefore termed the method "jumper analysis" and applied it to the output of RELION 3D classification of a benchmark experimental dataset. This work is a step toward fully automated single-particle reconstruction and classification of cryo-EM data.
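The bookkeeping behind counting jumpers can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not state the exact statistics or merging criterion, so the function names, the input layout (one row of class labels per iteration, one column per particle), and the `threshold` parameter are all assumptions made for illustration. Reading class labels from RELION output is left out.

```python
import numpy as np


def jumper_counts(assignments):
    """Count particles that change class between consecutive iterations.

    assignments: array of shape (n_iterations, n_particles) holding the
    class label of each particle at each iteration (assumed layout).
    """
    a = np.asarray(assignments)
    return (a[1:] != a[:-1]).sum(axis=1)


def jump_matrix(assignments, n_classes, start=0):
    """Tally jumps from class i to class j over iterations >= start.

    `start` stands in for the iteration of convergence; how that
    iteration is chosen is not specified here.
    """
    a = np.asarray(assignments)
    m = np.zeros((n_classes, n_classes), dtype=int)
    for prev, curr in zip(a[start:-1], a[start + 1:]):
        moved = prev != curr
        # np.add.at accumulates correctly when index pairs repeat.
        np.add.at(m, (prev[moved], curr[moved]), 1)
    return m


def merge_candidates(m, threshold):
    """Return class pairs whose two-way jump count reaches `threshold`.

    The threshold is a hypothetical criterion, not one from the paper.
    """
    sym = m + m.T
    n = sym.shape[0]
    return [(i, j) for i in range(n) for j in range(i + 1, n)
            if sym[i, j] >= threshold]


# Toy data: 3 iterations, 6 particles, 3 classes (labels 0..2).
toy = [
    [0, 0, 1, 1, 2, 2],
    [0, 1, 1, 0, 2, 2],
    [1, 0, 0, 1, 2, 2],
]
counts = jumper_counts(toy)
matrix = jump_matrix(toy, n_classes=3)
pairs = merge_candidates(matrix, threshold=5)
```

In this toy example, particles keep moving between classes 0 and 1 while class 2 stays stable, so classes 0 and 1 are the only pair flagged as candidates for combining.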