Department of Plant and Microbial Biology, University of California, Berkeley, CA 94720, USA.
J Struct Biol. 2010 Apr;170(1):98-108. doi: 10.1016/j.jsb.2010.01.007. Epub 2010 Jan 18.
Biological macromolecules can adopt multiple conformational and compositional states due to structural flexibility and alternative subunit assemblies. This structural heterogeneity poses a major challenge in the study of macromolecular structure using single-particle electron microscopy. We propose a fully automated, unsupervised method for the three-dimensional reconstruction of multiple structural models from heterogeneous data. As a starting reference, our method employs an initial structure that does not account for any heterogeneity. Then, a multi-stage clustering is used to create multiple models representative of the heterogeneity within the sample. The multi-stage clustering combines an existing approach based on Multivariate Statistical Analysis to perform clustering within individual Euler angles, and a newly developed approach to sort out class averages from individual Euler angles into homogeneous groups. Structural models are computed from individual clusters. The whole data classification is further refined using an iterative multi-model projection-matching approach. We tested our method on one synthetic and three distinct experimental datasets. The tests include the cases where a macromolecular complex exhibits structural flexibility and cases where a molecule is found in ligand-bound and unbound states. We propose the use of our approach as an efficient way to reconstruct distinct multiple models from heterogeneous data.
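The assignment step of a multi-model projection-matching iteration can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the use of normalized cross-correlation as the similarity score, and the toy four-pixel "images" are all assumptions made for illustration. In-plane alignment and the recomputation of each 3D model from its assigned particles are omitted.

```python
import math


def normalized_cross_correlation(a, b):
    """Correlation coefficient between two equal-length flattened images."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    da = [x - mean_a for x in a]
    db = [y - mean_b for y in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # a constant image carries no signal to match
    return sum(x * y for x, y in zip(da, db)) / denom


def assign_particles(particles, model_projections):
    """Assign each particle to its best-matching (model, projection) pair.

    particles: list of flattened particle images.
    model_projections: one list of flattened reference projections per model.
    Returns a list of (model_index, projection_index, score) tuples.
    """
    assignments = []
    for image in particles:
        best = (-1, -1, -math.inf)
        for m, projections in enumerate(model_projections):
            for p, reference in enumerate(projections):
                score = normalized_cross_correlation(image, reference)
                if score > best[2]:
                    best = (m, p, score)
        assignments.append(best)
    return assignments


# Hypothetical toy data: two models, two reference projections each.
model_projections = [
    [[1.0, 0.0, 0.0, 1.0], [0.0, 1.0, 1.0, 0.0]],  # model 0
    [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]],  # model 1
]
# Noisy particles resembling model 0 / projection 0 and model 1 / projection 1.
particles = [
    [0.9, 0.1, 0.0, 1.1],
    [0.1, 0.0, 0.9, 1.0],
]
assignments = assign_particles(particles, model_projections)
```

In a full refinement loop, each model would be rebuilt from the particles assigned to it and the assignment repeated until the partition stabilizes.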