Department of Biochemistry and Molecular Biology, Institute for Biophysical Dynamics, University of Chicago, Illinois, USA.
Biophys J. 2011 Jan 19;100(2):440-9. doi: 10.1016/j.bpj.2010.10.053.
The initial output of a time-resolved macromolecular crystallography experiment is a time-dependent series of difference electron density maps that displays the time-dependent changes in underlying structure as a reaction progresses. The goal is to interpret such data in terms of a small number of crystallographically refinable, time-independent structures, each associated with a reaction intermediate; to establish the pathways and rate coefficients by which these intermediates interconvert; and thereby to elucidate a chemical kinetic mechanism. One strategy toward achieving this goal is to use cluster analysis, a statistical method that groups objects based on their similarity. If the difference electron density at a particular voxel in the time-dependent difference electron density (TDED) maps is sensitive to the presence of one and only one intermediate, then its temporal evolution will exactly parallel the concentration profile of that intermediate with time. The rationale is therefore to cluster voxels with respect to the shapes of their TDEDs, so that each group or cluster of voxels corresponds to one structural intermediate. Clusters of voxels whose TDEDs reflect the presence of two or more specific intermediates can also be identified. From such groupings one can then infer the number of intermediates, obtain their time-independent difference density characteristics, and refine the structure of each intermediate. We review the principles of cluster analysis and clustering algorithms in a crystallographic context, and describe the application of the method to simulated and experimental time-resolved crystallographic data for the photocycle of photoactive yellow protein.
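The clustering rationale can be illustrated with a minimal sketch on simulated data. Everything specific here is an assumption made for illustration and is not taken from the paper: the irreversible scheme I1 -> I2 -> I3, the rate coefficients, the voxel count, the noise level, the shape normalization, and the choice of a plain k-means algorithm (the abstract does not say which clustering algorithms were used). Each simulated voxel is sensitive to exactly one intermediate, so its TDED is a scaled, possibly sign-flipped copy of that intermediate's concentration profile plus noise.

```python
import numpy as np

def sequential_concentrations(t, k1, k2):
    """Concentrations for the assumed irreversible scheme I1 -> I2 -> I3 (k1 != k2)."""
    i1 = np.exp(-k1 * t)
    i2 = k1 / (k2 - k1) * (np.exp(-k1 * t) - np.exp(-k2 * t))
    i3 = 1.0 - i1 - i2
    return np.vstack([i1, i2, i3])

def simulate_tded(n_voxels=300, n_times=40, noise=0.02, seed=0):
    """Simulate voxel time courses, each sensitive to one and only one intermediate."""
    rng = np.random.default_rng(seed)
    t = np.logspace(-3, 3, n_times)  # log-spaced time points (arbitrary units)
    conc = sequential_concentrations(t, k1=1.0, k2=0.01)  # hypothetical rates
    truth = rng.integers(0, 3, size=n_voxels)  # intermediate seen by each voxel
    # Difference density can be positive or negative, with varying magnitude.
    amp = rng.uniform(0.5, 2.0, size=n_voxels) * rng.choice([-1.0, 1.0], size=n_voxels)
    tded = amp[:, None] * conc[truth] + rng.normal(0.0, noise, size=(n_voxels, n_times))
    return t, tded, truth

def normalize_shapes(tded):
    """Remove sign and amplitude so that only the shape of each time course remains."""
    peak = tded[np.arange(len(tded)), np.abs(tded).argmax(axis=1)]
    signed = tded * np.sign(peak)[:, None]  # make the dominant feature positive
    return signed / np.linalg.norm(signed, axis=1, keepdims=True)

def kmeans(x, k, n_iter=50):
    """Plain k-means (Lloyd iterations) with deterministic farthest-point seeding."""
    centers = [x[0]]
    for _ in range(k - 1):
        d = np.min([np.linalg.norm(x - c, axis=1) for c in centers], axis=0)
        centers.append(x[d.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        dist = np.linalg.norm(x[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([x[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers

t, tded, truth = simulate_tded()
labels, centers = kmeans(normalize_shapes(tded), k=3)
for j in range(3):
    print(f"cluster {j}: {np.sum(labels == j)} voxels")
```

In this sketch each row of `centers` approximates the normalized concentration profile of one intermediate, which is the sense in which a cluster of voxels "corresponds to" a structural intermediate. Real TDED maps also contain voxels that respond to two or more intermediates, and the number of clusters is not known in advance; neither complication is modeled here.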