Casadei Cecilia M, Hosseinizadeh Ahmad, Schertler Gebhard F X, Ourmazd Abbas, Santra Robin
University of Wisconsin Milwaukee, Milwaukee, Wisconsin 53201, USA.
Struct Dyn. 2022 Aug 16;9(4):044101. doi: 10.1063/4.0000156. eCollection 2022 Jul.
Time-resolved serial femtosecond crystallography (TR-SFX) provides access to protein dynamics on sub-picosecond timescales and with atomic resolution. Due to the nature of the experiment, these datasets are often highly incomplete, and the measured diffracted intensities are affected by partiality. To tackle these issues, one established procedure is to split the data into time bins and average the multiple measurements of equivalent reflections within each bin. This binning and averaging often involves a loss of information. Here, we propose an alternative approach, which we call low-pass spectral analysis (LPSA). In this method, the data are projected onto the subspace defined by a set of trigonometric functions, with frequencies up to a certain cutoff. This approach attenuates undesirable high-frequency features and facilitates retrieval of the underlying dynamics. A time-lagged embedding step can be included prior to subspace projection to make the results more stable with respect to the parameters involved. Subsequent modal decomposition produces a low-rank description of the system's evolution. Using a synthetic time-evolving model with incomplete and partial observations, we analyze the quality of the signal retrieved by LPSA as a function of the parameters involved. We compare the performance of LPSA to that of a range of other sophisticated data analysis techniques. We show that LPSA achieves excellent dynamics reconstruction at modest computational cost. Finally, we demonstrate the superiority of dynamics retrieval by LPSA compared to time binning and merging, which is, to date, the most commonly used method to extract dynamical information from TR-SFX data.
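The two core steps named in the abstract, projection onto a low-frequency trigonometric subspace followed by modal decomposition, can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a dense, evenly sampled data matrix (no missing reflections, no partiality), a full-period Fourier basis, and no time-lagged embedding. The function names `trig_basis`, `low_pass_project`, and `low_rank_modes` are hypothetical.

```python
import numpy as np


def trig_basis(n_times, n_freqs):
    """Orthonormal columns spanning a constant plus cos/sin pairs up to n_freqs.

    Assumption: uniform sampling over one full period of the lowest frequency.
    """
    t = np.arange(n_times) / n_times
    cols = [np.ones(n_times)]
    for k in range(1, n_freqs + 1):
        cols.append(np.cos(2 * np.pi * k * t))
        cols.append(np.sin(2 * np.pi * k * t))
    # QR orthonormalizes the columns without changing the subspace they span.
    q, _ = np.linalg.qr(np.column_stack(cols))
    return q


def low_pass_project(data, n_freqs):
    """Project each row of data (n_features x n_times) onto the low-frequency subspace."""
    phi = trig_basis(data.shape[1], n_freqs)
    return data @ phi @ phi.T


def low_rank_modes(filtered, rank):
    """Modal decomposition by truncated SVD: spatial modes, singular values, temporal modes."""
    u, s, vt = np.linalg.svd(filtered, full_matrices=False)
    return u[:, :rank], s[:rank], vt[:rank]


# Hypothetical synthetic example: two slow modes buried in white noise.
rng = np.random.default_rng(0)
n_features, n_times = 50, 200
t = np.arange(n_times) / n_times
clean = (np.outer(rng.normal(size=n_features), np.cos(2 * np.pi * t))
         + np.outer(rng.normal(size=n_features), np.sin(4 * np.pi * t)))
noisy = clean + rng.normal(size=clean.shape)

filtered = low_pass_project(noisy, n_freqs=5)
modes, weights, chronos = low_rank_modes(filtered, rank=2)
```

In this toy setting the projection keeps only 11 of the 200 temporal degrees of freedom, so most of the white noise is discarded while the two slow modes pass through unchanged. The cutoff `n_freqs` plays the role of the frequency cutoff described in the abstract; handling incomplete and partial observations would require additional steps that this sketch omits.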