Department of Electrical and Computer Engineering, Iowa State University, Ames, Iowa 50011, USA.
J Chem Phys. 2018 Mar 28;148(12):123301. doi: 10.1063/1.5001325.
Dynamic Force Spectroscopy (DFS) is a widely used technique for characterizing the dissociation kinetics and interaction energy landscape of receptor-ligand complexes with single-molecule resolution. In an Atomic Force Microscope (AFM)-based DFS experiment, receptor-ligand complexes, sandwiched between an AFM tip and a substrate, are ruptured at different stress rates by varying the speed at which the AFM tip and substrate are pulled away from each other. The rupture events are grouped according to their pulling speeds, and the mean force and loading rate of each group are calculated. These data are subsequently fit to established models, and energy landscape parameters such as the intrinsic off-rate (k) and the width of the potential energy barrier (x) are extracted. However, because the mean forces and loading rates of the groups carry large uncertainties, errors in the estimated k and x can be substantial. Here, we demonstrate that the accuracy of fitted parameters in a DFS experiment can be dramatically improved by sorting rupture events into groups using cluster analysis instead of sorting them according to their pulling speeds. We test different clustering algorithms, including Gaussian mixture, logistic regression, and K-means clustering, under conditions that closely mimic DFS experiments. Using Monte Carlo simulations, we benchmark the performance of these clustering algorithms over a wide range of k and x, under different levels of thermal noise, and as a function of both the number of unbinding events and the number of pulling speeds. Our results demonstrate that cluster analysis, and K-means clustering in particular, is very effective in improving the accuracy of parameter estimation, especially when the number of unbinding events is limited and the events are not well separated into distinct groups. Cluster analysis is easy to implement, and our performance benchmarks serve as a guide in choosing an appropriate method for DFS data analysis.
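The workflow described above (simulate rupture events, group them by cluster analysis rather than by pulling speed, then fit the group means) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes the Bell-Evans model as the "established model", which the abstract does not name, and a small NumPy-only K-means on standardized (ln loading rate, force) pairs. All function names, the value of kBT, and the parameter values are hypothetical choices for illustration, and the sketch makes no claim about how accurate the recovered parameters are.

```python
import numpy as np

KBT = 4.114  # assumed thermal energy near 298 K, in pN*nm


def bell_evans_force(rate, k_off, x_b):
    """Most probable rupture force (pN) at loading rate `rate` (pN/s), Bell-Evans model."""
    return (KBT / x_b) * np.log(rate * x_b / (k_off * KBT))


def simulate_ruptures(k_off, x_b, nominal_rates, n_per_rate, rate_jitter=0.3, seed=1):
    """Monte Carlo rupture events: jittered loading rates, forces by inverse-transform sampling."""
    rng = np.random.default_rng(seed)
    rates = np.repeat(np.asarray(nominal_rates, dtype=float), n_per_rate)
    rates = rates * rng.lognormal(mean=0.0, sigma=rate_jitter, size=rates.size)
    u = 1.0 - rng.random(rates.size)  # uniform on (0, 1], avoids log(0)
    forces = (KBT / x_b) * np.log1p(-(rates * x_b / (k_off * KBT)) * np.log(u))
    return rates, forces


def kmeans(points, n_clusters, n_iter=100, seed=0):
    """Plain Lloyd's algorithm; returns one integer label per point."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), n_clusters, replace=False)]
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


def fit_bell_evans(rates, forces):
    """Linear fit of force against ln(rate); returns (k_off, x_b)."""
    slope, intercept = np.polyfit(np.log(rates), forces, 1)
    x_b = KBT / slope                            # slope = kBT / x
    k_off = np.exp(-intercept / slope) / slope   # intercept = slope * ln(1 / (k * slope))
    return k_off, x_b


def cluster_and_fit(rates, forces, n_clusters):
    """Group events with K-means, then fit the per-cluster mean force and loading rate."""
    features = np.column_stack([np.log(rates), forces])
    features = (features - features.mean(axis=0)) / features.std(axis=0)
    labels = kmeans(features, n_clusters)
    groups = [labels == j for j in range(n_clusters) if np.any(labels == j)]
    mean_rates = np.array([rates[g].mean() for g in groups])
    mean_forces = np.array([forces[g].mean() for g in groups])
    return fit_bell_evans(mean_rates, mean_forces)


if __name__ == "__main__":
    # Illustrative values only: k = 1 /s, x = 0.5 nm, four nominal loading rates.
    rates, forces = simulate_ruptures(1.0, 0.5, [1e2, 1e3, 1e4, 1e5], n_per_rate=100)
    print(cluster_and_fit(rates, forces, n_clusters=4))
```

Because the fit here uses cluster means while the Bell-Evans expression describes the most probable force, the recovered k and x should be read as rough estimates. Swapping `kmeans` for a Gaussian mixture model would follow the same structure.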