CNR-Istituto dei Sistemi Complessi, Via dei Taurini 19, 00185 Rome, Italy.
INFN-Sezione di Roma1, P.le Aldo Moro, 2 00185 Rome, Italy.
Phys Biol. 2023 Jul 10;20(5). doi: 10.1088/1478-3975/ace1c5.
Correlation analysis and its close variant, principal component analysis, are widely used to predict the biological functions of macromolecules from the relationship between fluctuation dynamics and structural properties. However, because correlation does not necessarily imply causal links among the elements of the system, the results of such analysis risk being biologically misinterpreted. Using the structure of ubiquitin as a benchmark, we report a critical comparison of correlation-based analysis with analysis based on two other indicators, the response function and transfer entropy, which quantify causal dependence. We chose ubiquitin for its simple structure and because of recent experimental evidence of allosteric control of its binding to target substrates. We discuss the ability of correlation, response, and transfer-entropy analysis to detect the role of the residues involved in the allosteric mechanism of ubiquitin as deduced from experiments. To keep the comparison as free as possible from the complexity of the modeling approach and the quality of the time series, we describe the fluctuations of the ubiquitin native state with the Gaussian network model, which, being fully solvable, allows one to derive analytical expressions for the observables of interest. Our comparison suggests that a good strategy is to combine correlation, response, and transfer entropy, so that the preliminary information extracted from correlation analysis is validated by the other two indicators in order to discard spurious correlations not associated with true causal dependencies.
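As an illustration of why the Gaussian network model yields closed-form observables, the following minimal sketch computes an equal-time correlation matrix and a linear response matrix for a toy chain. It is not the authors' implementation. The helix-like coordinates, the 7.0 Å cutoff, the function names, and the assumption of overdamped Langevin dynamics with unit spring constant and unit friction (so that the response is exp(-Γt)) are all choices made here for illustration. Transfer entropy is not covered.

```python
import numpy as np

def kirchhoff(coords, cutoff=7.0):
    """Kirchhoff (connectivity) matrix: -1 for node pairs closer than
    `cutoff`, node degree on the diagonal. The cutoff is an assumed value."""
    d = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    contact = (d < cutoff) & ~np.eye(len(coords), dtype=bool)
    gamma = -contact.astype(float)
    np.fill_diagonal(gamma, contact.sum(axis=1))
    return gamma

def gnm_correlation(gamma):
    """Normalized equal-time correlations from the pseudo-inverse of gamma
    (the zero mode, a rigid translation, is excluded by the pseudo-inverse)."""
    cov = np.linalg.pinv(gamma)
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

def gnm_response(gamma, t):
    """Linear response exp(-gamma * t), assuming overdamped Langevin dynamics
    with unit spring constant and friction (an assumption of this sketch)."""
    w, v = np.linalg.eigh(gamma)
    return (v * np.exp(-w * t)) @ v.T

# Hypothetical toy structure: 12 nodes on an alpha-helix-like curve
# (not ubiquitin coordinates).
n = 12
i = np.arange(n)
angle = np.deg2rad(100.0) * i
coords = np.column_stack([2.3 * np.cos(angle), 2.3 * np.sin(angle), 1.5 * i])

gamma = kirchhoff(coords)
corr = gnm_correlation(gamma)
resp = gnm_response(gamma, t=0.5)

# Compare the two indicators for one pair of nodes.
print("correlation(0, 5):", corr[0, 5])
print("response(0 <- 5, t=0.5):", resp[0, 5])
```

In this simplified symmetric setting both matrices follow from the same eigendecomposition of Γ, which is what makes the model convenient as a controlled benchmark for comparing indicators.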