Schmidt Michael, Hamacher Kay
Department of Physics, TU Darmstadt, Karolinenpl. 5, 64289 Darmstadt, Germany.
Department of Biology, TU Darmstadt, Schnittspahnstr. 10, 64287 Darmstadt, Germany.
Phys Rev E. 2021 Apr;103(4-1):042418. doi: 10.1103/PhysRevE.103.042418.
Direct-coupling analysis (DCA) is a statistical learning method that predicts protein contacts from sequence information alone. The maximum entropy principle leads to an effective inverse Potts model, whose local fields and couplings are fitted to an empirical multiple sequence alignment. Contacts are typically predicted from the ℓ₂ norm of the resulting two-body couplings. However, this procedure discards important information. In this paper we show that using the full field and coupling information improves prediction accuracy.
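
To make the conventional scoring step concrete, the following is a minimal sketch of norm-based contact scoring. It is not the authors' implementation. It assumes the fitted couplings are available as one q × q block per site pair, and it takes the norm of each block as the contact score. The data layout and the names `frobenius_scores`, `rank_pairs`, and `min_separation` are hypothetical, chosen only for this illustration.

```python
import math
from itertools import combinations


def frobenius_scores(couplings, n_sites, min_separation=1):
    """Score each site pair (i, j) by the l2 (Frobenius) norm of its coupling block.

    couplings: dict mapping (i, j) with i < j to a q x q nested list J_ij[a][b].
    This layout is an assumption made for the sketch, not a standard format.
    """
    scores = {}
    for i, j in combinations(range(n_sites), 2):
        # Optionally skip pairs that are close along the chain.
        if j - i < min_separation:
            continue
        block = couplings[(i, j)]
        scores[(i, j)] = math.sqrt(sum(x * x for row in block for x in row))
    return scores


def rank_pairs(scores):
    """Return site pairs sorted from highest to lowest score."""
    return sorted(scores, key=scores.get, reverse=True)


# Toy couplings for 3 sites and q = 2 states (made-up numbers, not real data).
toy = {
    (0, 1): [[3.0, 0.0], [0.0, 4.0]],
    (0, 2): [[1.0, -1.0], [-1.0, 1.0]],
    (1, 2): [[0.0, 0.0], [0.0, 0.0]],
}
toy_scores = frobenius_scores(toy, n_sites=3)
toy_ranking = rank_pairs(toy_scores)

# Flipping the sign of every entry leaves the norm unchanged.
flipped = {(0, 1): [[-1.0, 1.0], [1.0, -1.0]]}
original = {(0, 1): [[1.0, -1.0], [-1.0, 1.0]]}
same_norm = (
    frobenius_scores(flipped, n_sites=2)[(0, 1)]
    == frobenius_scores(original, n_sites=2)[(0, 1)]
)
```

The last lines show one way the norm loses information: two coupling blocks with opposite signs in every entry receive the same score, and the local fields do not enter the score at all.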