Schmidler Scott C, Hughes Roy Gene, Oas Terrence G, Zhao Shiwen
bioRxiv. 2023 Sep 17:2023.09.16.558052. doi: 10.1101/2023.09.16.558052.
Helix-coil models are routinely used to interpret circular dichroism (CD) data of helical peptides or to predict the helicity of naturally occurring and designed polypeptides. However, a helix-coil model contains significantly more information than mean helicity alone, because it defines the entire ensemble (the equilibrium population of every possible helix-coil configuration) for a given sequence. Many quantities of interest for this ensemble either cannot be expressed as ensemble averages or are not available from standard helicity-averaging calculations. Enumerating the entire ensemble would allow a wider set of ensemble properties to be calculated, but the exponential size of the configuration space typically renders this intractable. We present an algorithm that efficiently approximates the helix-coil ensemble to arbitrary accuracy by sequentially generating a list of the M most highly populated configurations in descending order of population. Truncating this list of (configuration, population) pairs at a desired accuracy provides an approximating sub-ensemble. We demonstrate several uses of this approach for providing insight into helix-coil ensembles and folding mechanisms, including landscape visualization.
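The idea of listing configurations in descending order of population and truncating at a desired accuracy can be illustrated with a small sketch. This is not the authors' algorithm, whose details the abstract does not give. It assumes a homopolymer Zimm-Bragg-style weighting, in which each helical residue contributes a propagation weight s and each new helical segment an extra nucleation weight sigma, and it uses a generic best-first search with an exact upper bound on how well any prefix can be completed. All function and parameter names (`step_weight`, `partition_function`, `top_configurations`, `truncated_ensemble`, `accuracy`) are hypothetical.

```python
import heapq
from itertools import count


def step_weight(prev, cur, s, sigma):
    """Weight of one residue in state cur (1 = helix, 0 = coil), given the
    previous residue's state. Assumed Zimm-Bragg-style homopolymer weights."""
    if cur == 0:
        return 1.0
    return s * (sigma if prev == 0 else 1.0)


def partition_function(n, s, sigma):
    """Sum of weights over all 2**n configurations (forward recursion)."""
    z = [1.0, 0.0]  # the chain is treated as preceded by a coil residue
    for _ in range(n):
        z = [sum(z[p] * step_weight(p, c, s, sigma) for p in (0, 1))
             for c in (0, 1)]
    return z[0] + z[1]


def top_configurations(n, s, sigma):
    """Yield (configuration, population) pairs in descending population."""
    # best[i][prev]: largest weight obtainable from residues i..n-1.
    best = [[1.0, 1.0] for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for prev in (0, 1):
            best[i][prev] = max(
                step_weight(prev, cur, s, sigma) * best[i + 1][cur]
                for cur in (0, 1))
    z = partition_function(n, s, sigma)
    tie = count()  # unique tie-breaker so heap entries always compare
    heap = [(-best[0][0], next(tie), 1.0, "")]
    while heap:
        _, _, weight, prefix = heapq.heappop(heap)
        i = len(prefix)
        if i == n:  # complete configuration: nothing left can beat it
            yield prefix, weight / z
            continue
        prev = 1 if prefix.endswith("h") else 0
        for cur, letter in ((0, "c"), (1, "h")):
            w = weight * step_weight(prev, cur, s, sigma)
            bound = w * best[i + 1][cur]  # best possible completed weight
            heapq.heappush(heap, (-bound, next(tie), w, prefix + letter))


def truncated_ensemble(n, s, sigma, accuracy):
    """Collect top configurations until their total population >= accuracy."""
    kept, total = [], 0.0
    for config, population in top_configurations(n, s, sigma):
        kept.append((config, population))
        total += population
        if total >= accuracy:
            break
    return kept


if __name__ == "__main__":
    # Illustrative parameter values only, not taken from the paper.
    for config, population in truncated_ensemble(8, 1.5, 0.05, 0.9):
        print(config, round(population, 4))
```

Because the generator is lazy, only as many configurations are expanded as the requested accuracy needs. A sequence-dependent model would replace the constant s and sigma with per-residue values inside `step_weight`.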