Molecular Medicine Program, Hospital for Sick Children, Toronto, Ontario M5G 0A4, Canada.
Department of Biochemistry, University of Toronto, Toronto, Ontario M5S 1A8, Canada.
J Phys Chem A. 2022 Sep 8;126(35):5985-6003. doi: 10.1021/acs.jpca.2c03726. Epub 2022 Aug 28.
The power of structural information for informing biological mechanisms is clear for stable folded macromolecules, but similar structure-function insight is more difficult to obtain for highly dynamic systems such as intrinsically disordered proteins (IDPs), which must be described as structural ensembles. Here, we present IDPConformerGenerator, a flexible, modular open-source software platform for generating large and diverse ensembles of disordered protein states. It builds conformers of the input sequence that obey geometric, steric, and other physical restraints. IDPConformerGenerator samples backbone phi (φ), psi (ψ), and omega (ω) torsion angles of relevant sequence fragments from loops and secondary structure elements extracted from folded protein structures in the RCSB Protein Data Bank, and builds side chains with robust Monte Carlo algorithms using expanded rotamer libraries. IDPConformerGenerator has many user-defined options enabling variable fractional sampling of secondary structures, supports Bayesian models for assessing the consistency of IDP ensembles with experimental data, and introduces a machine learning approach to transform between internal and Cartesian coordinates with reduced error. IDPConformerGenerator will facilitate the characterization of disordered proteins to ultimately provide structural insights into these states that have key biological functions.
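The sampling-and-building idea described above can be illustrated with a small Python sketch. It is not IDPConformerGenerator's code or API: the torsion library, the bond geometry, the clash threshold, and all function names are hypothetical, and the internal-to-Cartesian step uses the standard NeRF construction rather than the machine learning approach mentioned in the abstract. The sketch draws (φ, ψ, ω) triples from a toy library with user-set secondary structure weights, builds N, CA, C backbone coordinates, and rejects conformers with steric clashes.

```python
import math
import random

# Approximate ideal backbone geometry (illustrative values, an assumption).
BOND = {"N-CA": 1.458, "CA-C": 1.525, "C-N": 1.329}           # angstroms
ANGLE = {"N-CA-C": 111.0, "CA-C-N": 116.2, "C-N-CA": 121.7}   # degrees

# Toy (phi, psi, omega) library per structure class, in degrees.
# Hypothetical values; a real library would hold fragments mined from the PDB.
LIBRARY = {
    "H": [(-60.0, -45.0, 180.0), (-65.0, -40.0, 178.0)],
    "E": [(-120.0, 130.0, 180.0), (-135.0, 140.0, -178.0)],
    "L": [(-75.0, 150.0, 180.0), (60.0, 40.0, 179.0), (-90.0, 0.0, -179.0)],
}

def _sub(u, v):
    return tuple(a - b for a, b in zip(u, v))

def _dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def _cross(u, v):
    return (u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0])

def _unit(u):
    norm = math.sqrt(_dot(u, u))
    return tuple(a / norm for a in u)

def place_atom(a, b, c, bond, angle_deg, torsion_deg):
    """NeRF step: place d with |cd| = bond, angle(b, c, d) and dihedral(a, b, c, d)."""
    theta, tau = math.radians(angle_deg), math.radians(torsion_deg)
    bc = _unit(_sub(c, b))
    n = _unit(_cross(_sub(b, a), bc))
    m = _cross(n, bc)
    local = (-bond * math.cos(theta),
             bond * math.sin(theta) * math.cos(tau),
             bond * math.sin(theta) * math.sin(tau))
    return tuple(c[k] + local[0] * bc[k] + local[1] * m[k] + local[2] * n[k]
                 for k in range(3))

def dihedral(a, b, c, d):
    """Signed dihedral angle in degrees, used here to check the built chain."""
    b1, b2, b3 = _sub(b, a), _sub(c, b), _sub(d, c)
    n1, n2 = _cross(b1, b2), _cross(b2, b3)
    y = math.sqrt(_dot(b2, b2)) * _dot(b1, n2)
    return math.degrees(math.atan2(y, _dot(n1, n2)))

def build_backbone(torsions):
    """Return coordinates [N, CA, C, N, CA, C, ...] for a list of (phi, psi, omega).

    The first residue's phi and the last residue's psi and omega are unused.
    """
    theta = math.radians(ANGLE["N-CA-C"])
    ca = (BOND["N-CA"], 0.0, 0.0)
    c = (ca[0] - BOND["CA-C"] * math.cos(theta), BOND["CA-C"] * math.sin(theta), 0.0)
    coords = [(0.0, 0.0, 0.0), ca, c]
    for i in range(len(torsions) - 1):
        _, psi, omega = torsions[i]
        phi_next = torsions[i + 1][0]
        coords.append(place_atom(*coords[-3:], BOND["C-N"], ANGLE["CA-C-N"], psi))
        coords.append(place_atom(*coords[-3:], BOND["N-CA"], ANGLE["C-N-CA"], omega))
        coords.append(place_atom(*coords[-3:], BOND["CA-C"], ANGLE["N-CA-C"], phi_next))
    return coords

def has_clash(coords, min_dist=2.0, skip=3):
    """True if two atoms more than `skip` positions apart are closer than min_dist."""
    for i in range(len(coords)):
        for j in range(i + skip + 1, len(coords)):
            if math.dist(coords[i], coords[j]) < min_dist:
                return True
    return False

def sample_conformer(n_res, weights, rng, max_tries=1000):
    """Draw torsions by class weight, build the chain, and reject clashing chains."""
    classes = list(weights)
    for _ in range(max_tries):
        labels = rng.choices(classes, weights=[weights[k] for k in classes], k=n_res)
        torsions = [rng.choice(LIBRARY[k]) for k in labels]
        coords = build_backbone(torsions)
        if not has_clash(coords):
            return torsions, coords
    raise RuntimeError("no clash-free conformer found")

if __name__ == "__main__":
    torsions, coords = sample_conformer(8, {"H": 0.3, "E": 0.2, "L": 0.5}, random.Random(0))
    print(len(coords), "backbone atoms built")
```

Rejecting a whole chain on any clash is the simplest possible strategy and is chosen here only for brevity; it scales poorly with chain length, and it ignores side chains and sequence-specific fragment matching entirely.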