Berlin Konstantin, Castañeda Carlos A, Schneidman-Duhovny Dina, Sali Andrej, Nava-Tudela Alfredo, Fushman David
J Am Chem Soc. 2013 Nov 6;135(44):16595-609. doi: 10.1021/ja4083717.
Structural analysis of proteins and nucleic acids is complicated by their inherent flexibility, conferred, for example, by linkers between their contiguous domains. Therefore, the macromolecule needs to be represented by an ensemble of conformations instead of a single conformation. Determining this ensemble is challenging because the experimental data are a convoluted average of contributions from multiple conformations. As the number of the ensemble degrees of freedom generally greatly exceeds the number of independent observables, directly deconvolving experimental data into a representative ensemble is an ill-posed problem. Recent developments in sparse approximations and compressive sensing have demonstrated that useful information can be recovered from underdetermined (ill-posed) systems of linear equations by using sparsity regularization. Inspired by these advances, we designed the Sparse Ensemble Selection (SES) method for recovering multiple conformations from a limited number of observations. SES is more general and accurate than previously published minimum-ensemble methods, and we use it to obtain representative conformational ensembles of Lys48-linked diubiquitin, characterized by the residual dipolar coupling data measured at several pH conditions. These representative ensembles are validated against NMR chemical shift perturbation data and compared to maximum-entropy results. The SES method reproduced and quantified the previously observed pH dependence of the major conformation of Lys48-linked diubiquitin, and revealed lesser-populated conformations that are preorganized for binding known diubiquitin receptors, thus providing insights into possible mechanisms of receptor recognition by polyubiquitin. SES is applicable to any experimental observables that can be expressed as a weighted linear combination of data for individual states.
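The linear model described in the abstract's last sentence can be illustrated with a small sketch. Assume a matrix `A` whose column j holds the observables predicted for candidate conformer j, and a vector `y` of measured (ensemble-averaged) observables. The goal is a nonnegative weight vector `x` with only a few nonzero entries such that `A @ x` approximates `y`. The code below is not the authors' SES algorithm. It is a generic greedy forward selection that uses nonnegative least squares (`scipy.optimize.nnls`), shown only to make the idea of sparsity-constrained ensemble recovery concrete. The function name `greedy_sparse_ensemble`, the parameter `max_size`, and the synthetic data are all assumptions introduced for illustration.

```python
import numpy as np
from scipy.optimize import nnls


def greedy_sparse_ensemble(A, y, max_size):
    """Pick at most `max_size` columns of A with nonnegative weights.

    Hypothetical helper, not the published SES solver: at each step it adds
    the single conformer whose inclusion gives the lowest NNLS residual.
    """
    n_conf = A.shape[1]
    selected = []                      # indices of chosen conformers
    weights = np.zeros(0)              # weights for the chosen conformers
    best_norm = np.linalg.norm(y)      # residual of the empty ensemble

    for _ in range(max_size):
        best = None
        for j in range(n_conf):
            if j in selected:
                continue
            # Refit all weights jointly for the trial ensemble.
            w, rnorm = nnls(A[:, selected + [j]], y)
            if best is None or rnorm < best[0]:
                best = (rnorm, j, w)
        # Stop when no candidate improves the fit.
        if best is None or best[0] >= best_norm - 1e-12:
            break
        best_norm, j, weights = best
        selected.append(j)

    x = np.zeros(n_conf)
    if selected:
        x[selected] = weights
    return x, best_norm


# Synthetic demonstration (assumed data, not from the paper):
# 30 observables, 50 candidate conformers, 2 of them actually populated.
rng = np.random.default_rng(0)
A = rng.normal(size=(30, 50))
x_true = np.zeros(50)
x_true[3], x_true[17] = 0.7, 0.3
y = A @ x_true

x_fit, residual = greedy_sparse_ensemble(A, y, max_size=2)
print(np.flatnonzero(x_fit), x_fit[np.flatnonzero(x_fit)], residual)
```

With 50 unknowns and only 30 observations the unconstrained system is underdetermined, yet limiting the ensemble to a few conformers makes the populated states recoverable in this noise-free toy case. Real data would also need error weighting and a criterion for choosing the ensemble size, which this sketch leaves out.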