Biswas Parbati, Zou Jinming, Saven Jeffery G
Department of Chemistry, University of Pennsylvania, Philadelphia, Pennsylvania 19104, USA.
J Chem Phys. 2005 Oct 15;123(15):154908. doi: 10.1063/1.2062047.
Combinatorial protein libraries provide a promising route to investigate the determinants and features of protein folding and to identify novel folding amino acid sequences. A library of sequences based on a pool of different monomer types is screened for folding molecules that satisfy a particular foldability criterion. The number of sequences grows exponentially with the length of the polymer, making both experimental and computational tabulations of sequences infeasible. Herein a statistical theory is extended to specify the properties of sequences having particular values of the global energetic quantities that characterize their energy landscape. The theory yields the site-specific monomer probabilities. A foldability criterion is derived that characterizes the properties of sequences by quantifying the energetic separation of the target state from low-energy states in the unfolded ensemble and the fluctuations of the energies in the unfolded state ensemble. For a simple lattice model of proteins, excellent agreement is observed between the theory and the results of exact enumeration. The theory may be used to provide a quantitative framework for the design and interpretation of combinatorial experiments.
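As a rough illustration of the exact-enumeration side of such a comparison, the sketch below enumerates every sequence of a very short chain on a two-dimensional square lattice and tabulates site-specific monomer probabilities among sequences that pass a foldability cut. Everything specific here is an assumption made for illustration and is not taken from the paper: the two-letter H/P alphabet, the contact energy of -1 per H-H contact, the chain length of 6, the choice of target structure, the cut at zero, and in particular the Z-score-like `foldability` function, which stands in for the criterion derived in the paper and only mimics its two ingredients (energetic separation of the target from the unfolded ensemble, and energy fluctuations in that ensemble).

```python
import itertools
import statistics

N = 6  # chain length (assumed); there are 2**N sequences, so this only works for short chains
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]

def self_avoiding_walks(n):
    """All n-site self-avoiding walks from the origin with the first step fixed to +x.
    Reflections are not removed, so mirror-image walks share contact maps."""
    walks = []
    def extend(path):
        if len(path) == n:
            walks.append(tuple(path))
            return
        x, y = path[-1]
        for dx, dy in MOVES:
            nxt = (x + dx, y + dy)
            if nxt not in path:
                extend(path + [nxt])
    extend([(0, 0), (1, 0)])
    return walks

def contact_pairs(walk):
    """Pairs (i, j) of sites that are lattice neighbours but not bonded along the chain."""
    pairs = []
    for i in range(len(walk)):
        for j in range(i + 2, len(walk)):
            if abs(walk[i][0] - walk[j][0]) + abs(walk[i][1] - walk[j][1]) == 1:
                pairs.append((i, j))
    return pairs

def energy(seq, pairs):
    """Assumed HP-style energy: -1 for every H-H contact."""
    return -sum(1 for i, j in pairs if seq[i] == "H" and seq[j] == "H")

walks = self_avoiding_walks(N)
contact_maps = [contact_pairs(w) for w in walks]

# Assumed target: the first conformation with the largest number of contacts.
target_index = max(range(len(walks)), key=lambda k: len(contact_maps[k]))
target_map = contact_maps[target_index]
unfolded_maps = [m for k, m in enumerate(contact_maps) if k != target_index]

def foldability(seq):
    """Stand-in criterion: (mean unfolded energy - target energy) / std of unfolded energies."""
    e_target = energy(seq, target_map)
    e_unfolded = [energy(seq, m) for m in unfolded_maps]
    spread = statistics.pstdev(e_unfolded)
    if spread == 0:
        return 0.0  # no energy fluctuations: treat as not foldable
    return (statistics.mean(e_unfolded) - e_target) / spread

# Exact enumeration over all sequences, then a cut on the stand-in criterion.
sequences = ["".join(s) for s in itertools.product("HP", repeat=N)]
scores = {seq: foldability(seq) for seq in sequences}
selected = [seq for seq in sequences if scores[seq] > 0.0]

# Site-specific probability of finding monomer H at each position among selected sequences.
site_prob_H = [sum(seq[i] == "H" for seq in selected) / len(selected) for i in range(N)]

print(len(sequences), "sequences,", len(selected), "selected")
print("P(H) per site:", [round(p, 3) for p in site_prob_H])
```

Because the loop visits all 2**N sequences and all conformations, the cost grows exponentially with chain length, which is the infeasibility the abstract refers to; a statistical theory is meant to give the site-specific probabilities without this tabulation.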