Faure Guilhem, Bornot Aurélie, de Brevern Alexandre G
Equipe de Bioinformatique Génomique et Moléculaire, Université Paris Diderot, 75251 Paris, France.
Biochimie. 2008 Apr;90(4):626-39. doi: 10.1016/j.biochi.2007.11.007. Epub 2007 Nov 28.
The three-dimensional structures of proteins underpin their biological functions. Their folds are stabilized by contacts between residues. Inner protein contacts are generally described through direct atomic contacts, i.e. interactions between side-chain atoms, whereas contact prediction methods mainly use inter-Cα distances. In this paper, we have analyzed protein contacts in a recent, high-quality, non-redundant databank using different criteria. First, we studied how the average number of contacts depends on the distance threshold used to define a contact. Preferential contacts between amino acid types have been highlighted. Detailed analyses have been carried out on the proximity of contacts in the sequence, the size of the proteins and the fold classes. The strongest differences have been extracted, highlighting important residues. Then, we studied the influence of five different side-chain conformation prediction methods (SCWRL, IRECS, SCAP, SCATD and SCCOMP) on the distribution of contacts. The prediction rates of these methods are quite similar. However, when a distance criterion between side chains is used, the results differ markedly: for example, SCAP predicts 50% more contacts than observed, whereas the other methods predict fewer contacts than observed. The contacts deduced also differ considerably from one method to another, with at most 75% of contacts in common. Moreover, the distributions of preferential amino acid contacts show unexpected behaviours distinct from those previously observed in the X-ray structures, especially at the surface of proteins. For instance, the interactions involving tryptophan greatly decrease.
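As an illustration of the distance-threshold definition of a contact and of the comparison of contact sets between methods, a minimal Python sketch is given below. It is not the authors' implementation. The function names, the 8.0 Å default threshold, the minimum sequence separation and the use of one representative point per residue (for example the Cα atom or a side-chain centroid) are all assumptions made for the example. The abstract does not state how the percentage of contacts in common is computed; the sketch assumes the shared contacts divided by the union of the two sets.

```python
from itertools import combinations
from math import dist


def contact_pairs(coords, threshold=8.0, min_separation=1):
    """Return the set of residue index pairs (i, j), i < j, in contact.

    coords: one (x, y, z) point per residue, in angstroms (assumed to be
            a Calpha atom or a side-chain centroid).
    threshold: distance cutoff defining a contact (assumed value).
    min_separation: minimum distance along the sequence, j - i, used to
            skip trivially close neighbours (assumed parameter).
    """
    contacts = set()
    for (i, a), (j, b) in combinations(enumerate(coords), 2):
        if j - i >= min_separation and dist(a, b) <= threshold:
            contacts.add((i, j))
    return contacts


def shared_fraction(contacts_a, contacts_b):
    """Fraction of contacts common to both sets, relative to their union
    (an assumed definition of 'contacts in common')."""
    union = contacts_a | contacts_b
    return len(contacts_a & contacts_b) / len(union) if union else 1.0


# Toy example: four residues spaced 3.8 angstroms apart along a line.
toy_coords = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (11.4, 0.0, 0.0)]
observed = contact_pairs(toy_coords, threshold=8.0, min_separation=2)
predicted = {(0, 2)}  # hypothetical contact set from a predicted model
overlap = shared_fraction(observed, predicted)
```

Varying `threshold` over a range of values and counting the resulting pairs gives the kind of contacts-versus-cutoff profile described in the abstract, and `shared_fraction` can be applied to contact sets derived from the models of two different side-chain prediction methods.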