Treger M, Westhof E
Laboratoire de Biostatistique et d'Informatique Médicale, Faculté de Médecine, Université Louis Pasteur, 4 rue Kirshleger, F-67000 Strasbourg, France.
J Mol Recognit. 2001 Jul-Aug;14(4):199-214. doi: 10.1002/jmr.534.
Forty-five crystal structures of complexes between proteins and RNA molecules from the Protein Data Bank have been statistically surveyed for the number of contacts between RNA components (phosphate, ribose and the four bases) and amino acid side chains. Three groups of complexes were defined: the tRNA synthetases; the ribosomal complexes; and a third group containing a variety of complexes. The types of atomic contacts were classified a priori into ionic, neutral H-bond, C-H...O H-bond, or van der Waals interaction. All the contacts were organized into a relational database which allows for statistical analysis. The main conclusions are the following: (i) in all three groups of complexes, the most preferred amino acids (Arg, Asn, Ser, Lys) and the least preferred ones (Ala, Ile, Leu, Val) are the same; Trp and Cys are rarely observed (15 and 5 amino acids, respectively, in the ensemble of interfaces); (ii) of the total number of amino acids located at the interfaces, 22% are hydrophobic, 40% charged (positive 32%, negative 8%), 30% polar and 8% are Gly; (iii) in ribosomal complexes, phosphate is preferred over ribose, which is preferred over the bases, but there is no significant preference in the other two groups; (iv) there is no significant prevalence of a base type at protein-RNA interfaces, but individual amino acids do show preferences: Arg and Lys prefer phosphate over ribose and bases; Pro and Asn prefer bases over ribose and phosphate; Met, Phe and Tyr prefer ribose over phosphate and bases. Further, Ile, Pro and Ser prefer A over the other bases; Leu prefers C; Asp and Gly prefer G; and Asn prefers U.
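To make the idea of a relational contact database concrete, the following is a minimal sketch using Python's built-in sqlite3 module. The table layout, the column names and the helper functions `build_contact_db` and `count_by_component` are assumptions introduced for illustration; the abstract does not describe the schema actually used, and the rows are made-up toy values, not data from the survey.

```python
import sqlite3

# Hypothetical schema: one row per atomic contact. The abstract does not
# describe the authors' actual tables, so every name here is an assumption.
def build_contact_db(rows):
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE contact ("
        " complex_group TEXT,"   # tRNA synthetase, ribosomal, or other
        " amino_acid TEXT,"      # three-letter residue code
        " rna_component TEXT,"   # phosphate, ribose, or a base
        " contact_type TEXT)"    # ionic, neutral H-bond, C-H...O, van der Waals
    )
    conn.executemany("INSERT INTO contact VALUES (?, ?, ?, ?)", rows)
    return conn

def count_by_component(conn, amino_acid):
    """Count the contacts of one amino acid, grouped by RNA component."""
    cur = conn.execute(
        "SELECT rna_component, COUNT(*) FROM contact"
        " WHERE amino_acid = ?"
        " GROUP BY rna_component ORDER BY rna_component",
        (amino_acid,),
    )
    return dict(cur.fetchall())

# Made-up toy rows for illustration only; these are NOT survey data.
TOY_ROWS = [
    ("synthetase", "Arg", "phosphate", "ionic"),
    ("synthetase", "Arg", "phosphate", "ionic"),
    ("ribosomal", "Arg", "ribose", "van der Waals"),
    ("other", "Asn", "base U", "neutral H-bond"),
]

conn = build_contact_db(TOY_ROWS)
print(count_by_component(conn, "Arg"))
```

Storing one row per contact keeps every preference reported above (amino acid versus RNA component, per group of complexes) expressible as a single GROUP BY query.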
Considering the contact types, the following conclusions could be drawn: (i) 23% of the contacts are via potential H-bonds (including C-H...O H-bonds and ionic interactions), 72% belong to van der Waals interactions and 5% are considered as short contacts; (ii) of all potential H-bonds, 54% are standard, 33% are of the C-H...O type and 13% are ionic; (iii) the Watson-Crick sites of G, O6(G) and principally N2(G), and the hydroxyl group O2' are more often involved in H-bonds than expected; the protein main chain is involved in 32% and the side chains in 68% of the H-bonds; considering the neutral and ionic H-bonds, the following couples are more frequent than expected: base A-Ser, base G-Asp/Glu, base U-Asn. The RNA C-H groups interact preferentially with oxygen atoms (62% on the main chain and 19% on the side chains); (iv) the bases are involved in 38% of all H-bonds and more than 26% of the H-bonds have the H donor group on the RNA; (v) the atom O2' is involved in 21% of all H-bonds, a number greater than expected; (vi) amino acids less frequently in direct contact with RNA components interact frequently via their main chain atoms through water molecules with RNA atoms; in contrast, those frequently observed in direct contact, except Ser, instead use their side chain atoms for water bridging interactions.
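The two percentage breakdowns in (i) and (ii) can be combined to estimate what fraction of all contacts each H-bond class represents. The sketch below does this arithmetic in Python; the function name `hbond_share_of_all_contacts` is hypothetical, and the resulting figures are derived here from the rounded percentages quoted above, so they are approximate and are not values reported in the abstract.

```python
# Percentages as stated in the abstract.
CONTACT_SHARES = {"potential H-bond": 23, "van der Waals": 72, "short contact": 5}
HBOND_SHARES = {"standard": 54, "C-H...O": 33, "ionic": 13}

def hbond_share_of_all_contacts(contact_shares, hbond_shares):
    """Express each H-bond class as a percentage of ALL contacts.

    Derived quantity (an assumption of this sketch): the share of H-bonds
    among all contacts multiplied by the share of each class among H-bonds.
    """
    hbond_pct = contact_shares["potential H-bond"]
    return {
        kind: round(hbond_pct * pct / 100, 1)
        for kind, pct in hbond_shares.items()
    }

# Sanity check: each breakdown should account for 100% of its category.
assert sum(CONTACT_SHARES.values()) == 100
assert sum(HBOND_SHARES.values()) == 100

derived = hbond_share_of_all_contacts(CONTACT_SHARES, HBOND_SHARES)
for kind, pct in derived.items():
    print(f"{kind}: about {pct}% of all contacts")
```

Under these assumptions, standard H-bonds come out at roughly 12% of all contacts, which puts the 72% van der Waals share in perspective.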