Jeong Euna, Kim Hyunwoo, Lee Seong-Wook, Han Kyungsook
School of Computer Science and Engineering, Inha University, Incheon 402-751, Korea.
Mol Cells. 2003 Oct 31;16(2):161-7.
With many genome sequences now available, the mining of biological data is attracting much attention, although most of it is limited to the sequences of macromolecules. Sequence data are easy to analyze because they can be treated as strings of characters, whereas the structure of a macromolecule is much more complex. We developed a set of algorithms to analyze the structures of protein-RNA complexes at the atomic level and applied them to structural data on 51 protein-RNA complexes, covering water-mediated hydrogen bonds as well as direct hydrogen bonds. The analysis revealed, among other things, that: (1) polar and charged amino acids have a strong tendency to interact with nucleotides, (2) arginine and asparagine tend to hydrogen bond with uracil, and (3) histidine favors uracil in water-mediated bonding with RNA. The interaction patterns discovered from the analysis provide useful information for predicting the structure of RNA that binds proteins, and of proteins that bind RNA.
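The abstract does not spell out the algorithms, so the following is only a minimal sketch of the kind of tally such an analysis might produce: counting amino acid and nucleotide pairs whose polar atoms lie within a distance cutoff. The function name count_pairs, the 3.5 angstrom cutoff, and the toy coordinates are all assumptions made for illustration and are not taken from the paper. A real hydrogen-bond criterion would also check donor and acceptor roles and bond angles, and water-mediated bonds would need an extra step through the water oxygen.

```python
from collections import Counter
from math import dist

# Hypothetical donor-acceptor distance cutoff in angstroms (an assumption,
# not the criterion used in the paper).
HBOND_CUTOFF = 3.5


def count_pairs(protein_atoms, rna_atoms, cutoff=HBOND_CUTOFF):
    """Tally (amino acid, nucleotide) pairs whose atoms lie within the cutoff.

    Each atom is a (residue_name, (x, y, z)) tuple.
    """
    counts = Counter()
    for amino_acid, aa_xyz in protein_atoms:
        for nucleotide, nt_xyz in rna_atoms:
            if dist(aa_xyz, nt_xyz) <= cutoff:
                counts[(amino_acid, nucleotide)] += 1
    return counts


# Toy coordinates, invented for illustration only.
protein_atoms = [
    ("ARG", (0.0, 0.0, 0.0)),
    ("ASN", (10.0, 0.0, 0.0)),
    ("LEU", (30.0, 0.0, 0.0)),
]
rna_atoms = [
    ("U", (2.9, 0.0, 0.0)),
    ("U", (10.0, 3.0, 0.0)),
    ("G", (20.0, 0.0, 0.0)),
]

pair_counts = count_pairs(protein_atoms, rna_atoms)
for (amino_acid, nucleotide), n in sorted(pair_counts.items()):
    print(amino_acid, nucleotide, n)
```

Summing such counts over many complexes and normalizing by how often each residue and base occurs would give pairing propensities of the sort the abstract reports, though the normalization actually used in the study is not stated here.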