Lemieux Sébastien, Major François
Département d'Informatique et de Recherche Opérationnelle, Université de Montréal, C.P. 6128, Succ. Centre-Ville, Montréal, Québec H3C 3J7, Canada.
Nucleic Acids Res. 2002 Oct 1;30(19):4250-63. doi: 10.1093/nar/gkf540.
The problem of systematic and objective identification of canonical and non-canonical base pairs in RNA three-dimensional (3D) structures was studied. A probabilistic approach was applied, and an algorithm was developed and implemented in a computer program that detects and analyzes all the base pairs contained in RNA 3D structures. The algorithm objectively distinguishes among canonical and non-canonical base pairing types formed by three, two or one hydrogen bonds (H-bonds), as well as those containing bifurcated and C-H...X H-bonds. The donor and acceptor atoms of a 3D structure are encoded as the nodes of a bipartite graph. The capacity of each edge is the probability, computed from the geometry of the donor and acceptor groups, that the two groups form an H-bond. The maximum flow from donors to acceptors directly identifies base pairs and their types. A complete repertoire of base pairing types was built from the H-bonds detected in all X-ray crystal structures with a resolution of 3.0 Å or better, including the large and small ribosomal subunits. The base pairing types are labeled using an extension of the nomenclature recently introduced by Leontis and Westhof. The probabilistic method was implemented in MC-Annotate, an RNA structure analysis computer program used to determine the base pairing parameters of the 3D modeling system MC-Sym.
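The donor/acceptor flow network described above can be illustrated with a minimal Python sketch. It is not the MC-Annotate implementation: the atom labels, the H-bond probabilities, the unit capacities on the source and sink edges, and the 0.5 reporting threshold are all hypothetical values chosen for illustration, and the helper names `build_hbond_network` and `max_flow` are invented here. A generic Edmonds-Karp routine stands in for whatever maximum-flow method the program actually uses.

```python
from collections import deque


def build_hbond_network(candidates):
    """Build source -> donor -> acceptor -> sink capacities.

    `candidates` is a list of (donor, acceptor, probability) tuples.
    Assumption: every donor and every acceptor gets a total capacity of 1.0.
    """
    capacity = {"source": {}}
    for donor, acceptor, prob in candidates:
        capacity["source"][donor] = 1.0
        capacity.setdefault(donor, {})[acceptor] = prob
        capacity.setdefault(acceptor, {})["sink"] = 1.0
    return capacity


def max_flow(capacity, source="source", sink="sink", eps=1e-12):
    """Edmonds-Karp maximum flow on a dict-of-dicts capacity map."""
    # Residual graph with zero-capacity reverse edges.
    residual = {source: {}, sink: {}}
    for u, nbrs in capacity.items():
        for v, c in nbrs.items():
            residual.setdefault(u, {})
            residual.setdefault(v, {})
            residual[u][v] = residual[u].get(v, 0.0) + c
            residual[v].setdefault(u, 0.0)
    total = 0.0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in residual[u].items():
                if c > eps and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            break
        # Walk back from the sink to recover the path and its bottleneck.
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[a][b] for a, b in path)
        for a, b in path:
            residual[a][b] -= bottleneck
            residual[b][a] += bottleneck
        total += bottleneck
    # Flow on each original edge is capacity minus remaining residual.
    flow = {(u, v): c - residual[u][v]
            for u, nbrs in capacity.items() for v, c in nbrs.items()}
    return total, flow


# Hypothetical candidates for a G-C Watson-Crick pair, plus one weak
# competing candidate (G:N1-H -> C:O2). Probabilities are made up.
candidates = [
    ("G:N1-H", "C:N3", 0.95),
    ("G:N2-H", "C:O2", 0.90),
    ("C:N4-H", "G:O6", 0.92),
    ("G:N1-H", "C:O2", 0.15),
]
total, flow = max_flow(build_hbond_network(candidates))
donors = {d for d, _, _ in candidates}
# Keep donor -> acceptor edges that carry most of a unit of flow.
strong = sorted((d, a) for (d, a), f in flow.items() if d in donors and f >= 0.5)
print(f"total flow: {total:.2f}")
for d, a in strong:
    print(f"{d} -> {a}: {flow[(d, a)]:.2f}")
```

In this toy network the weak candidate competes with the two stronger H-bonds for the limited capacity of its donor and acceptor, so it receives only the leftover flow. The number of edges that carry substantial flow (three here) would then suggest the pairing type.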