Leiden/Amsterdam Center for Drug Research, Leiden University, Einsteinweg 55, 2333 CC Leiden, The Netherlands.
J Cheminform. 2012 Sep 17;4(1):21. doi: 10.1186/1758-2946-4-21.
Computer-assisted structure elucidation has been used for decades to discover the chemical structure of unknown compounds. In this work we introduce the first open source structure generator, Open Molecule Generator (OMG), which for a given elemental composition produces all non-isomorphic chemical structures that match that composition. The generator can also accept one or more non-overlapping prescribed substructures as additional input, which drastically reduces the number of possible chemical structures. Being open source, it can be customized and its functionality extended in the future. OMG relies on a modified version of the Canonical Augmentation Path, which grows intermediate chemical structures by adding bonds and checks at each step that only unique molecules are produced. To benchmark the tool, we generated chemical structures for the elemental formulas and substructures of different metabolites and compared the results with a commercially available structure generator. The results, i.e. the number of molecules generated, were identical for elemental compositions containing only C, O and H. For elemental compositions containing C, O, H, N, P and S, OMG produces all the chemically valid molecules, whereas the commercial generator produces additional molecules that are chemically impossible. The chemical completeness of the OMG results comes at the expense of speed: OMG is slower than the commercial generator. In addition to being open source, OMG clearly showed the added value of constraining the solution space by using multiple prescribed substructures as input. We expect this structure generator to be useful in many fields, and of particular importance for metabolomics, where identifying unknown metabolites is still a major bottleneck.
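The idea of growing structures bond by bond while keeping only unique intermediates can be illustrated with a small sketch. This is not OMG's implementation. It replaces the Canonical Augmentation Path with a brute-force canonical form (the smallest bond matrix over all element-preserving relabelings), which is only practical for a handful of heavy atoms. The names `VALENCE`, `canonical`, `is_connected` and `generate` are hypothetical, hydrogens are treated implicitly, and each element is assumed to have a single fixed valence.

```python
from itertools import combinations, permutations, product

# Assumption: one fixed valence per element (a simplification for illustration).
VALENCE = {"C": 4, "N": 3, "O": 2}
MAX_BOND_ORDER = 3


def canonical(bonds, atoms):
    """Return the smallest bond matrix over all relabelings that map
    each atom onto an atom of the same element (brute force)."""
    n = len(atoms)
    groups = {}
    for idx, element in enumerate(atoms):
        groups.setdefault(element, []).append(idx)
    best = None
    for perms in product(*(permutations(g) for g in groups.values())):
        relabel = list(range(n))
        for group, perm in zip(groups.values(), perms):
            for old, new in zip(group, perm):
                relabel[old] = new
        matrix = [[0] * n for _ in range(n)]
        for i in range(n):
            for j in range(n):
                matrix[relabel[i]][relabel[j]] = bonds[i][j]
        key = tuple(map(tuple, matrix))
        if best is None or key < best:
            best = key
    return best


def is_connected(bonds):
    """Depth-first search over the heavy-atom graph."""
    n = len(bonds)
    seen, stack = {0}, [0]
    while stack:
        i = stack.pop()
        for j in range(n):
            if bonds[i][j] and j not in seen:
                seen.add(j)
                stack.append(j)
    return len(seen) == n


def generate(heavy_atoms, n_hydrogens):
    """Enumerate non-isomorphic connected structures for the given heavy
    atoms and hydrogen count, adding one unit of bond order per step."""
    atoms = sorted(heavy_atoms)
    n = len(atoms)
    free = sum(VALENCE[a] for a in atoms) - n_hydrogens
    if free < 0 or free % 2:
        return []  # hydrogens cannot saturate the remaining valences
    n_bond_units = free // 2
    level = {tuple(tuple(0 for _ in range(n)) for _ in range(n))}
    for _ in range(n_bond_units):
        next_level = set()
        for graph in level:
            used = [sum(row) for row in graph]
            for i, j in combinations(range(n), 2):
                if (graph[i][j] < MAX_BOND_ORDER
                        and used[i] < VALENCE[atoms[i]]
                        and used[j] < VALENCE[atoms[j]]):
                    grown = [list(row) for row in graph]
                    grown[i][j] += 1
                    grown[j][i] += 1
                    # The set keeps a single representative per isomorphism class.
                    next_level.add(canonical(grown, atoms))
        level = next_level
    # Disconnected results are mixtures of molecules, not a single structure.
    return sorted(g for g in level if is_connected(g))


if __name__ == "__main__":
    for matrix in generate(["C", "C", "O"], 6):  # C2H6O
        print(matrix)
```

For C2H6O the sketch yields two bond matrices, corresponding to ethanol and dimethyl ether. Deduplicating at every level keeps the search from expanding isomorphic intermediates more than once, which is the role the canonical augmentation check plays in the abstract's description. The brute-force relabeling used here scales factorially with the number of atoms per element, so it is suitable only as an illustration.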