Zhu Jianshen, Azam Naveed Ahmed, Zhang Fan, Shurbevski Aleksandar, Haraguchi Kazuya, Zhao Liang, Nagamochi Hiroshi, Akutsu Tatsuya
IEEE/ACM Trans Comput Biol Bioinform. 2022 Nov-Dec;19(6):3233-3245. doi: 10.1109/TCBB.2021.3112598. Epub 2022 Dec 8.
Drug discovery is one of the major goals of computational biology and bioinformatics. A novel framework has recently been proposed for designing chemical graphs using both artificial neural networks (ANNs) and mixed integer linear programming (MILP). The method consists of a prediction phase and an inverse prediction phase. In the first phase, an ANN is trained on data from existing chemical compounds. In the second phase, given a target value of a chemical property, a feature vector is inferred by solving an MILP formulated from the trained ANN, and a set of chemical structures is then enumerated by a graph enumeration algorithm. Although the framework guarantees exact solutions, the chemical graphs it can handle have been restricted to classes such as trees, monocyclic graphs, and graphs with a specified polymer topology with cycle index up to 2. To overcome this limitation on topological structure, we propose a new, flexible modeling method for the framework that allows us to specify a topological substructure of the target graph together with a partial assignment of chemical elements and bond multiplicities. The results of computational experiments suggest that the proposed system can infer chemical graphs with up to about 50 non-hydrogen atoms.
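The inverse prediction phase relies on the fact that a trained ANN with ReLU activations can be written as linear constraints with binary variables. The Python sketch below illustrates that idea under stated assumptions: the weights, the integer feature domain, and the names `forward`, `relu_big_m_feasible`, and `infer_features` are all hypothetical, the big-M encoding is a standard textbook one that may differ from the formulation used in the paper, and brute-force enumeration stands in for an actual MILP solver.

```python
from itertools import product

# Hypothetical tiny "trained" network (illustrative values only):
# 2 integer features -> 2 ReLU hidden units -> 1 linear output.
W1 = [[1.0, -1.0], [-1.0, 2.0]]
B1 = [0.0, -1.0]
W2 = [1.0, 1.0]
B2 = 0.0
BIG_M = 20.0  # assumed bound on |pre-activation| over the feature domain


def forward(x):
    """Prediction phase: evaluate the network on feature vector x."""
    hidden = [
        max(0.0, sum(w * xi for w, xi in zip(row, x)) + b)
        for row, b in zip(W1, B1)
    ]
    return sum(w * h for w, h in zip(W2, hidden)) + B2


def relu_big_m_feasible(a, y, z, big_m=BIG_M, tol=1e-9):
    """Check the linear big-M constraints encoding y = max(0, a).

    z is a binary variable: z = 1 forces y = a (active unit),
    z = 0 forces y = 0 (inactive unit).
    """
    return (
        y >= -tol
        and y >= a - tol
        and y <= a + big_m * (1 - z) + tol
        and y <= big_m * z + tol
    )


def infer_features(target, domain=range(0, 5), tol=1e-9):
    """Inverse phase stand-in: enumerate integer feature vectors whose
    prediction equals the target (a MILP solver would search this set
    without enumerating it)."""
    return [
        x for x in product(domain, repeat=2)
        if abs(forward(x) - target) <= tol
    ]


if __name__ == "__main__":
    for x in infer_features(3.0):
        print(x, forward(x))
```

In a real pipeline the constraints checked by `relu_big_m_feasible` would be handed to a MILP solver together with constraints that make the feature vector realizable as a chemical graph; the enumeration here only shows what set of feature vectors such a solver is searching.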