Liu Jingyi, Li Weijun, Yu Lina, Wu Min, Li Wenqiang, Li Yanjie, Hao Meilan
AnnLab, Institute of Semiconductors, Chinese Academy of Sciences, Beijing, 100083, China; School of Integrated Circuits & Center of Materials Science and Optoelectronics Engineering, University of Chinese Academy of Sciences, 100049, Beijing, China.
AnnLab, Institute of Semiconductors, Chinese Academy of Sciences, Beijing, 100083, China.
Neural Netw. 2025 Jul;187:107405. doi: 10.1016/j.neunet.2025.107405. Epub 2025 Mar 21.
Symbolic Regression (SR) methods that represent expressions as trees have achieved strong results in both Genetic Programming (GP) and deep learning search paradigms. However, the tree representation of a mathematical expression can contain redundant substructures. Representing expressions as computation graphs is more succinct and intuitive. Although graph representations have been adopted in evolutionary strategies for SR, they remain under-explored in deep learning paradigms. Given the substantial progress of deep learning in tree-based SR, we propose addressing SR tasks with a Directed Acyclic Graph (DAG) representation of mathematical expressions, combined with a generative graph neural network. We name the proposed method Graph-based Deep Symbolic Regression (GraphDSR). We vectorize node types and use an adjacency matrix to describe connections. The graph neural network builds the DAG incrementally, sampling node types and graph connections at every step, conditioned on the DAG generated so far. At each sampling step, a validity check prevents meaningless samples, and four domain-agnostic constraints further narrow the search. The process terminates once a coherent expression emerges. Constants are optimized with the SGD and BFGS algorithms, and rewards are used to train the graph neural network through reinforcement learning. A comprehensive evaluation on 110 benchmarks demonstrates the effectiveness of our approach.
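To make the DAG encoding concrete, the following is a minimal sketch of how an expression might be stored as a list of node types plus an adjacency matrix, then checked and evaluated. It is an illustration only, not the GraphDSR implementation: the operator vocabulary, the function names (arity, is_valid, evaluate), and the convention that nodes are listed in topological order are all assumptions made here. The example encodes sin(x + y) * (x + y), where the shared subexpression x + y appears once in the DAG (5 nodes) but would appear twice in a tree (8 nodes).

```python
import math

# Hypothetical operator vocabulary, chosen for illustration only.
# Only commutative binary operators are used so that a 0/1 adjacency
# matrix needs no extra argument-order information.
UNARY = {"sin": math.sin, "cos": math.cos}
BINARY = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def arity(node_type):
    """Number of incoming edges a node of this type requires."""
    if node_type in UNARY:
        return 1
    if node_type in BINARY:
        return 2
    return 0  # anything else is treated as an input variable


def is_valid(node_types, adjacency):
    """Assumed validity check: edges only run from lower to higher index
    (which guarantees acyclicity) and every node has the right in-degree."""
    n = len(node_types)
    for j in range(n):
        # An edge i -> j with i >= j is a self-loop or a backward edge.
        for i in range(j, n):
            if adjacency[i][j]:
                return False
        in_degree = sum(adjacency[i][j] for i in range(j))
        if in_degree != arity(node_types[j]):
            return False
    return True


def evaluate(node_types, adjacency, inputs):
    """Evaluate the DAG node by node; the last node is taken as the output."""
    values = []
    for j, node_type in enumerate(node_types):
        # adjacency[i][j] == 1 means node i feeds into node j.
        args = [values[i] for i in range(j) if adjacency[i][j]]
        if node_type in UNARY:
            values.append(UNARY[node_type](*args))
        elif node_type in BINARY:
            values.append(BINARY[node_type](*args))
        else:
            values.append(inputs[node_type])
    return values[-1]


# sin(x + y) * (x + y): node 2 ("add") is reused by nodes 3 and 4.
node_types = ["x", "y", "add", "sin", "mul"]
adjacency = [
    [0, 0, 1, 0, 0],  # x   -> add
    [0, 0, 1, 0, 0],  # y   -> add
    [0, 0, 0, 1, 1],  # add -> sin, add -> mul
    [0, 0, 0, 0, 1],  # sin -> mul
    [0, 0, 0, 0, 0],  # mul (output)
]
result = evaluate(node_types, adjacency, {"x": 1.0, "y": 2.0})
```

Restricting edges to the upper triangle of the matrix is one simple way to keep an incrementally built graph acyclic; whether GraphDSR uses this exact convention is not stated in the abstract.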