Modified particle swarm optimization algorithm for adaptively configuring globally optimal classification and regression trees.

Authors

Zhou Yan-Ping, Tang Li-Juan, Jiao Jian, Song Dan-Dan, Jiang Jian-Hui, Yu Ru-Qin

Affiliation

Key Laboratory of Pesticide and Chemical Biology of Ministry of Education, College of Chemistry, Central China Normal University, Wuhan 430079, PR China.

Publication

J Chem Inf Model. 2009 May;49(5):1144-53. doi: 10.1021/ci800374h.

Abstract

The configuration of classification and regression trees (CART) conventionally involves tree growing by greedy recursive partitioning, which selects the splitting parameters (i.e., splitting variables and values) used in the tree, and tree pruning, which aims to obtain a final tree of the right size. This method is successful for most applications; however, it has some well-known limitations, such as reduced comprehensibility, a tendency to overfit, and suboptimal solutions. In the present study, a modified discrete particle swarm optimization (MPSO) method was used to adaptively configure the globally optimal CART (MPSOCART) by simultaneously selecting the optimal splitting parameters and the appropriate structure of the CART. A new objective function was formulated to determine the appropriate CART architecture and the optimum splitting parameters. The proposed MPSOCART was applied to predict the bioactivities of flavonoid derivatives and the inhibitory activities of inhibitors of epidermal growth factor receptor tyrosine kinase, and was compared with partial least squares and with CART induced by greedy recursive partitioning. The comparison revealed that MPSO is a useful tool for inducing a globally optimal CART: it converges quickly to the optimal solution and avoids overfitting to a great extent.
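For readers unfamiliar with the baseline, the greedy step that the abstract contrasts against can be sketched as follows. This is a minimal illustration of choosing one splitting variable and value for a regression tree by minimizing the summed squared error of the two child nodes. It is not the MPSOCART method or its objective function, neither of which is specified in the abstract. The names `sse` and `best_split` and the toy data are assumptions made for this example.

```python
from typing import List, Optional, Tuple


def sse(values: List[float]) -> float:
    """Sum of squared errors around the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)


def best_split(
    X: List[List[float]], y: List[float]
) -> Optional[Tuple[int, float, float]]:
    """Return (variable index, split value, total child SSE) of the best
    single split, or None if no variable has two distinct values."""
    best = None
    n_features = len(X[0])
    for j in range(n_features):
        levels = sorted({row[j] for row in X})
        # Candidate split values: midpoints between consecutive distinct levels.
        for lo, hi in zip(levels, levels[1:]):
            threshold = (lo + hi) / 2.0
            left = [t for row, t in zip(X, y) if row[j] <= threshold]
            right = [t for row, t in zip(X, y) if row[j] > threshold]
            cost = sse(left) + sse(right)
            if best is None or cost < best[2]:
                best = (j, threshold, cost)
    return best


# Hypothetical toy data: variable 0 separates the responses perfectly.
X = [[1.0, 5.0], [2.0, 3.0], [3.0, 8.0], [4.0, 1.0]]
y = [1.0, 1.0, 5.0, 5.0]
split = best_split(X, y)
print(split)
```

Greedy recursive partitioning applies this step to each child node in turn. Because every split is chosen only for its immediate gain, the resulting tree need not be globally optimal, which is the limitation the study sets out to address by searching over splitting parameters and tree structure together.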
