A novel adaptive ensemble classification framework for ADME prediction.

Author information

Yang Ming, Chen Jialei, Xu Liwen, Shi Xiufeng, Zhou Xin, Xi Zhijun, An Rui, Wang Xinhong

Affiliations

Department of Pharmacy, Longhua Hospital Affiliated to Shanghai University of TCM, Shanghai, People's Republic of China.

Department of Chemistry, College of Pharmacy, Shanghai University of Traditional Chinese Medicine, Shanghai, People's Republic of China.

Publication information

RSC Adv. 2018 Mar 26;8(21):11661-11683. doi: 10.1039/c8ra01206g. eCollection 2018 Mar 21.

Abstract

Prediction of ADME (absorption, distribution, metabolism, and elimination) characteristics is now recognized as an important component of the drug discovery process, and there has been considerable interest in developing ADME prediction models in recent years. Despite advances in this field, challenges remain when class imbalance and high dimensionality must be handled simultaneously. In this work, we introduce a novel adaptive ensemble classification framework, named AECF, to deal with these issues. AECF has four components: (1) data balancing, (2) generating individual models, (3) combining individual models, and (4) optimizing the ensemble. We considered five sampling methods, seven base modeling techniques, and ten ensemble rules to build a choice pool. The proper route for constructing predictive models is determined automatically according to the imbalance ratio (IR). Because of this adaptive characteristic, AECF can be applied to different kinds of ADME data, with balanced data treated as a special case. We evaluated our approach using five extensive ADME datasets concerning Caco-2 cell permeability (CacoP), human intestinal absorption (HIA), oral bioavailability (OB), and P-glycoprotein (P-gp) binders (substrates and inhibitors, PS and PI). The performance of AECF was evaluated on two independent datasets, and the average AUC values were 0.8574-0.8602, 0.8968-0.9182, 0.7821-0.7981, 0.8139-0.8311, and 0.8874-0.8898 for CacoP, HIA, OB, PS and PI, respectively. Our results show that AECF can provide better performance and generality than individual models and two representative ensemble methods, bagging and boosting. Furthermore, we investigated the degree of complementarity among the AECF ensemble members to elucidate the potential advantages of the framework. We found that AECF can effectively select complementary members to construct predictive models through its auto-adaptive optimization approach, and that the additional diversity in both sample and feature space mainly contributes to the complementarity of ensemble members.
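To make the workflow in the abstract more concrete, the following is a minimal Python sketch of an IR-driven pipeline. It is not the authors' implementation. The abstract does not name the five sampling methods, seven base techniques, or ten ensemble rules, so random undersampling and a mean-probability rule are used here purely as stand-ins. The IR definition (majority count divided by minority count), the `threshold` value, and all function names are assumptions made for illustration.

```python
import random
from collections import Counter


def imbalance_ratio(labels):
    """IR as majority-class count over minority-class count (assumed definition)."""
    counts = Counter(labels)
    return max(counts.values()) / min(counts.values())


def choose_route(ir, threshold=1.5):
    """Hypothetical routing rule: rebalance only when IR exceeds the threshold."""
    return "resample" if ir > threshold else "as_is"


def random_undersample(X, y, rng):
    """Stand-in balancing step: keep a random subset of each class,
    sized to match the minority class."""
    counts = Counter(y)
    n_min = min(counts.values())
    keep = []
    for label in counts:
        idx = [i for i, v in enumerate(y) if v == label]
        keep.extend(rng.sample(idx, n_min))
    keep.sort()
    return [X[i] for i in keep], [y[i] for i in keep]


def mean_rule(member_probs):
    """Stand-in ensemble rule: average the members' predicted
    probabilities for each sample."""
    return [sum(p) / len(p) for p in zip(*member_probs)]


def auc(y_true, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly,
    with ties counted as half."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

In this sketch, a dataset whose IR is at or below the threshold skips the balancing step, which mirrors the abstract's remark that balanced data is a special case. Each base model would be trained on its own resampled subset, their probabilities combined with `mean_rule`, and the result scored with `auc`.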

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/4f63/9079056/6b06630d6da3/c8ra01206g-f1.jpg
