Yamashita Fumiyoshi, Hara Hideto, Ito Takayuki, Hashida Mitsuru
Department of Drug Delivery Research, Graduate School of Pharmaceutical Sciences, Kyoto University, 46-29 Yoshidashimoadachi-cho, Sakyo-ku, Kyoto 606-8501, Japan.
J Chem Inf Model. 2008 Feb;48(2):364-9. doi: 10.1021/ci700262y. Epub 2008 Jan 23.
In lead optimization, medicinal chemists must weigh many chemical properties of active compounds, including ADME/Tox properties, and find the best compromise among them. This study presents a novel data mining method for multiobjective optimization of chemical properties that combines hierarchical classification with visualization of multidimensional data. A hierarchical classification tree is generated by an extension of recursive partitioning that uses the information gain averaged over multiple objective variables as its quality-of-split criterion. All of the hierarchically structured data objects are then displayed with a large-scale data visualization technique, an extension of HeiankyoView, which draws data objects as colored icons and group nodes as rectangular borders. Each icon is divided into subregions of different colors, so that multidimensional data can be read from the brightness of the colors. The proposed method was applied to structure-activity relationship analysis of cytochrome P450 (CYP) substrates. The substrate specificity of six CYP isoforms was successfully delineated: for example, CYP2C9 substrates are anionic compounds whereas CYP2D6 substrates are cationic, and CYP2E1 substrates are smaller compounds whereas CYP3A4 substrates are larger.
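The quality-of-split criterion named in the abstract, information gain averaged over several objective variables, can be illustrated with a minimal Python sketch. This is not the authors' implementation. It assumes binary threshold splits on numeric descriptors, Shannon entropy as the impurity measure, and an unweighted mean over objectives. The function names (`entropy`, `averaged_gain`, `best_split`, `build_tree`), the descriptor names `charge` and `weight`, and the four-row toy data set are all hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def averaged_gain(rows, targets, feature, threshold):
    """Information gain of one binary split, averaged over all objectives.

    rows    : list of dicts mapping descriptor name -> numeric value
    targets : list of tuples, one class label per objective variable
    """
    left = [i for i, r in enumerate(rows) if r[feature] <= threshold]
    right = [i for i, r in enumerate(rows) if r[feature] > threshold]
    if not left or not right:
        return 0.0
    n, n_obj = len(rows), len(targets[0])
    total = 0.0
    for k in range(n_obj):
        parent = entropy([t[k] for t in targets])
        child_l = entropy([targets[i][k] for i in left])
        child_r = entropy([targets[i][k] for i in right])
        total += parent - len(left) / n * child_l - len(right) / n * child_r
    return total / n_obj  # unweighted mean over objectives (an assumption)


def best_split(rows, targets):
    """Return (gain, feature, threshold) for the split with the highest averaged gain."""
    best = (0.0, None, None)
    for feature in rows[0]:
        values = sorted({r[feature] for r in rows})
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2  # midpoint between adjacent distinct values
            gain = averaged_gain(rows, targets, feature, threshold)
            if gain > best[0]:
                best = (gain, feature, threshold)
    return best


def build_tree(rows, targets, depth=0, max_depth=3):
    """Recursively partition the data until no split improves the averaged gain."""
    gain, feature, threshold = best_split(rows, targets)
    if depth >= max_depth or feature is None:
        return {"leaf": True, "n": len(rows)}
    left = [i for i, r in enumerate(rows) if r[feature] <= threshold]
    right = [i for i, r in enumerate(rows) if r[feature] > threshold]
    return {
        "leaf": False, "feature": feature, "threshold": threshold, "gain": gain,
        "left": build_tree([rows[i] for i in left], [targets[i] for i in left],
                           depth + 1, max_depth),
        "right": build_tree([rows[i] for i in right], [targets[i] for i in right],
                            depth + 1, max_depth),
    }


# Synthetic toy data (not taken from the paper): two descriptors, two objectives.
rows = [
    {"charge": -1, "weight": 200},
    {"charge": -1, "weight": 400},
    {"charge": 1, "weight": 200},
    {"charge": 1, "weight": 400},
]
targets = [(1, 0), (1, 0), (0, 1), (0, 1)]  # both objectives depend on charge only

if __name__ == "__main__":
    print(best_split(rows, targets))
    print(build_tree(rows, targets))
```

In this toy case the split on `charge` separates both objectives perfectly while the split on `weight` separates neither, so the averaged criterion selects `charge`. With real data the objectives usually disagree, and averaging favors splits that are informative for several objectives at once.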