Du Qi-Shi, Huang Ri-Bo, Chou Kuo-Chen
Guangxi University, Key Laboratory of Subtropical Bioresource Conservation and Utilization of Guangxi, Nanning, Guangxi, 530004, China.
Curr Protein Pept Sci. 2008 Jun;9(3):248-60. doi: 10.2174/138920308784534005.
This review summarizes three new QSAR (quantitative structure-activity relationship) methods recently developed in our group and their applications to drug design. Building on more solid theoretical models and advanced mathematical techniques, the conventional QSAR technique has been recast in three respects. (1) In fragment-based two-dimensional QSAR (FB-QSAR), the molecular structures in a family of drug candidates are divided into several fragments according to the substituents being investigated. The bioactivities of the drug candidates are correlated with the physicochemical properties of the molecular fragments through two sets of coefficients: one for the physicochemical properties and the other for the molecular fragments. (2) In multiple-field three-dimensional QSAR (MF-3D-QSAR), additional molecular potential fields are integrated into comparative molecular field analysis (CoMFA) through two sets of coefficients: one for the potential fields and the other for the Cartesian three-dimensional grid points. (3) In AABPP (amino acid-based peptide prediction), the bioactivities of peptides or proteins are correlated with the physicochemical properties of all or some of the residues in the sequence through two sets of coefficients: one for the physicochemical properties of the amino acids and the other for the weight factors of the residues. In addition, an iterative double least square (IDLS) technique has been developed that solves for the two sets of coefficients alternately and iteratively on a training dataset. With the two sets of coefficients in hand, one can predict the bioactivity of a query peptide, protein, or drug candidate. Compared with the older methods, the new QSAR approaches summarized in this review have machine-learning capability, can markedly improve predictive power, and provide more structural information. Future challenges and possible developments in this area are also briefly addressed.
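The abstract does not spell out the IDLS equations, so the following is only a minimal sketch of the general idea of solving two coefficient sets alternately by least squares. It assumes a bilinear model in which the predicted activity of sample n is the sum over i and j of a[i] * X[n, i, j] * b[j], where i might index fragments (or residues, or grid points) and j might index physicochemical properties (or potential fields). The function names `fit_idls_sketch` and `predict`, the array layout of `X`, and the unit-norm scaling of `b` are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def predict(X, a, b):
    """Bilinear prediction: y[n] = sum_ij a[i] * X[n, i, j] * b[j]."""
    return np.einsum("i,nij,j->n", a, X, b)


def fit_idls_sketch(X, y, n_iter=100, tol=1e-12):
    """Alternating least squares for two coefficient sets (hypothetical sketch).

    X : array of shape (n_samples, n_units, n_properties)
    y : array of shape (n_samples,) with measured activities
    Returns (a, b, history), where history holds the sum of squared
    errors after each full sweep.
    """
    n_samples, n_units, n_props = X.shape
    a = np.zeros(n_units)
    b = np.ones(n_props) / np.sqrt(n_props)  # arbitrary starting guess
    history = []
    for _ in range(n_iter):
        # Step 1: hold b fixed, solve an ordinary least-squares problem for a.
        Z = X @ b  # shape (n_samples, n_units)
        a, *_ = np.linalg.lstsq(Z, y, rcond=None)
        # Step 2: hold a fixed, solve an ordinary least-squares problem for b.
        W = np.einsum("nij,i->nj", X, a)  # shape (n_samples, n_properties)
        b, *_ = np.linalg.lstsq(W, y, rcond=None)
        # The product a[i] * b[j] is scale-ambiguous, so fix the norm of b.
        scale = np.linalg.norm(b)
        if scale > 0:
            a, b = a * scale, b / scale
        sse = float(np.sum((predict(X, a, b) - y) ** 2))
        if history and abs(history[-1] - sse) < tol:
            history.append(sse)
            break
        history.append(sse)
    return a, b, history
```

Because each step minimizes the squared error over one coefficient set while the other is held fixed, the training error cannot increase from one sweep to the next. The procedure can still stop at a local minimum, so in practice it may be worth trying several starting guesses for `b`. A query compound or peptide would then be scored with `predict` using the fitted `a` and `b`.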