Center for Bioinformatics, University of Kansas, Lawrence, KS 66047, USA.
IEEE Trans Nanobioscience. 2013 Sep;12(3):206-13. doi: 10.1109/TNB.2013.2263511. Epub 2013 May 16.
Drug-induced QT prolongation is a major, life-threatening adverse drug effect. It is crucial to predict QT prolongation as early as possible in drug development; however, data on drugs that induce QT prolongation are very limited and noisy. Multi-view learning (MVL) has been applied to many challenging machine learning and data mining problems, especially those involving complex data from diverse domains with only a limited number of labeled examples. Existing MVL methods use l2-norm co-regularization to obtain a smooth objective function. In this paper, we instead propose an l1-norm co-regularized MVL algorithm for predicting drug-induced QT prolongation, and we reformulate the l1-norm co-regularized objective function so that its gradient can be derived in analytic form. This allows us to optimize the mapping functions on all views simultaneously and achieve 3-4 times higher computational efficiency, whereas previous l2-norm co-regularized MVL methods rely on alternating optimization, which optimizes one view at a time with the other views held fixed until convergence. l1-norm co-regularization enforces sparsity in the learned mapping functions, and hence the results are expected to be more interpretable. Comprehensive experimental comparisons of our proposed method with previous MVL and single-view learning methods demonstrate that our method significantly outperforms these baselines while being more efficient.
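To make the idea of simultaneous optimization concrete, the following is a minimal two-view sketch, not the paper's algorithm. The abstract does not give the reformulation, so several things here are assumptions: the mapping functions are linear, the l1 co-regularizer is placed on the disagreement between the two views' predictions, and the absolute value is replaced by the smooth surrogate sqrt(r^2 + eps) so that an analytic gradient exists. The names `smooth_abs`, `objective_and_grads`, `fit`, and the parameters `lam`, `mu`, `eps`, and `lr` are all hypothetical.

```python
import numpy as np


def smooth_abs(r, eps=1e-3):
    # Assumed smooth surrogate for |r|; differentiable everywhere.
    return np.sqrt(r * r + eps)


def objective_and_grads(w1, w2, X1, X2, y, n_lab, lam=1e-2, mu=1e-1, eps=1e-3):
    """Return the loss and its analytic gradients for both views.

    X1, X2 : feature matrices of the same n samples in two views.
    y      : labels of the first n_lab samples (the rest are unlabeled).
    """
    n = X1.shape[0]
    f1, f2 = X1 @ w1, X2 @ w2            # per-view predictions
    e1, e2 = f1[:n_lab] - y, f2[:n_lab] - y
    r = f1 - f2                          # disagreement between views
    s = smooth_abs(r, eps)

    loss = ((e1 @ e1 + e2 @ e2) / n_lab          # squared loss on labeled data
            + lam * (w1 @ w1 + w2 @ w2)          # ridge penalty on weights
            + mu * s.sum() / n)                  # smoothed l1 co-regularizer

    d = mu * (r / s) / n                 # derivative of co-regularizer w.r.t. r
    g1 = 2 * X1[:n_lab].T @ e1 / n_lab + 2 * lam * w1 + X1.T @ d
    g2 = 2 * X2[:n_lab].T @ e2 / n_lab + 2 * lam * w2 - X2.T @ d
    return loss, g1, g2


def fit(X1, X2, y, n_lab, steps=1000, lr=0.01, **kwargs):
    # Plain gradient descent that updates both views in the same step,
    # in contrast to alternating between views.
    w1 = np.zeros(X1.shape[1])
    w2 = np.zeros(X2.shape[1])
    history = []
    for _ in range(steps):
        loss, g1, g2 = objective_and_grads(w1, w2, X1, X2, y, n_lab, **kwargs)
        history.append(loss)
        w1 -= lr * g1
        w2 -= lr * g2
    return w1, w2, history
```

Because both gradients come from one evaluation of the joint objective, each iteration moves every view at once. An alternating scheme would instead run an inner optimization for `w1` with `w2` fixed, then swap. The fixed step size and the value of `eps` are illustrative choices only; a smaller `eps` tracks the l1 norm more closely but makes the objective more sharply curved near zero disagreement.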