Igel Christian, Glasmachers Tobias, Mersch Britta, Pfeifer Nico, Meinicke Peter
Institut für Neuroinformatik, Ruhr-Universität Bochum, Bochum, Germany.
IEEE/ACM Trans Comput Biol Bioinform. 2007 Apr-Jun;4(2):216-26. doi: 10.1109/TCBB.2007.070208.
Biological data mining using kernel methods can be improved by a task-specific choice of the kernel function. Oligo kernels for genomic sequence analysis have proven to have a high discriminative power and to provide interpretable results. Oligo kernels that consider subsequences of different lengths can be combined and parameterized to increase their flexibility. For adapting these parameters efficiently, gradient-based optimization of the kernel-target alignment is proposed. The power of this new, general model selection procedure and the benefits of fitting kernels to problem classes are demonstrated by adapting oligo kernels for bacterial gene start detection.
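The model selection idea in the abstract can be illustrated with a minimal sketch. It assumes the standard definition of kernel-target alignment, A(K, y) = <K, yy^T>_F / (||K||_F ||yy^T||_F), and a simplified oligo-style kernel in which every pair of positions carrying the same k-mer contributes a Gaussian of width sigma in the position difference. The function names, the toy sequences, and the step-halving gradient ascent are illustrative assumptions, not the kernel parameterization or the optimizer used in the paper.

```python
import math

def oligo_kernel(s, t, k, sigma):
    """Simplified oligo-style kernel (assumed form): each pair of positions
    (p, q) where s and t carry the same k-mer adds a Gaussian in p - q."""
    total = 0.0
    for p in range(len(s) - k + 1):
        for q in range(len(t) - k + 1):
            if s[p:p + k] == t[q:q + k]:
                total += math.exp(-((p - q) ** 2) / (4.0 * sigma ** 2))
    return total

def oligo_kernel_dsigma(s, t, k, sigma):
    """Derivative of oligo_kernel with respect to sigma."""
    total = 0.0
    for p in range(len(s) - k + 1):
        for q in range(len(t) - k + 1):
            if s[p:p + k] == t[q:q + k]:
                d2 = (p - q) ** 2
                total += math.exp(-d2 / (4.0 * sigma ** 2)) * d2 / (2.0 * sigma ** 3)
    return total

def gram(seqs, fn, k, sigma):
    """Matrix of fn evaluated on all pairs of sequences."""
    return [[fn(a, b, k, sigma) for b in seqs] for a in seqs]

def frob(A, B):
    """Frobenius inner product of two matrices given as nested lists."""
    return sum(a * b for ra, rb in zip(A, B) for a, b in zip(ra, rb))

def alignment(K, y):
    """Kernel-target alignment between K and the label matrix y y^T."""
    Y = [[yi * yj for yj in y] for yi in y]
    return frob(K, Y) / math.sqrt(frob(K, K) * frob(Y, Y))

def alignment_grad(K, dK, y):
    """Derivative of the alignment, given dK = dK/dtheta for one parameter."""
    Y = [[yi * yj for yj in y] for yi in y]
    ky, kk, yy = frob(K, Y), frob(K, K), frob(Y, Y)
    return (frob(dK, Y) - ky * frob(K, dK) / kk) / math.sqrt(kk * yy)

def adapt_sigma(seqs, y, k, sigma, steps=50, step_size=1.0):
    """Gradient ascent on the alignment; a step is kept only if it improves
    the alignment, otherwise the step size is halved."""
    best = alignment(gram(seqs, oligo_kernel, k, sigma), y)
    for _ in range(steps):
        K = gram(seqs, oligo_kernel, k, sigma)
        dK = gram(seqs, oligo_kernel_dsigma, k, sigma)
        candidate = sigma + step_size * alignment_grad(K, dK, y)
        if candidate > 0:
            a = alignment(gram(seqs, oligo_kernel, k, candidate), y)
            if a > best:
                sigma, best = candidate, a
                continue
        step_size *= 0.5  # overshoot or no gain: retry with a smaller step
    return sigma, best

# Hypothetical toy data: "ATG" at a fixed position (label +1) or elsewhere (-1).
seqs = ["CCATGAAC", "GCATGTTC", "ACATGCGA", "ATGCCGAC", "TTCGATGA", "CGTTAATG"]
y = [1, 1, 1, -1, -1, -1]
sigma, a = adapt_sigma(seqs, y, k=3, sigma=1.5)
print("adapted sigma:", sigma, "alignment:", a)
```

Because the alignment needs only the Gram matrix and its derivative, the same gradient formula applies to any differentiable kernel parameter, for example one width per oligomer length in a combined kernel.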