Baldi P, Chauvin Y
Division of Biology, California Institute of Technology, Pasadena 91125, USA.
Neural Comput. 1996 Oct 1;8(7):1541-65. doi: 10.1162/neco.1996.8.7.1541.
We describe a hybrid modeling approach where the parameters of a model are calculated and modulated by another model, typically a neural network (NN), to avoid both overfitting and underfitting. We develop the approach for the case of Hidden Markov Models (HMMs), by deriving a class of hybrid HMM/NN architectures. These architectures can be trained with unified algorithms that blend HMM dynamic programming with NN backpropagation. In the case of complex data, mixtures of HMMs or modulated HMMs must be used. NNs can then be applied both to the parameters of each single HMM and to the switching or modulation of the models, as a function of input or context. Hybrid HMM/NN architectures provide a flexible NN parameterization for the control of model structure and complexity. At the same time, they can capture distributions that, in practice, are inaccessible to single HMMs. The HMM/NN hybrid approach is tested, in its simplest form, by constructing a model of the immunoglobulin protein family. A hybrid model is trained, and a multiple alignment derived, with less than a fourth of the number of parameters used with previous single HMMs.
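The central idea of the abstract, an NN that computes the parameters of an HMM, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the names `EmissionNet`, `left_to_right`, and `forward_likelihood`, the one-hot state input, the single tanh hidden layer shared by all states, the left-to-right topology, and the layer sizes. Training is omitted; the sketch only shows how NN-generated emission distributions plug into the standard forward (dynamic programming) recursion and how sharing a hidden layer can reduce the parameter count.

```python
import math
import random


def softmax(z):
    """Numerically stable softmax over a list of floats."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


class EmissionNet:
    """Hypothetical NN: one-hot state index -> tanh hidden layer -> softmax.

    The hidden-to-output weights are shared by all HMM states, so the
    emission distributions are not stored as one free table per state.
    """

    def __init__(self, n_states, n_hidden, n_symbols, seed=0):
        rng = random.Random(seed)
        self.w1 = [[rng.gauss(0, 0.5) for _ in range(n_hidden)]
                   for _ in range(n_states)]
        self.w2 = [[rng.gauss(0, 0.5) for _ in range(n_symbols)]
                   for _ in range(n_hidden)]

    def emissions(self, state):
        # A one-hot input simply selects row `state` of w1.
        hidden = [math.tanh(v) for v in self.w1[state]]
        n_symbols = len(self.w2[0])
        logits = [sum(h * self.w2[j][k] for j, h in enumerate(hidden))
                  for k in range(n_symbols)]
        return softmax(logits)

    def n_params(self):
        return len(self.w1) * len(self.w1[0]) + len(self.w2) * len(self.w2[0])


def left_to_right(n_states, p_stay=0.3):
    """Assumed toy topology: each state either repeats or advances by one."""
    trans = [[0.0] * n_states for _ in range(n_states)]
    for i in range(n_states - 1):
        trans[i][i] = p_stay
        trans[i][i + 1] = 1.0 - p_stay
    trans[-1][-1] = 1.0
    return trans


def forward_likelihood(seq, trans, start, emit):
    """Forward algorithm: P(seq) for a sequence of symbol indices."""
    n = len(start)
    alpha = [start[i] * emit[i][seq[0]] for i in range(n)]
    for x in seq[1:]:
        alpha = [sum(alpha[i] * trans[i][j] for i in range(n)) * emit[j][x]
                 for j in range(n)]
    return sum(alpha)


if __name__ == "__main__":
    n_states, n_hidden, n_symbols = 100, 4, 20  # illustrative sizes only
    net = EmissionNet(n_states, n_hidden, n_symbols)
    emit = [net.emissions(i) for i in range(n_states)]
    trans = left_to_right(n_states)
    start = [1.0] + [0.0] * (n_states - 1)
    print("P(seq) =", forward_likelihood([0, 3, 7, 7, 12], trans, start, emit))
    print("NN emission parameters:", net.n_params())
    print("Direct table parameters:", n_states * n_symbols)
```

With these illustrative sizes the network uses 100 × 4 + 4 × 20 = 480 weights, against 100 × 20 = 2000 entries for a direct emission table. The figures only show the mechanism by which a shared parameterization can shrink a model; they are not the parameter counts reported for the immunoglobulin model.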