Arribas-Gil Ana
Universidad Carlos III de Madrid.
Stat Appl Genet Mol Biol. 2010;9:Article 10. doi: 10.2202/1544-6115.1510. Epub 2010 Jan 26.
In this work we address parameter estimation in a latent variable model, the multiple-hidden i.i.d. model, which is derived from multiple alignment algorithms. We first provide a rigorous formalism for the homology structure of k sequences related by a star-shaped phylogenetic tree, in the context of multiple alignment based on indel evolution models. We discuss possible definitions of the likelihood and compare them to the criterion used in multiple alignment algorithms. We establish the existence of two different information divergence rates and show a divergence property under additional assumptions. This would yield consistency for the parameter in parametrization schemes for which the divergence property holds. Finally, we extend the definition of the multiple-hidden i.i.d. model, and the results obtained, to the case in which the sequences are related by an arbitrary phylogenetic tree. Simulations illustrate different cases that are not covered by our results.