Sgouralis Ioannis, Pressé Steve
Department of Physics, Arizona State University, Tempe, Arizona.
Department of Physics, Arizona State University, Tempe, Arizona; Department of Molecular Sciences, Arizona State University, Tempe, Arizona.
Biophys J. 2017 May 23;112(10):2021-2029. doi: 10.1016/j.bpj.2017.04.027.
The hidden Markov model (HMM) has been a workhorse of single-molecule data analysis and is now commonly used as a stand-alone tool in time series analysis or in conjunction with other analysis methods such as tracking. Here, we provide a conceptual introduction to an important generalization of the HMM that is poised to have a deep impact across the field of biophysics: the infinite HMM (iHMM). As a modeling tool, the iHMM can analyze sequential data without fixing the number of states a priori, as the traditional (finite) HMM requires. Although the current literature on the iHMM is primarily intended for audiences in statistics, the idea is powerful, and the iHMM's breadth of applicability outside machine learning and data science warrants a careful exposition. We explain the key ideas underlying the iHMM, with special emphasis on implementation, and describe code that we are making freely available. In a companion article, we provide an important extension of the iHMM to accommodate complications such as drift.