Meng X L
Department of Statistics, University of Chicago, Illinois, USA.
Stat Methods Med Res. 1997 Mar;6(1):3-23. doi: 10.1177/096228029700600102.
Anderson Gray McKendrick's 1926 paper, 'Applications of mathematics to medical problems', was the earliest reference cited in Dempster et al.'s 1977 paper that defined and popularized the EM algorithm. McKendrick's paper was prominently featured by Joseph Oscar Irwin in his 1962 inaugural address as the President of the Royal Statistical Society (in the UK), entitled 'The place of mathematics in medical and biological statistics'. The link of McKendrick's work to the EM algorithm is due to an improvement made by Irwin on a novel method McKendrick used for estimating an infection rate when the observed data do not distinguish between those individuals who are not susceptible to the infection and those who are susceptible, but do not develop symptoms. This article examines this link, along the way illustrating the central ideas underlying the EM algorithm as well as its properties; the examination also suggests a profiling strategy for speeding up EM, which may be worthy of general investigation. McKendrick's data on an epidemic of cholera are used for illustration and to compare EM with Irwin's method as well as the Newton-Raphson algorithm. Issues beyond computation are also discussed whenever appropriate.
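The estimation problem described above can be sketched as a small EM iteration. This is only an illustration under stated assumptions, not the paper's own code: it assumes the number of cases in a susceptible household is Poisson with rate `lam` and that a fraction `p` of households is susceptible (the abstract does not state the model), the function name `em_zero_inflated_poisson` is made up for this sketch, and the counts are hypothetical rather than McKendrick's cholera data.

```python
import math

def em_zero_inflated_poisson(counts, n_iter=2000, tol=1e-12):
    """EM sketch for an assumed zero-inflated Poisson model.

    counts[k] is the number of households with k observed cases.
    The zero class mixes non-susceptible households with susceptible
    households that happened to show no cases.
    """
    n_total = sum(counts)
    n_pos = n_total - counts[0]                     # households with >= 1 case
    total_cases = sum(k * c for k, c in enumerate(counts))

    # Starting values (arbitrary choices for this sketch).
    p, lam = 0.5, max(total_cases / n_total, 1e-6)

    for _ in range(n_iter):
        # E-step: expected number of zero-case households that are susceptible.
        p_zero_susc = p * math.exp(-lam)
        z0 = counts[0] * p_zero_susc / ((1.0 - p) + p_zero_susc)

        # M-step: complete-data maximum likelihood estimates.
        n_susc = z0 + n_pos
        new_p = n_susc / n_total
        new_lam = total_cases / n_susc

        converged = abs(new_lam - lam) < tol and abs(new_p - p) < tol
        p, lam = new_p, new_lam
        if converged:
            break
    return p, lam

# Hypothetical household counts for 0, 1, 2, 3, 4 cases (NOT McKendrick's data).
hypothetical_counts = [160, 30, 15, 6, 2]
p_hat, lam_hat = em_zero_inflated_poisson(hypothetical_counts)
print(f"susceptible fraction ~ {p_hat:.4f}, infection rate ~ {lam_hat:.4f}")
```

Under this assumed model, the converged rate should satisfy the zero-truncated Poisson likelihood equation, lam / (1 - exp(-lam)) = mean number of cases among households with at least one case, which gives a simple check on the iteration.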