James I R
Biometrics. 1978 Jun;34(2):265-75.
A mixture of two or more normal distributions often provides an adequate model for the distribution of a population consisting of varying proportions of component subpopulations. We consider here the problem of estimating the mixing proportion in a mixture of two normal distributions, the parameters of which can be assumed known. Very large samples may be needed if reasonably precise estimates are to be obtained, thus bringing into consideration the cost or time involved in obtaining large numbers of exact measurements and computing the estimates from them. Simple estimators based on simple, rapidly obtained measurements may then be attractive alternatives provided efficiency losses are not too great. Three such estimators studied here are based on (a) the number of observations less than a fixed point r, (b) the numbers less than s and greater than t, and (c) the sample mean. Optimal choices of the points r, s and t are considered, and the efficiencies of the estimators relative to maximum likelihood estimators (MLE) using the full data are obtained. The simple estimators often perform sufficiently well to make the collection of full data not worthwhile in practice.
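As an illustration of estimators (a) and (c), the following is a minimal Python sketch, not the paper's own procedure. It assumes the mixture is written as p·N(mu1, sigma1) + (1 − p)·N(mu2, sigma2) with both components known, and obtains each estimate by a moment-style inversion: estimator (a) inverts P(X < r) = p·F1(r) + (1 − p)·F2(r), and estimator (c) inverts E[X] = p·mu1 + (1 − p)·mu2. The function names, the component parameters, the sample size, and the cut point r = 1.5 are all hypothetical choices made for the example; in particular, r is not claimed to be the optimal point discussed in the abstract.

```python
import random
from statistics import NormalDist, fmean


def estimate_from_count(sample, r, comp1, comp2):
    """Estimator (a): uses only the number of observations below r.

    Solves frac_below = p*F1(r) + (1 - p)*F2(r) for p, where F1 and F2
    are the known component CDFs. Requires F1(r) != F2(r).
    """
    frac_below = sum(x < r for x in sample) / len(sample)
    f1, f2 = comp1.cdf(r), comp2.cdf(r)
    return (frac_below - f2) / (f1 - f2)


def estimate_from_mean(sample, comp1, comp2):
    """Estimator (c): uses only the sample mean.

    Solves mean = p*mu1 + (1 - p)*mu2 for p. Requires mu1 != mu2.
    """
    return (fmean(sample) - comp2.mean) / (comp1.mean - comp2.mean)


def draw_mixture(n, p, comp1, comp2, rng):
    """Draw n values: each comes from comp1 with probability p, else comp2."""
    return [
        rng.gauss(comp1.mean, comp1.stdev)
        if rng.random() < p
        else rng.gauss(comp2.mean, comp2.stdev)
        for _ in range(n)
    ]


# Hypothetical known components and true mixing proportion (illustration only).
comp1 = NormalDist(mu=0.0, sigma=1.0)
comp2 = NormalDist(mu=3.0, sigma=1.0)
true_p = 0.3

rng = random.Random(0)  # fixed seed so the run is repeatable
sample = draw_mixture(20000, true_p, comp1, comp2, rng)

p_count = estimate_from_count(sample, 1.5, comp1, comp2)
p_mean = estimate_from_mean(sample, comp1, comp2)
print(f"count-based estimate: {p_count:.3f}")
print(f"mean-based estimate:  {p_mean:.3f}")
```

Neither function clips its result, so in small samples an estimate can fall slightly outside the interval [0, 1]. Estimator (b), which uses two cut points s and t, is omitted here because the abstract does not specify how the two counts are combined.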