Gage Timothy B
University at Albany-SUNY, Albany, New York 12222, USA.
Am J Hum Biol. 2002 Nov-Dec;14(6):728-34. doi: 10.1002/ajhb.10089.
Researchers have traditionally employed Gaussian distributions to model quantitative biological traits. Recently, mixtures of Gaussian distributions have begun to be used as well. However, there are many alternatives to the Gaussian distribution. From a theoretical perspective, the lognormal distribution is as applicable as the Gaussian (both are justified on the basis of the Central Limit Theorem). Here, the utility of mixtures of Gaussians and of lognormals for describing birthweight and gestational age distributions is compared. This is carried out within the context of the hybrid-lognormal distribution, of which the Gaussian and the lognormal are special cases. The data consist of African American births (1985-1988) and European American births (1988) in the state of New York. The results suggest that, of the conventional distributions, a mixture of two Gaussians generally provides the best fit to birthweight and gestational age. However, in the case of birthweight, a two-component hybrid-lognormal fits better than any of the simpler models. This may be due to a feature of the hybrid-lognormal distribution that can be interpreted as maternal constraints on fetal development.
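As a rough illustration of the kind of comparison the abstract describes, the sketch below fits a two-component Gaussian mixture and a two-component lognormal mixture to the same sample and reports both log-likelihoods. It is not the paper's procedure: the data are synthetic, the component means, spreads, and mixing proportion are invented for the example, the 500 g cutoff is arbitrary, and the helper names (`fit_two_gaussians`, `mixture_loglik`) are hypothetical. The lognormal mixture is fitted by running the same EM routine on log-transformed values and subtracting the sum of the logs (the change-of-variables term) so that the two log-likelihoods are on the same scale. The hybrid-lognormal distribution itself is not implemented here.

```python
import math
import random


def normal_logpdf(x, mu, sigma):
    """Log density of N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def log_sum_exp(a, b):
    """Numerically stable log(exp(a) + exp(b))."""
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def mixture_loglik(xs, w, mu1, s1, mu2, s2):
    """Log-likelihood of a two-component Gaussian mixture."""
    return sum(
        log_sum_exp(math.log(w) + normal_logpdf(x, mu1, s1),
                    math.log(1.0 - w) + normal_logpdf(x, mu2, s2))
        for x in xs
    )


def fit_two_gaussians(xs, n_iter=200):
    """Fit a two-component 1-D Gaussian mixture by EM.

    Returns (w, mu1, s1, mu2, s2, loglik); w is the weight of component 1.
    """
    n = len(xs)
    ordered = sorted(xs)
    mean = sum(xs) / n
    sd = math.sqrt(sum((x - mean) ** 2 for x in xs) / n)
    # Start from the quartiles with equal weights and a common spread.
    w, mu1, mu2, s1, s2 = 0.5, ordered[n // 4], ordered[3 * n // 4], sd, sd
    for _ in range(n_iter):
        # E-step: posterior probability that each point belongs to component 1.
        resp = []
        for x in xs:
            a = math.log(w) + normal_logpdf(x, mu1, s1)
            b = math.log(1.0 - w) + normal_logpdf(x, mu2, s2)
            resp.append(math.exp(a - log_sum_exp(a, b)))
        # M-step: weighted means, standard deviations, and mixing weight.
        n1 = max(sum(resp), 1e-12)
        n2 = max(n - sum(resp), 1e-12)
        mu1 = sum(r * x for r, x in zip(resp, xs)) / n1
        mu2 = sum((1.0 - r) * x for r, x in zip(resp, xs)) / n2
        s1 = max(math.sqrt(sum(r * (x - mu1) ** 2 for r, x in zip(resp, xs)) / n1), 1e-6)
        s2 = max(math.sqrt(sum((1.0 - r) * (x - mu2) ** 2 for r, x in zip(resp, xs)) / n2), 1e-6)
        w = min(max(n1 / n, 1e-9), 1.0 - 1e-9)
    return w, mu1, s1, mu2, s2, mixture_loglik(xs, w, mu1, s1, mu2, s2)


# Synthetic "birthweights" in grams. All parameters are invented for
# illustration and are NOT estimates from the New York data.
random.seed(0)
birthweights = []
for _ in range(2000):
    if random.random() < 0.9:
        x = random.gauss(3400.0, 450.0)
    else:
        x = random.gauss(2500.0, 900.0)
    if x > 500.0:  # arbitrary cutoff; also keeps values positive for the log
        birthweights.append(x)

# Mixture of two Gaussians on the raw scale.
gauss_fit = fit_two_gaussians(birthweights)
ll_gauss = gauss_fit[5]

# Mixture of two lognormals: fit Gaussians to log(x), then subtract
# sum(log x) so the log-likelihood refers to the density of x itself.
logs = [math.log(x) for x in birthweights]
lognorm_fit = fit_two_gaussians(logs)
ll_lognorm = lognorm_fit[5] - sum(logs)

print("two-Gaussian mixture log-likelihood: ", round(ll_gauss, 2))
print("two-lognormal mixture log-likelihood:", round(ll_lognorm, 2))
```

Both models have five free parameters, so in this sketch the log-likelihoods can be compared directly; models with different numbers of parameters would need a penalized criterion or a likelihood-ratio test instead.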