Seenovate, Lyon, France.
Royal Canin Research Center, Aimargues, France.
Vet Res Commun. 2023 Jun;47(2):693-706. doi: 10.1007/s11259-022-10029-2. Epub 2022 Nov 5.
Breed-specific growth curves (GCs) are needed for neonatal puppies, but breed-specific data may be insufficient. We investigated an unsupervised clustering methodology for modeling GCs by augmenting breed-specific data with data from breeds having similar growth profiles. Puppy breeds were grouped by median growth profiles (bodyweights between birth and Day 20) using hierarchical clustering on principal components. Median bodyweights for breeds in a cluster were centered to that cluster's median and used to model cluster GCs with Generalized Additive Models for Location, Scale and Shape (GAMLSS). These cluster GCs were then centered back to each breed's growth profile to produce cluster-scale breed GCs. The accuracy of breed-scale GCs (modeled with breed-specific data only) and of cluster-scale breed GCs was compared as both were modeled from diminishing sample sizes. A complete dataset of Labrador Retriever bodyweights (birth to Day 20) was split into training (410 puppies) and test (460 puppies) datasets. Cluster-scale breed and breed-scale GCs were modeled from defined sample sizes drawn from the training dataset. The quality criteria were the percentages of observed data in the test dataset falling outside the target growth centiles of simulations. Accuracy of cluster-scale breed GCs remained consistently high down to sample sizes of three. They slightly overestimated breed variability, but centile curves were smooth and consistent with breed-scale GCs modeled from the complete Labrador Retriever dataset. At sample sizes ≤ 20, the quality of breed-scale GCs declined notably. In conclusion, GCs for neonatal puppies generated using a breed-cluster hybrid methodology can be more satisfactory than purely breed-level GCs when sample sizes are small.
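The centering step and the quality criterion described in the abstract can be illustrated with a minimal Python sketch. It rests on several assumptions that are not taken from the paper: centering is treated as an additive per-day shift (the study may use a different transformation), the breed names and bodyweights are invented, and the function names `center_to_cluster`, `back_to_breed` and `percent_outside` are hypothetical. The clustering and the GAMLSS fit themselves are not shown.

```python
from statistics import median


def center_to_cluster(breed_medians):
    """Return the per-day cluster median and each breed's per-day offset from it.

    breed_medians maps a breed name to its median bodyweight per day.
    ASSUMPTION: centering is an additive shift; the paper may differ.
    """
    n_days = len(next(iter(breed_medians.values())))
    cluster_median = [
        median(profile[d] for profile in breed_medians.values())
        for d in range(n_days)
    ]
    offsets = {
        breed: [profile[d] - cluster_median[d] for d in range(n_days)]
        for breed, profile in breed_medians.items()
    }
    return cluster_median, offsets


def back_to_breed(cluster_curve, offset):
    """Shift a cluster-level curve back onto one breed's growth profile."""
    return [c + o for c, o in zip(cluster_curve, offset)]


def percent_outside(observed, lower, upper):
    """Percentage of (day, weight) observations outside [lower, upper] centiles."""
    outside = sum(1 for day, w in observed if w < lower[day] or w > upper[day])
    return 100.0 * outside / len(observed)


# Invented median bodyweights (g) for three hypothetical breeds over three days.
breed_medians = {
    "A": [400, 500, 600],
    "B": [420, 530, 640],
    "C": [380, 470, 580],
}
cluster_median, offsets = center_to_cluster(breed_medians)

# Here the cluster median stands in for a fitted cluster centile curve.
breed_b_curve = back_to_breed(cluster_median, offsets["B"])

# Invented test observations and centile bounds, indexed by day.
observed = [(0, 410), (1, 700), (2, 650), (2, 300)]
pct = percent_outside(observed, lower=[350, 450, 550], upper=[450, 600, 700])
```

In a real analysis the curve passed to `back_to_breed` would be a centile curve fitted on the pooled, centered cluster data, and `percent_outside` would be compared against the nominal share expected outside the target centiles.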