BCAM-Basque Center for Applied Mathematics, Mazarredo 14, 48009, Bilbao, Spain.
Departamento de Matemáticas, Universidad del País Vasco UPV/EHU, 48940, Leioa, Spain.
Sci Rep. 2022 Feb 24;12(1):3177. doi: 10.1038/s41598-022-06935-9.
High-throughput phenotyping (HTP) platforms and devices are increasingly used to characterize growth and developmental processes for large sets of plant genotypes. Such HTP data require challenging statistical analyses in which longitudinal genetic signals must be estimated against a background of spatio-temporal noise processes. We propose a two-stage approach for the analysis of such longitudinal HTP data. In the first stage, we correct for design features and spatial trends per time point. In the second stage, we focus on the longitudinal modelling of the spatially corrected data, thereby taking advantage of longitudinal features shared between genotypes and between plants within genotypes. We propose a flexible hierarchical three-level P-spline growth curve model, with plants/plots nested in genotypes, and genotypes nested in populations. For the selection of genotypes in a plant breeding context, we show how to extract new phenotypes, such as growth rates, from the estimated genotypic growth curves and their first-order derivatives. We illustrate our approach on HTP data from the PhenoArch greenhouse platform at INRAE Montpellier and the outdoor Field Phenotyping platform at ETH Zürich.
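To make the idea of reading a growth rate off a fitted growth curve concrete, the following is a minimal single-curve sketch, not the hierarchical three-level model described above. It assumes a generic P-spline in the usual sense (a cubic B-spline basis on equally spaced knots with a second-order difference penalty on the coefficients) and a fixed smoothing parameter. The function names (`bspline_basis`, `fit_pspline`, `growth_curve`, `growth_rate`), the parameter values, and the synthetic logistic data are all illustrative assumptions and do not come from the paper.

```python
import numpy as np


def bspline_basis(x, knots, degree):
    """Evaluate all B-spline basis functions at x (Cox-de Boor recursion).

    Assumes strictly increasing knots, so no denominator is zero.
    """
    x = np.asarray(x, dtype=float)
    # Degree 0: indicator of each half-open knot interval.
    B = ((x[:, None] >= knots[None, :-1]) & (x[:, None] < knots[None, 1:])).astype(float)
    for d in range(1, degree + 1):
        left = (x[:, None] - knots[None, :-d - 1]) / (knots[d:-1] - knots[:-d - 1])
        right = (knots[None, d + 1:] - x[:, None]) / (knots[d + 1:] - knots[1:-d])
        B = left * B[:, :-1] + right * B[:, 1:]
    return B


def fit_pspline(t, y, n_seg=10, degree=3, lam=1.0):
    """Penalized least-squares fit; returns knots, coefficients, knot spacing."""
    t = np.asarray(t, dtype=float)
    y = np.asarray(y, dtype=float)
    dx = (t.max() - t.min()) / n_seg
    # Equally spaced knots, extended by `degree` intervals on both sides.
    knots = t.min() + dx * np.arange(-degree, n_seg + degree + 1)
    B = bspline_basis(t, knots, degree)
    # Second-order difference penalty on adjacent coefficients.
    D = np.diff(np.eye(B.shape[1]), n=2, axis=0)
    coef = np.linalg.solve(B.T @ B + lam * D.T @ D, B.T @ y)
    return knots, coef, dx


def growth_curve(x, knots, coef, degree=3):
    """Fitted curve evaluated at x."""
    return bspline_basis(x, knots, degree) @ coef


def growth_rate(x, knots, coef, dx, degree=3):
    """First derivative of the fitted curve at x.

    For equally spaced knots, the derivative is a spline of one degree lower
    whose coefficients are the first differences of `coef` divided by dx.
    """
    B_low = bspline_basis(x, knots, degree - 1)
    return B_low[:, 1:-1] @ (np.diff(coef) / dx)


# Synthetic example (made-up logistic growth with noise, for illustration only).
rng = np.random.default_rng(0)
days = np.linspace(0.0, 40.0, 60)
biomass = 100.0 / (1.0 + np.exp(-0.25 * (days - 20.0))) + rng.normal(0.0, 2.0, days.size)

knots, coef, dx = fit_pspline(days, biomass, n_seg=15, lam=10.0)
curve = growth_curve(days, knots, coef)
rate = growth_rate(days, knots, coef, dx)
print("day of maximum estimated growth rate:", days[np.argmax(rate)])
```

In this sketch the derived phenotype is simply a summary of `rate`, for example its maximum or the time at which the maximum occurs. In the setting described above, the same kind of summary would be taken from genotype-level curves rather than from a single noisy series.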