Momen Mehdi, Bhatta Madhav, Hussain Waseem, Yu Haipeng, Morota Gota
Department of Animal and Poultry Sciences, Virginia Polytechnic Institute and State University, Blacksburg, VA, USA.
Department of Agronomy, University of Wisconsin-Madison, Madison, WI, USA.
Plant Direct. 2021 Jan 25;5(1):e00304. doi: 10.1002/pld3.304. eCollection 2021 Jan.
Inferring trait networks from a large volume of genetically correlated, diverse phenotypes such as yield, architecture, and disease resistance can provide information on how complex phenotypes are interrelated. However, studies on statistical methods tailored to multidimensional phenotypes are limited, whereas numerous methods are available for evaluating massive numbers of genetic markers. Factor analysis operates at the level of latent variables that are presumed to generate the observed responses. The objectives of this study were to illustrate how data-driven exploratory factor analysis can map observed phenotypes onto a smaller number of latent variables, and to infer a genomic latent factor network using 45 agro-morphological, disease, and grain mineral phenotypes measured in synthetic hexaploid wheat lines. In total, eight latent factors, including grain yield, architecture, flag leaf-related traits, grain minerals, yellow rust, two types of stem rust, and leaf rust, were identified as common sources of the observed phenotypes. The genetic component of the factor scores for each latent variable was fed into a Bayesian network to obtain a trait structure reflecting the genetic interdependency among traits. Three directed paths were consistently identified by two Bayesian network algorithms: flag leaf-related traits influenced leaf rust, and yellow rust and stem rust influenced grain yield. Additional paths that were identified included flag leaf-related traits to minerals and minerals to architecture. This study shows that data-driven exploratory factor analysis can reveal lower-dimensional common latent phenotypes that are likely to give rise to numerous observed field phenotypes, without relying on prior biological knowledge. The inferred genomic latent factor structure from the Bayesian network provides insights for plant breeding aimed at improving multiple traits simultaneously, because an intervention on one trait will affect the values of focal phenotypes in an interrelated complex trait system.
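The idea of mapping many observed phenotypes onto a few latent factors can be illustrated with a minimal sketch. The abstract does not state which estimation or factor-retention procedure was used, so the code below is not the authors' method: it assumes principal-component extraction from a correlation matrix and the Kaiser rule (retain eigenvalues above 1), both swapped in here for simplicity. The six-trait correlation matrix, its two blocks, and all function names are hypothetical and carry no data from the study.

```python
import math


def toy_correlation(n_per_block=3, r_a=0.7, r_b=0.5):
    """Hypothetical correlation matrix: two blocks of traits, uncorrelated across blocks."""
    n = 2 * n_per_block
    corr = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                corr[i][j] = 1.0
            elif i < n_per_block and j < n_per_block:
                corr[i][j] = r_a  # within block A
            elif i >= n_per_block and j >= n_per_block:
                corr[i][j] = r_b  # within block B
    return corr


def mat_vec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def leading_eigenpair(matrix, iterations=500):
    """Power iteration for the largest eigenvalue of a symmetric matrix."""
    n = len(matrix)
    # Non-constant start vector so it is not orthogonal to any eigenvector here.
    vector = [1.0 + 0.1 * i for i in range(n)]
    for _ in range(iterations):
        product = mat_vec(matrix, vector)
        norm = math.sqrt(sum(x * x for x in product))
        if norm == 0.0:
            return 0.0, vector
        vector = [x / norm for x in product]
    # Rayleigh quotient of the (unit-length) converged vector.
    eigenvalue = sum(v * p for v, p in zip(vector, mat_vec(matrix, vector)))
    return eigenvalue, vector


def extract_factors(corr, threshold=1.0):
    """Retain factors whose eigenvalue exceeds the threshold (Kaiser rule, an assumption)."""
    work = [row[:] for row in corr]
    eigenvalues, loadings = [], []
    for _ in range(len(corr)):
        value, vector = leading_eigenpair(work)
        if value <= threshold:
            break
        eigenvalues.append(value)
        # Loading of each trait on this factor: sqrt(eigenvalue) * eigenvector entry.
        loadings.append([math.sqrt(value) * x for x in vector])
        # Deflation: remove the extracted component before searching for the next one.
        work = [
            [work[i][j] - value * vector[i] * vector[j] for j in range(len(work))]
            for i in range(len(work))
        ]
    return eigenvalues, loadings


eigenvalues, loadings = extract_factors(toy_correlation())
print("retained factors:", len(eigenvalues))
for k, row in enumerate(loadings, start=1):
    print(f"factor {k} loadings:", [round(x, 3) for x in row])
```

In this toy matrix each block of three correlated traits loads on its own factor and has near-zero loadings on the other, which is the pattern exploratory factor analysis looks for when grouping observed phenotypes under a common latent variable. In a workflow like the one described above, factor scores derived from such loadings, rather than the raw phenotypes, would then be passed on to the network-learning step.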