Yuan Yanchao, Wang Xianlin, Wang Liyuan, Xing Huixian, Wang Qingkang, Saeed Muhammad, Tao Jincai, Feng Wei, Zhang Guihua, Song Xian-Liang, Sun Xue-Zhen
State Key Laboratory of Crop Biology/Agronomy College, Shandong Agricultural University, Taian, China.
Department of Botany, Government College University, Faisalabad, Pakistan.
Front Plant Sci. 2018 Oct 22;9:1359. doi: 10.3389/fpls.2018.01359. eCollection 2018.
Cotton (Gossypium spp.) is a leading natural fiber crop and an important source of vegetable protein and oil for humans and livestock. To investigate the genetic architecture of seed nutrients in upland cotton, a genome-wide association study (GWAS) was conducted in a panel of 196 germplasm resources under three environments using a CottonSNP80K chip of 77,774 loci. Relatively high genetic diversity (average gene diversity of 0.331) and phenotypic variation (coefficient of variation, CV, exceeding 3.9%) were detected in this panel. Correlation analysis revealed that the well-documented negative association between seed protein (PR) and oil may be attributable in part to the negative correlation between oleic acid (OA) and PR. Linkage disequilibrium (LD) was unevenly distributed among chromosomes and subgenomes. LD decay distances ranged from 0.10-0.20 Mb (Chr19) to 5.65-5.75 Mb (Chr25) among chromosomes, and the range of decay distances was smaller in the Dt subgenome than in the At subgenome. The panel was divided into two subpopulations based on 41,815 polymorphic single-nucleotide polymorphism (SNP) markers. A mixed linear model incorporating both the Q-matrix and the K-matrix [MLM(Q+K)] was used to estimate associations between the SNP markers and the seed nutrients while controlling for false positives caused by population structure and kinship. A total of 47 SNP markers and 28 candidate quantitative trait locus (QTL) regions were found to be significantly associated with seven cottonseed nutrients, including protein, total fatty acid, and five main fatty acid components. In addition, the candidate genes in these regions were analyzed, which included three genes, 62, and that were most likely involved in the control of cottonseed protein concentration.
These results improve our understanding of the genetic control of cottonseed nutrients and provide potential molecular tools for developing cultivars with high protein content and improved fatty acid composition in cotton breeding programs through marker-assisted selection.
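The abstract summarizes the panel with two statistics, average gene diversity and the coefficient of variation. As a minimal sketch of how such values are commonly computed, the following assumes that gene diversity means expected heterozygosity (one minus the sum of squared allele frequencies at a locus) and that CV is the sample standard deviation divided by the mean, expressed in percent. The function names and the toy genotypes are hypothetical and are not taken from the study, whose exact software and definitions are not stated here.

```python
from statistics import mean, stdev


def gene_diversity(genotypes):
    """Expected heterozygosity, 1 - sum(p_i^2), for a single SNP.

    `genotypes` is a list of diploid calls such as "AA" or "AG";
    None marks a missing call and is skipped. (Assumed input format.)
    """
    alleles = [allele for call in genotypes if call is not None for allele in call]
    if not alleles:
        raise ValueError("no called genotypes at this locus")
    total = len(alleles)
    counts = {}
    for allele in alleles:
        counts[allele] = counts.get(allele, 0) + 1
    return 1.0 - sum((count / total) ** 2 for count in counts.values())


def average_gene_diversity(loci):
    """Mean of per-locus gene diversity across a list of loci."""
    return mean(gene_diversity(locus) for locus in loci)


def coefficient_of_variation(values):
    """CV in percent: sample standard deviation / mean * 100."""
    return stdev(values) / mean(values) * 100.0


if __name__ == "__main__":
    # Toy data for illustration only, not from the study.
    toy_loci = [["AA", "AG", "GG", None], ["CC", "CC", "CT", "TT"]]
    print(average_gene_diversity(toy_loci))
    print(coefficient_of_variation([9.0, 10.0, 11.0]))
```

Under this definition a biallelic SNP has a maximum gene diversity of 0.5, reached when both alleles are equally frequent, which gives some context for the reported average of 0.331.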