Multi-task Gaussian process for imputing missing data in multi-trait and multi-environment trials.

Authors

Hori Tomoaki, Montcho David, Agbangla Clement, Ebana Kaworu, Futakuchi Koichi, Iwata Hiroyoshi

Affiliations

Department of Agricultural and Environmental Biology, Graduate School of Agricultural and Life Sciences, The University of Tokyo, 1-1-1 Yayoi, Bunkyo-ku, Tokyo, 113-8657, Japan.

Africa Rice Center, 01 B.P. 2031, Cotonou, Benin.

Publication information

Theor Appl Genet. 2016 Nov;129(11):2101-2115. doi: 10.1007/s00122-016-2760-9. Epub 2016 Aug 19.

DOI:10.1007/s00122-016-2760-9
PMID:27540725
Abstract

A method based on a multi-task Gaussian process using self-measuring similarity gave increased accuracy for imputing missing phenotypic data in multi-trait and multi-environment trials. Multi-environmental trial (MET) data often suffer from missing values. Accurate imputation of missing data makes subsequent analysis more effective and the results easier to understand. Moreover, accurate imputation may help to reduce the cost of phenotyping for thinned-out lines tested in METs. METs are generally performed for multiple traits that are correlated with each other. Correlation among traits can be useful information for imputation, but single-trait-based methods cannot utilize information shared by correlated traits. In this paper, we propose imputation methods based on a multi-task Gaussian process (MTGP) using self-measuring similarity kernels reflecting relationships among traits, genotypes, and environments. This framework allows us to use genetic correlation among multi-trait multi-environment data and also to combine MET data and marker genotype data. We compared the accuracy of three MTGP methods and iterative regularized PCA using rice MET data. Two scenarios for the generation of missing data at various missing rates were considered. The MTGP methods achieved higher imputation accuracy than regularized PCA, especially at high missing rates. Under the 'uniform' scenario, in which missing data arise randomly, inclusion of marker genotype data in the imputation increased the imputation accuracy at high missing rates. Under the 'fiber' scenario, in which missing data arise in all traits for some combinations of genotypes and environments, the inclusion of marker genotype data decreased the imputation accuracy for most traits while increasing the accuracy remarkably for a few traits. The proposed methods will be useful for solving the missing data problem in MET data.
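The idea of imputing with a Gaussian process whose kernel combines trait, genotype, and environment similarities can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not specify the self-measuring similarity kernels, so the sketch assumes a Kronecker product of three fixed similarity matrices, a common multi-task GP construction. The function name `gp_impute`, the `noise` parameter, and all toy values are hypothetical.

```python
import numpy as np

def gp_impute(y, k_trait, k_geno, k_env, noise=0.1):
    """Fill NaNs in y (traits x genotypes x environments) with a GP posterior mean.

    Assumption: the joint kernel is a Kronecker product of the three
    similarity matrices; the paper's self-measuring kernels are not reproduced.
    """
    shape = y.shape
    flat = y.reshape(-1)                 # C-order: trait slowest, environment fastest
    obs = ~np.isnan(flat)
    # Kronecker order must match the C-order flattening above.
    K = np.kron(k_trait, np.kron(k_geno, k_env))
    mu = flat[obs].mean()                # one grand mean; assumes pre-standardized traits
    K_oo = K[np.ix_(obs, obs)] + noise * np.eye(obs.sum())
    K_mo = K[np.ix_(~obs, obs)]
    alpha = np.linalg.solve(K_oo, flat[obs] - mu)
    out = flat.copy()
    out[~obs] = mu + K_mo @ alpha        # posterior mean at the missing cells
    return out.reshape(shape)

# Toy similarity matrices (hypothetical values, all positive definite).
k_trait = np.array([[1.0, 0.8], [0.8, 1.0]])   # 2 correlated traits
k_geno = 0.5 * np.eye(4) + 0.5                 # 4 genotypes, pairwise similarity 0.5
k_env = 0.7 * np.eye(3) + 0.3                  # 3 environments, pairwise similarity 0.3

# Simulate a complete data set from the same kernel.
rng = np.random.default_rng(0)
K_full = np.kron(k_trait, np.kron(k_geno, k_env))
chol = np.linalg.cholesky(K_full + 1e-8 * np.eye(K_full.shape[0]))
y_true = (chol @ rng.standard_normal(K_full.shape[0])).reshape(2, 4, 3)

# Knock out cells: two isolated cells ('uniform'-like) and every trait
# for one genotype-environment combination ('fiber'-like).
y_missing = y_true.copy()
y_missing[0, 1, 2] = np.nan
y_missing[1, 3, 0] = np.nan
y_missing[:, 2, 1] = np.nan

y_filled = gp_impute(y_missing, k_trait, k_geno, k_env)
print(np.abs(y_filled - y_true)[np.isnan(y_missing)])  # absolute imputation errors
```

In the 'fiber'-like case no trait is observed for the affected genotype-environment cell, so the prediction there relies only on similarity among genotypes and environments. For the two isolated cells, the other trait in the same genotype-environment combination is still observed and contributes through the trait kernel.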


Similar articles

1
Multi-task Gaussian process for imputing missing data in multi-trait and multi-environment trials.
Theor Appl Genet. 2016 Nov;129(11):2101-2115. doi: 10.1007/s00122-016-2760-9. Epub 2016 Aug 19.
2
Imputation of missing single nucleotide polymorphism genotypes using a multivariate mixed model framework.
J Anim Sci. 2011 Jul;89(7):2042-9. doi: 10.2527/jas.2010-3297. Epub 2011 Feb 25.
3
Multi-generational imputation of single nucleotide polymorphism marker genotypes and accuracy of genomic selection.
Animal. 2016 Jul;10(7):1077-85. doi: 10.1017/S1751731115002906. Epub 2016 Jan 6.
4
Genetic diversity analysis of highly incomplete SNP genotype data with imputations: an empirical assessment.
G3 (Bethesda). 2014 Mar 13;4(5):891-900. doi: 10.1534/g3.114.010942.
5
Impact of imputation methods on the amount of genetic variation captured by a single-nucleotide polymorphism panel in soybeans.
BMC Bioinformatics. 2016 Feb 2;17:55. doi: 10.1186/s12859-016-0899-7.
6
Design of a low-density SNP chip for the main Australian sheep breeds and its effect on imputation and genomic prediction accuracy.
Anim Genet. 2015 Oct;46(5):544-56. doi: 10.1111/age.12340. Epub 2015 Sep 11.
7
A real data-driven simulation strategy to select an imputation method for mixed-type trait data.
PLoS Comput Biol. 2023 Mar 22;19(3):e1010154. doi: 10.1371/journal.pcbi.1010154. eCollection 2023 Mar.
8
Genomic Prediction for Grain Yield and Yield-Related Traits in Chinese Winter Wheat.
Int J Mol Sci. 2020 Feb 17;21(4):1342. doi: 10.3390/ijms21041342.
9
Imputing missing genotypic data of single-nucleotide polymorphisms using neural networks.
Eur J Hum Genet. 2008 Apr;16(4):487-95. doi: 10.1038/sj.ejhg.5201988. Epub 2008 Jan 16.
10
Filling the gap in functional trait databases: use of ecological hypotheses to replace missing data.
Ecol Evol. 2014 Apr;4(7):944-58. doi: 10.1002/ece3.989. Epub 2014 Feb 25.

Cited by

1
Context-Aware Time Series Imputation for Multi-Analyte Clinical Data.
J Healthc Inform Res. 2020 Oct 18;4(4):411-426. doi: 10.1007/s41666-020-00075-3. eCollection 2020 Dec.
2
A Combined Interpolation and Weighted k-Nearest Neighbours Approach for the Imputation of Longitudinal ICU Laboratory Data.
J Healthc Inform Res. 2020 Mar 2;4(2):174-188. doi: 10.1007/s41666-020-00069-1. eCollection 2020 Jun.
3
Improving Selection Efficiency of Crop Breeding With Genomic Prediction Aided Sparse Phenotyping.
Front Plant Sci. 2021 Oct 6;12:735285. doi: 10.3389/fpls.2021.735285. eCollection 2021.
4
Exploiting mutual information for the imputation of static and dynamic mixed-type clinical data with an adaptive k-nearest neighbours approach.
BMC Med Inform Decis Mak. 2020 Aug 20;20(Suppl 5):174. doi: 10.1186/s12911-020-01166-2.
5
Comparison of -tests for Univariate and Multivariate Mixed-Effect Models in Genome-Wide Association Mapping.
Front Genet. 2019 Feb 4;10:30. doi: 10.3389/fgene.2019.00030. eCollection 2019.
6
3D-MICE: integration of cross-sectional and longitudinal imputation for multi-analyte longitudinal clinical data.
J Am Med Inform Assoc. 2018 Jun 1;25(6):645-653. doi: 10.1093/jamia/ocx133.
7
Advantages and limitations of multiple-trait genomic prediction for Fusarium head blight severity in hybrid wheat (Triticum aestivum L.).
Theor Appl Genet. 2018 Mar;131(3):685-701. doi: 10.1007/s00122-017-3029-7. Epub 2017 Dec 2.

References

1
Genome wide association study for drought, aflatoxin resistance, and important agronomic traits of maize hybrids in the sub-tropics.
PLoS One. 2015 Feb 25;10(2):e0117737. doi: 10.1371/journal.pone.0117737. eCollection 2015.
2
Multi-environment multi-QTL association mapping identifies disease resistance QTL in barley germplasm from Latin America.
Theor Appl Genet. 2015 Mar;128(3):501-16. doi: 10.1007/s00122-014-2448-y. Epub 2014 Dec 30.
3
Genomic prediction in biparental tropical maize populations in water-stressed and well-watered environments using low-density and GBS SNPs.
Heredity (Edinb). 2015 Mar;114(3):291-9. doi: 10.1038/hdy.2014.99. Epub 2014 Nov 19.
4
Kernel-based whole-genome prediction of complex traits: a review.
Front Genet. 2014 Oct 16;5:363. doi: 10.3389/fgene.2014.00363. eCollection 2014.
5
A reaction norm model for genomic selection using high-dimensional genomic and environmental data.
Theor Appl Genet. 2014 Mar;127(3):595-607. doi: 10.1007/s00122-013-2243-1. Epub 2013 Dec 12.
6
Integrating environmental covariates and crop modeling into the genomic selection framework to predict genotype by environment interactions.
Theor Appl Genet. 2014 Feb;127(2):463-80. doi: 10.1007/s00122-013-2231-5. Epub 2013 Nov 22.
7
Imputing missing yield trial data.
Theor Appl Genet. 1990 Jun;79(6):753-61. doi: 10.1007/BF00224240.
8
Field high-throughput phenotyping: the new crop breeding frontier.
Trends Plant Sci. 2014 Jan;19(1):52-61. doi: 10.1016/j.tplants.2013.09.008. Epub 2013 Oct 16.
9
The statistical analysis of multi-environment data: modeling genotype-by-environment interaction and its genetic basis.
Front Physiol. 2013 Mar 12;4:44. doi: 10.3389/fphys.2013.00044. eCollection 2013.
10
High-throughput phenotyping and genomic selection: the frontiers of crop breeding converge.
J Integr Plant Biol. 2012 May;54(5):312-20. doi: 10.1111/j.1744-7909.2012.01116.x.