GVCHAP: A Computing Pipeline for Genomic Prediction and Variance Component Estimation Using Haplotypes and SNP Markers.

Authors

Prakapenka Dzianis, Wang Chunkao, Liang Zuoxiang, Bian Cheng, Tan Cheng, Da Yang

Affiliations

Department of Animal Science, University of Minnesota, Saint Paul, MN, United States.

State Key Laboratory for Agrobiotechnology, China Agricultural University, Beijing, China.

Publication information

Front Genet. 2020 Apr 7;11:282. doi: 10.3389/fgene.2020.00282. eCollection 2020.

DOI:10.3389/fgene.2020.00282

PMID:32318093

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC7154123/

Abstract

Haplotype prediction models open many possibilities to improve the accuracy of genomic selection but require more data processing and computing time than single-SNP prediction models. To facilitate haplotype analysis for genomic prediction and estimation using structural and functional genomic information, we developed a computing pipeline to implement haplotype analysis with capabilities for preparation of input data for haplotype analysis, genomic prediction and estimation using GVCHAP, and analysis of GVCHAP results. Data preparation includes utility programs for haplotype imputing; defining haplotype blocks by a fixed number of SNPs, a fixed distance in base pairs per block, or user defined block lengths based on structural or functional genomic information or a mixture of both types of information; and defining haplotype genotypes within each haplotype block. GVCHAP is the main program for genomic prediction and estimation, calculates GREML (genomic restricted maximum likelihood) estimates of variance components and heritabilities, and calculates GBLUP (genomic best linear unbiased prediction) for additive and dominance values of single SNPs as well as additive values of haplotypes with reliability estimates for training and validation populations. A two-step strategy and a method of multi-node processing are implemented to remove the computing bottleneck due to the creation of genomic relationship matrices for large samples. The analysis of GVCHAP results includes calculation of observed prediction accuracies from validation studies and preparation of input files for graphical visualization of heritability estimates of haplotype blocks as well as estimates of SNP effects and heritabilities. The entire pipeline provides an efficient and versatile computing tool for identifying the most accurate haplotype model among many candidate haplotype models utilizing structural and functional genomic information for genomic selection.
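As an illustration of the two simplest block definitions mentioned in the abstract (a fixed number of SNPs per block and a fixed distance in base pairs per block), the following is a minimal Python sketch. It is not GVCHAP code and does not reproduce the pipeline's utility programs or file formats; the function names, the half-open index convention, and the rule that a distance-based block starts at its first SNP are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical helpers for illustration only; not part of the GVCHAP pipeline.
# Each block is a half-open range (start, end) of SNP indices on one chromosome.


def blocks_by_snp_count(n_snps: int, snps_per_block: int) -> List[Tuple[int, int]]:
    """Split n_snps consecutive SNPs into blocks of a fixed number of SNPs.

    The last block may contain fewer SNPs than snps_per_block.
    """
    if snps_per_block <= 0:
        raise ValueError("snps_per_block must be positive")
    return [
        (start, min(start + snps_per_block, n_snps))
        for start in range(0, n_snps, snps_per_block)
    ]


def blocks_by_distance(positions: List[int], block_bp: int) -> List[Tuple[int, int]]:
    """Group SNPs into blocks spanning less than block_bp base pairs.

    positions must be sorted base-pair coordinates on one chromosome.
    Assumption: a new block starts at the first SNP lying block_bp or more
    away from the first SNP of the current block.
    """
    if block_bp <= 0:
        raise ValueError("block_bp must be positive")
    blocks: List[Tuple[int, int]] = []
    start = 0
    for i in range(1, len(positions) + 1):
        # Close the current block at the end of the data or when the
        # distance from the block's first SNP reaches block_bp.
        if i == len(positions) or positions[i] - positions[start] >= block_bp:
            blocks.append((start, i))
            start = i
    return blocks


if __name__ == "__main__":
    print(blocks_by_snp_count(10, 4))
    print(blocks_by_distance([100, 200, 1200, 1300, 5000], 1000))
```

User-defined blocks based on structural or functional genomic information (for example, gene boundaries) could be expressed in the same (start, end) form, which is why the sketch returns index ranges rather than genotype data.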

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a9dc/7154123/0e798310e8a5/fgene-11-00282-g001.jpg
