Genomic prediction with kinship-based multiple kernel learning produces hypothesis on the underlying inheritance mechanisms of phenotypic traits.

Author information

Raimondi Daniele, Verplaetse Nora, Passemiers Antoine, Jans Deborah Sarah, Cleynen Isabelle, Moreau Yves

Affiliations

Institut de Génétique Moléculaire de Montpellier (IGMM), CNRS-UMR5535, Université de Montpellier, Montpellier, 34293, France.

ESAT-STADIUS, KU Leuven, Leuven, 3001, Belgium.

Publication information

Genome Biol. 2025 Apr 4;26(1):84. doi: 10.1186/s13059-025-03544-3.

DOI:10.1186/s13059-025-03544-3

PMID:40181452

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11969835/

Abstract

BACKGROUND

Genomic prediction encompasses the techniques used in agricultural technology to predict the genetic merit of individuals for valuable phenotypic traits. It is related to Genome Interpretation in humans, which models an individual's risk of developing disease traits. Genomic prediction is dominated by linear mixed models, such as Genomic Best Linear Unbiased Prediction (GBLUP), which computes kinship matrices from SNP array data, while applications of Genome Interpretation to clinical genetics rely mainly on Polygenic Risk Scores.
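To make the two ingredients named above concrete, the following is a minimal sketch, not the authors' code. It assumes genotypes coded as 0/1/2 allele counts and uses the widely used VanRaden-style normalisation for the additive kinship matrix; the function names, the simulated data, and the random effect sizes are all hypothetical.

```python
import numpy as np

def additive_kinship(genotypes):
    """Additive kinship from an (individuals x SNPs) matrix of 0/1/2 allele counts.

    Assumption: VanRaden-style scaling, G = Z Z^T / (2 * sum(p * (1 - p))).
    """
    M = np.asarray(genotypes, dtype=float)
    p = M.mean(axis=0) / 2.0              # estimated allele frequency per SNP
    Z = M - 2.0 * p                       # centre each SNP column
    denom = 2.0 * np.sum(p * (1.0 - p))
    return Z @ Z.T / denom

def polygenic_risk_score(genotypes, effect_sizes):
    """Polygenic risk score: effect-size-weighted sum of allele counts."""
    return np.asarray(genotypes, dtype=float) @ np.asarray(effect_sizes, dtype=float)

# Hypothetical toy data: 6 individuals, 50 SNPs.
rng = np.random.default_rng(0)
genotypes = rng.integers(0, 3, size=(6, 50))
G = additive_kinship(genotypes)
prs = polygenic_risk_score(genotypes, rng.normal(size=50))
```

Because G is a scaled Gram matrix of the centred genotypes, it is symmetric and positive semidefinite by construction, which is the property the method described in the Results relies on.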

RESULTS

In this article, we exploit the positive semidefiniteness of the kinship matrices conventionally used in GBLUP to propose a novel Genomic Multiple Kernel Learning method (GMKL), in which the kinship matrices corresponding to additive, dominant, and epistatic inheritance mechanisms are used as kernels in support vector machines, and we apply it to both genomic prediction and Genome Interpretation. We benchmark GMKL on simulated cattle phenotypes, showing that it outperforms the classical GBLUP predictors for genomic prediction. Moreover, we show that GMKL ranks the kinship kernels representing different inheritance mechanisms according to their compatibility with the observed data, allowing it to produce hypotheses about the normally unknown inheritance mechanisms generating the target phenotypes. We then apply GMKL to two inflammatory bowel disease (IBD) cohorts with more than 6500 samples in total, consistently obtaining results suggesting that epistasis might have a relevant, although underestimated, role in IBD.
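The idea of treating several kinship matrices as kernels and ranking them against a phenotype can be sketched as follows. This is an illustration under stated assumptions, not the GMKL implementation: the dominance kernel uses a simple heterozygosity indicator, the epistatic kernel is the element-wise square of the additive kernel, and the kernel weights come from centred kernel-target alignment, a generic multiple kernel learning heuristic swapped in here because the abstract does not specify how GMKL learns its weights. All names and the simulated data are hypothetical.

```python
import numpy as np

def gram(features):
    """Gram matrix of column-centred features, scaled to a mean diagonal of 1."""
    Z = features - features.mean(axis=0)
    K = Z @ Z.T
    return K / np.mean(np.diag(K))

def centred_alignment(K, y):
    """Centred alignment between kernel K and the target kernel y y^T."""
    n = len(y)
    H = np.eye(n) - np.ones((n, n)) / n   # centring matrix
    Kc = H @ K @ H
    yc = y - y.mean()
    target = np.outer(yc, yc)
    return np.sum(Kc * target) / (np.linalg.norm(Kc) * np.linalg.norm(target))

# Hypothetical toy data: 40 individuals, 200 SNPs, mostly additive phenotype.
rng = np.random.default_rng(1)
M = rng.integers(0, 3, size=(40, 200)).astype(float)
y = M[:, :10] @ rng.normal(size=10) + rng.normal(scale=0.5, size=40)

K_add = gram(M)                            # additive kernel
K_dom = gram((M == 1).astype(float))       # dominance kernel (assumed heterozygote coding)
K_epi = K_add * K_add                      # additive-by-additive epistasis (Hadamard product)
K_epi = K_epi / np.mean(np.diag(K_epi))
kernels = {"additive": K_add, "dominance": K_dom, "epistatic": K_epi}

# Score each kernel, clip negative scores, and normalise into weights.
scores = {name: centred_alignment(K, y) for name, K in kernels.items()}
clipped = {name: max(s, 0.0) for name, s in scores.items()}
total = sum(clipped.values())
weights = {name: (c / total if total > 0 else 1.0 / len(kernels))
           for name, c in clipped.items()}

# A non-negative combination of PSD kernels is itself PSD.
K_combined = sum(weights[name] * kernels[name] for name in kernels)
ranking = sorted(scores, key=scores.get, reverse=True)
```

The combined matrix stays positive semidefinite because each component is a Gram matrix or a Hadamard product of one, and the weights are non-negative. It could then be passed to any support vector machine that accepts a precomputed kernel, for example scikit-learn's `SVC(kernel="precomputed")`, and `ranking` gives the order in which the candidate inheritance mechanisms agree with the phenotype under this heuristic.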

CONCLUSIONS

We show that GMKL performs similarly to GBLUP, but it can formulate biological hypotheses about inheritance mechanisms, such as suggesting that epistasis influences IBD.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6cae/11969835/3da89a82fc2d/13059_2025_3544_Fig1_HTML.jpg
