Predicting complex traits using a diffusion kernel on genetic markers with an application to dairy cattle and wheat data.

Affiliation

Department of Animal Sciences, University of Wisconsin-Madison, Madison, WI, USA.

Publication information

Genet Sel Evol. 2013 Jun 13;45(1):17. doi: 10.1186/1297-9686-45-17.

Abstract

BACKGROUND

Arguably, genotypes and phenotypes may be linked in functional forms that are not well addressed by the linear additive models that are standard in quantitative genetics. Therefore, developing statistical learning models for predicting phenotypic values from all available molecular information that are capable of capturing complex genetic network architectures is of great importance. Bayesian kernel ridge regression is a non-parametric prediction model proposed for this purpose. Its essence is to create a spatial distance-based relationship matrix called a kernel. Although the set of all single nucleotide polymorphism genotype configurations on which a model is built is finite, past research has mainly used a Gaussian kernel.
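To make the idea of a distance-based kernel concrete, the following is a minimal sketch of a Gaussian kernel built from a marker matrix and used in ordinary (non-Bayesian) kernel ridge regression. It is not the authors' implementation: the function names, the 0/1/2 genotype coding, the bandwidth `theta`, and the fixed ridge penalty `lam` are all assumptions made for illustration, and a Bayesian treatment would infer the variance components rather than fix `lam`.

```python
import numpy as np

def gaussian_kernel(X, theta):
    """K[i, j] = exp(-theta * ||x_i - x_j||^2) for the rows of X."""
    sq_norms = np.sum(X * X, axis=1)
    sq_dist = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (X @ X.T)
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative round-off
    return np.exp(-theta * sq_dist)

def kernel_ridge_predict(K_train, y_train, K_test_train, lam):
    """Point predictions from kernel ridge regression with a fixed penalty lam."""
    mu = y_train.mean()
    n = len(y_train)
    alpha = np.linalg.solve(K_train + lam * np.eye(n), y_train - mu)
    return mu + K_test_train @ alpha

# Toy data (simulated, hypothetical): 6 individuals, 20 markers coded 0/1/2.
rng = np.random.default_rng(0)
X = rng.integers(0, 3, size=(6, 20)).astype(float)
y = rng.normal(size=6)

K = gaussian_kernel(X, theta=1.0 / X.shape[1])
train, test = np.arange(4), np.arange(4, 6)
y_hat = kernel_ridge_predict(
    K[np.ix_(train, train)], y[train], K[np.ix_(test, train)], lam=1.0
)
```

The kernel entry for two individuals depends only on the distance between their marker vectors, which is what makes the resulting matrix a distance-based relationship matrix.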

RESULTS

We sought to investigate the performance of a diffusion kernel, which was specifically developed to model discrete marker inputs, using Holstein cattle and wheat data. This kernel can be viewed as a discretization of the Gaussian kernel. The predictive ability of the diffusion kernel was similar to that of non-spatial distance-based additive genomic relationship kernels in the Holstein data, but outperformed the latter in the wheat data. However, the difference in performance between the diffusion and Gaussian kernels was negligible.
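The link between the two kernels can be illustrated in the simplest discrete case. The sketch below assumes binary (0/1) marker coding and uses the standard hypercube form of the diffusion kernel, in which the kernel is proportional to tanh(β) raised to the Hamming distance. The abstract does not say how the three SNP genotype classes were handled, so this is an illustrative special case rather than the kernel used in the study; `beta`, `theta`, and the function names are hypothetical.

```python
import numpy as np

def hamming_distance_matrix(X):
    """Number of markers at which each pair of rows differs."""
    return (X[:, None, :] != X[None, :, :]).sum(axis=2)

def hypercube_diffusion_kernel(X, beta):
    """Diffusion kernel on the binary hypercube, up to a constant factor:
    K[i, j] = tanh(beta) ** d(x_i, x_j), with d the Hamming distance."""
    d = hamming_distance_matrix(X)
    return np.tanh(beta) ** d

# Toy data (simulated, hypothetical): 5 individuals, 12 binary markers.
rng = np.random.default_rng(1)
X_bin = rng.integers(0, 2, size=(5, 12))
beta = 0.8
K_diff = hypercube_diffusion_kernel(X_bin, beta)

# For 0/1 coding, squared Euclidean distance equals Hamming distance, so a
# Gaussian kernel with theta = -log(tanh(beta)) gives the same matrix.
theta = -np.log(np.tanh(beta))
K_gauss = np.exp(-theta * hamming_distance_matrix(X_bin))
```

Under these assumptions the two matrices coincide exactly, which may help explain why the two kernels predict so similarly; with three genotype classes and other codings the correspondence is only approximate.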

CONCLUSIONS

It is concluded that the ability of a diffusion kernel to capture the total genetic variance is not better than that of a Gaussian kernel, at least for these data. Although the diffusion kernel as a choice of basis function may have potential for use in whole-genome prediction, our results imply that embedding genetic markers into a non-Euclidean metric space has very small impact on prediction. Our results suggest that use of the black box Gaussian kernel is justified, given its connection to the diffusion kernel and its similar predictive performance.


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07f5/3706293/6121b96c3bd2/1297-9686-45-17-1.jpg
