Estimation of dynamic SNP-heritability with Bayesian Gaussian process models.

Author information

Research Unit of Mathematical Sciences, University of Oulu, Oulu FI-90014, Finland.

Department of Computer Science, University College London, London WC1E 6BT, UK.

Publication information

Bioinformatics. 2020 Jun 1;36(12):3795-3802. doi: 10.1093/bioinformatics/btaa199.

Abstract

MOTIVATION

Improved DNA technology has made it practical to estimate single-nucleotide polymorphism (SNP)-heritability among distantly related individuals with unknown relationships. For growth- and development-related traits, it is meaningful to base SNP-heritability estimation on longitudinal data because the underlying process is time-dependent. However, only a few statistical methods have been developed so far for estimating dynamic SNP-heritability and quantifying its full uncertainty.

RESULTS

We introduce a completely tuning-free Bayesian Gaussian process (GP)-based approach for estimating dynamic variance components and heritability as a function of them. For parameter estimation, we use a modern Markov chain Monte Carlo method which allows full uncertainty quantification. Several datasets are analysed, and our results clearly illustrate that the 95% credible intervals of the proposed joint estimation method (which 'borrows strength' from adjacent time points) are significantly narrower than those of a two-stage baseline method that first estimates the variance components at each time point independently and then performs smoothing. We compare the method with a random regression model using the MTG2 and BLUPF90 software, and quantitative measures indicate superior performance of our method. Results are presented for simulated and real data with up to 1000 time points. Finally, we demonstrate the scalability of the proposed method on simulated data with tens of thousands of individuals.
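
To make "heritability as a function of the variance components" concrete, the following is a minimal sketch and not code from dynBGP. It assumes the usual ratio definition h²(t) = σ²_g(t) / (σ²_g(t) + σ²_e(t)) and assumes that posterior draws of both variance components are already available at each time point. The function names are hypothetical, and the "draws" are random numbers standing in for real MCMC output. The sketch only shows how pointwise 95% credible intervals for h²(t) could be read off such draws.

```python
import random


def heritability(var_g, var_e):
    """Ratio of genetic variance to total variance (assumed definition)."""
    return var_g / (var_g + var_e)


def credible_interval(draws, level=0.95):
    """Equal-tailed interval taken from the empirical quantiles of the draws."""
    s = sorted(draws)
    lo_idx = int((1 - level) / 2 * (len(s) - 1))
    hi_idx = int((1 + level) / 2 * (len(s) - 1))
    return s[lo_idx], s[hi_idx]


def dynamic_heritability(draws_g, draws_e, level=0.95):
    """Summarise h2(t) per time point as (posterior mean, lower, upper).

    draws_g[s][t] and draws_e[s][t] hold the genetic and residual variance
    for posterior draw s at time point t.
    """
    n_times = len(draws_g[0])
    summary = []
    for t in range(n_times):
        # Transform each joint draw, then summarise the transformed draws.
        h2 = [heritability(g[t], e[t]) for g, e in zip(draws_g, draws_e)]
        lower, upper = credible_interval(h2, level)
        summary.append((sum(h2) / len(h2), lower, upper))
    return summary


# Toy stand-in for MCMC output (NOT real results): the genetic variance
# grows over time while the residual variance stays roughly constant.
rng = random.Random(0)
n_draws, n_times = 2000, 5
draws_g = [[rng.gammavariate(20, (1 + 0.5 * t) / 20) for t in range(n_times)]
           for _ in range(n_draws)]
draws_e = [[rng.gammavariate(20, 2.0 / 20) for _ in range(n_times)]
           for _ in range(n_draws)]

summary = dynamic_heritability(draws_g, draws_e)
for t, (mean, lower, upper) in enumerate(summary):
    print(f"t={t}: h2 mean={mean:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

Computing the ratio draw by draw, rather than dividing summary statistics, is what lets the uncertainty in both variance components carry through to the interval for h²(t).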

AVAILABILITY AND IMPLEMENTATION

The C++ implementation dynBGP and simulated data are available on GitHub: https://github.com/aarjas/dynBGP. The programmes can be run in R. Real datasets are available in the QTL Archive: https://phenome.jax.org/centers/QTLA.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
