Gaussian process test for high-throughput sequencing time series: application to experimental evolution.

Authors

Topa Hande, Jónás Ágnes, Kofler Robert, Kosiol Carolin, Honkela Antti

Affiliations

Helsinki Institute for Information Technology (HIIT), Department of Information and Computer Science, Aalto University, Espoo, Finland, Institut für Populationsgenetik, Vetmeduni Vienna, 1210 Wien, Austria, Vienna Graduate School of Population Genetics, Wien, Austria and Helsinki Institute for Information Technology (HIIT), Department of Computer Science, University of Helsinki, Helsinki, Finland.

Publication information

Bioinformatics. 2015 Jun 1;31(11):1762-70. doi: 10.1093/bioinformatics/btv014. Epub 2015 Jan 21.

Abstract

MOTIVATION

Recent advances in high-throughput sequencing (HTS) have made it possible to monitor genomes in great detail. New experiments not only use HTS to measure genomic features at one time point but also monitor them changing over time with the aim of identifying significant changes in their abundance. In population genetics, for example, allele frequencies are monitored over time to detect significant frequency changes that indicate selection pressures. Previous attempts at analyzing data from HTS experiments have been limited as they could not simultaneously include data at intermediate time points, replicate experiments and sources of uncertainty specific to HTS such as sequencing depth.

RESULTS

We present the beta-binomial Gaussian process model for ranking features with significant non-random variation in abundance over time. The features are assumed to represent proportions, such as the proportion of an alternative allele in a population. We use the beta-binomial model to capture the uncertainty arising from finite sequencing depth and combine it with a Gaussian process model over the time series. In simulations that mimic the features of experimental evolution data, the proposed method clearly outperforms classical testing in average precision of finding selected alleles. We also present simulations exploring different experimental design choices and results on real data from a Drosophila experimental evolution experiment in temperature adaptation.
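The combination described above can be illustrated with a minimal Python sketch. It is not the authors' R implementation: the uniform Beta(1, 1) prior, the squared-exponential kernel, the fixed hyperparameters (`signal_var`, `length_scale`, `extra_noise`), the centering of the data, and the use of a log Bayes factor as the ranking score are all assumptions made for illustration, and every function name is hypothetical.

```python
import numpy as np


def beta_binomial_posterior(alt_counts, depths, a=1.0, b=1.0):
    """Posterior mean and variance of a proportion given read counts.

    Assumes a Beta(a, b) prior (uniform by default, an assumption of this sketch).
    """
    alt = np.asarray(alt_counts, dtype=float)
    n = np.asarray(depths, dtype=float)
    alpha = a + alt
    beta = b + n - alt
    total = alpha + beta
    mean = alpha / total
    var = alpha * beta / (total ** 2 * (total + 1.0))
    return mean, var


def rbf_kernel(t, signal_var, length_scale):
    """Squared-exponential covariance between all pairs of time points."""
    d = t[:, None] - t[None, :]
    return signal_var * np.exp(-0.5 * (d / length_scale) ** 2)


def log_marginal_likelihood(y, cov):
    """Log density of y under a zero-mean Gaussian with covariance cov."""
    chol = np.linalg.cholesky(cov)
    z = np.linalg.solve(chol, y)
    return (-0.5 * z @ z
            - np.log(np.diag(chol)).sum()
            - 0.5 * len(y) * np.log(2.0 * np.pi))


def log_bayes_factor(times, alt_counts, depths,
                     signal_var=0.05, length_scale=20.0, extra_noise=1e-4):
    """Score one feature: time-dependent GP versus a noise-only model.

    Hyperparameters are fixed here for brevity; a real analysis would
    estimate them from the data.
    """
    t = np.asarray(times, dtype=float)
    mean, var = beta_binomial_posterior(alt_counts, depths)
    y = mean - mean.mean()  # center so a zero-mean GP is reasonable
    # Per-observation noise: low sequencing depth gives a larger variance.
    noise = np.diag(var + extra_noise)
    ll_time = log_marginal_likelihood(
        y, rbf_kernel(t, signal_var, length_scale) + noise)
    ll_flat = log_marginal_likelihood(y, noise)
    return ll_time - ll_flat


# Hypothetical read counts for two alleles at six time points, depth 100 each.
times = [0, 10, 20, 30, 40, 50]
depths = [100] * 6
rising = log_bayes_factor(times, [10, 20, 35, 50, 65, 80], depths)
stable = log_bayes_factor(times, [50, 48, 52, 49, 51, 50], depths)
print("rising allele score:", rising)
print("stable allele score:", stable)
```

Features would then be ranked by this score, with larger values suggesting a frequency change that sequencing noise alone is unlikely to explain. Because the noise variance is set per observation, time points with low sequencing depth carry less weight than deeply sequenced ones.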

AVAILABILITY AND IMPLEMENTATION

R software implementing the test is available at https://github.com/handetopa/BBGP.

