Generalized empirical Bayesian methods for discovery of differential data in high-throughput biology.

Author information

Hardcastle Thomas J

Affiliation

Department of Plant Sciences, University of Cambridge, Cambridge CB2 3EA, UK.

Publication information

Bioinformatics. 2016 Jan 15;32(2):195-202. doi: 10.1093/bioinformatics/btv569. Epub 2015 Oct 1.

Abstract

MOTIVATION

High-throughput data are now commonplace in biological research. Rapidly changing technologies and applications mean that novel methods for detecting differential behaviour in a 'large P, small n' setting are required at an increasing rate. Such methods are, in general, being developed on an ad hoc basis, which leads to repeated development cycles and a lack of standardization between analyses.

RESULTS

We present here a generalized method for identifying differential behaviour within high-throughput biological data through empirical Bayesian methods. This approach builds on our baySeq algorithm, which identifies differential expression in RNA-seq data using a negative binomial distribution and in paired data using a beta-binomial distribution. Here we show how the same empirical Bayesian approach can be applied to any parametric distribution, removing the need for lengthy development of novel methods for differently distributed data. Comparisons with existing methods developed to address specific problems in high-throughput biological data show that these generic methods can achieve equivalent or better performance. A number of enhancements to the basic algorithm are also presented to increase flexibility and reduce computational costs.
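To make the idea of a distribution-agnostic empirical Bayesian comparison concrete, the following is a minimal sketch in Python. It is not the baySeq implementation, and every name in it (`posterior_probabilities`, `empirical_prior`, `log_marginal`, the toy data and the model prior weights) is a hypothetical chosen for illustration. It assumes one common construction: parameter estimates taken from the rows of the data serve as an empirical prior, the likelihood of each row is averaged over that prior to approximate a marginal likelihood, and Bayes' rule converts the marginals into posterior probabilities for each model. A Poisson density is used only for brevity; the density and its fitting function are passed in as arguments, so another parametric distribution could be substituted.

```python
import math


def poisson_logpdf(x, lam):
    """Log-density of one observation (assumption: Poisson, for brevity only)."""
    return x * math.log(lam) - lam - math.lgamma(x + 1)


def poisson_fit(values):
    """Parameter estimate for one group of observations, floored to stay positive."""
    return max(sum(values) / len(values), 1e-3)


def split(row, model):
    """Partition a row of observations by the group labels in `model`."""
    groups = {}
    for value, label in zip(row, model):
        groups.setdefault(label, []).append(value)
    return groups


def empirical_prior(data, model, fit):
    """One set of per-group estimates per row; together they act as the prior."""
    return [{g: fit(v) for g, v in split(row, model).items()} for row in data]


def log_marginal(row, model, prior, logpdf):
    """Log of the row likelihood averaged over the empirical prior samples."""
    groups = split(row, model)
    terms = [
        sum(logpdf(x, theta[g]) for g, values in groups.items() for x in values)
        for theta in prior
    ]
    top = max(terms)  # log-sum-exp for numerical stability
    return top + math.log(sum(math.exp(t - top) for t in terms) / len(terms))


def posterior_probabilities(data, models, model_priors, logpdf, fit):
    """Posterior probability of each model for each row of `data`."""
    priors = {name: empirical_prior(data, m, fit) for name, m in models.items()}
    results = []
    for row in data:
        logs = {
            name: math.log(model_priors[name])
            + log_marginal(row, m, priors[name], logpdf)
            for name, m in models.items()
        }
        top = max(logs.values())
        weights = {name: math.exp(v - top) for name, v in logs.items()}
        total = sum(weights.values())
        results.append({name: w / total for name, w in weights.items()})
    return results


# Toy data: five features measured in four samples (two per condition).
data = [
    [10, 12, 11, 9],
    [8, 11, 10, 12],
    [11, 9, 10, 10],
    [5, 4, 40, 38],  # the only row with an obvious difference between conditions
    [9, 10, 12, 11],
]
# A model is a grouping of samples: equal labels share one parameter.
models = {"NDE": [1, 1, 1, 1], "DE": [1, 1, 2, 2]}
model_priors = {"NDE": 0.9, "DE": 0.1}  # assumed prior weights, not estimated

posteriors = posterior_probabilities(
    data, models, model_priors, poisson_logpdf, poisson_fit
)
for row, post in zip(data, posteriors):
    print(row, {name: round(p, 3) for name, p in post.items()})
```

Only `poisson_logpdf` and `poisson_fit` are specific to the distribution. Replacing them with, for example, a normal log-density and a function returning a mean and standard deviation would leave the rest of the sketch unchanged, which is the sense in which a single empirical Bayesian framework can serve differently distributed data.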

AVAILABILITY AND IMPLEMENTATION

The methods are implemented in the R baySeq (v2) package, available on Bioconductor at http://www.bioconductor.org/packages/release/bioc/html/baySeq.html.

CONTACT

tjh48@cam.ac.uk

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
