An empirical Bayesian ranking method, with applications to high throughput biology.

Author information

Biostatistics Division, HRB Clinical Research Facility, National University of Ireland Galway, Galway, Ireland.

Department of Statistics and Data Science, Yale University, New Haven, CT, USA.

Publication information

Bioinformatics. 2020 Jan 1;36(1):177-185. doi: 10.1093/bioinformatics/btz471.

Abstract

MOTIVATION

In bioinformatics, genome-wide experiments look for important biological differences between two groups at a large number of locations in the genome. Often, the final analysis focuses on a P-value-based ranking of locations, which might then be investigated further in follow-up experiments. However, this strategy may rank small effects with low P-values more favorably than larger, more scientifically important effects. Bayesian ranking techniques may offer a solution to this problem, provided a good prior for the collective distribution of effect sizes is available.
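To make the ranking problem concrete, the following minimal sketch compares two hypothetical locations under a two-sided z-test. The effect estimates and standard errors are invented for illustration and do not come from the paper; the point is only that a precisely measured small effect can receive a lower P-value than a noisier effect ten times larger.

```python
import math

def two_sided_p(effect, std_error):
    """Two-sided P-value for a z-test of effect = 0."""
    z = abs(effect) / std_error
    # P(|Z| > z) for a standard normal Z equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))

# Hypothetical locations (illustrative numbers, not from the paper).
effect_a, se_a = 0.2, 0.04   # small effect, measured precisely
effect_b, se_b = 2.0, 0.6    # large effect, measured noisily

p_a = two_sided_p(effect_a, se_a)
p_b = two_sided_p(effect_b, se_b)

# A P-value ranking places location A ahead of location B,
# even though B's estimated effect is ten times larger.
print(f"A: effect={effect_a}, P={p_a:.2e}")
print(f"B: effect={effect_b}, P={p_b:.2e}")
```

Here location A has z = 5 and location B has z of roughly 3.3, so A ranks first by P-value despite its much smaller estimated effect.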

RESULTS

We develop an Empirical Bayes ranking algorithm, using the marginal distribution of the data over all locations to estimate an appropriate prior. In simulations and analyses of real datasets, we demonstrate favorable performance compared to ordering by P-value and to a number of other competing ranking methods. The algorithm is computationally efficient and can be used to rank the entirety of genomic locations or, in a two-stage analysis, to rank a subset of locations pre-selected via traditional family-wise error rate (FWER) or false discovery rate (FDR) methods.

AVAILABILITY AND IMPLEMENTATION

An R-package, EBrank, implementing the ranking algorithm is available on CRAN.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
