Using predictive specificity to determine when gene set analysis is biologically meaningful.

Affiliations

Stanley Institute for Cognitive Genomics, Cold Spring Harbor Laboratory, Woodbury, NY 11797, USA.

Department of Psychiatry and Michael Smith Laboratories, University of British Columbia, Vancouver, BC, V6T 1Z4, Canada.

Publication information

Nucleic Acids Res. 2017 Feb 28;45(4):e20. doi: 10.1093/nar/gkw957.

Abstract

Gene set analysis, which translates gene lists into enriched functions, is among the most common bioinformatic methods. Yet few would advocate taking the results at face value. Not only is there no agreement on the algorithms themselves, there is no agreement on how to benchmark them. In this paper, we evaluate the robustness and uniqueness of enrichment results as a means of assessing methods even where correctness is unknown. We show that heavily annotated (‘multifunctional’) genes are likely to appear in genomics study results and drive the generation of biologically non-specific enrichment results as well as highly fragile significances. By providing a means of determining where enrichment analyses report non-specific and non-robust findings, we are able to assess where we can be confident in their use. We find significant progress in recent bias correction methods for enrichment and provide our own software implementation. Our approach can be readily adapted to any pre-existing package.
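
The robustness idea in the abstract can be illustrated with a small sketch. This is not the authors' software: it assumes a plain one-sided hypergeometric over-representation test, and it probes fragility by dropping the most heavily annotated gene from the hit list and recomputing each p-value. All function names, gene identifiers, and gene sets below are hypothetical toy values chosen for illustration.

```python
from math import comb


def enrichment_p(hits, gene_set, background):
    """One-sided hypergeometric p-value: P(overlap >= observed)."""
    hits = hits & background
    gene_set = gene_set & background
    N, K, n = len(background), len(gene_set), len(hits)
    k = len(hits & gene_set)
    favourable = sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)
    )
    return favourable / comb(N, n)


def annotation_counts(gene_sets):
    """Number of gene sets each gene belongs to (a crude multifunctionality score)."""
    counts = {}
    for members in gene_sets.values():
        for gene in members:
            counts[gene] = counts.get(gene, 0) + 1
    return counts


def robustness(hits, gene_sets, background, drop=1):
    """Return {set name: (p with all hits, p after dropping top annotated hits)}."""
    counts = annotation_counts(gene_sets)
    # Rank hits from most to least annotated; ties broken by name for determinism.
    ranked = sorted(hits, key=lambda g: (-counts.get(g, 0), g))
    reduced = set(hits) - set(ranked[:drop])
    return {
        name: (
            enrichment_p(set(hits), members, background),
            enrichment_p(reduced, members, background),
        )
        for name, members in gene_sets.items()
    }


# Toy data (hypothetical): g0 is annotated to every set, i.e. "multifunctional".
background = {f"g{i}" for i in range(20)}
gene_sets = {
    "A": {"g0", "g1", "g2"},
    "B": {"g0", "g3", "g4", "g5"},
    "C": {"g0", "g6"},
}
hits = {"g0", "g1", "g2", "g10"}

result = robustness(hits, gene_sets, background)
for name, (p_full, p_reduced) in sorted(result.items()):
    print(f"{name}: p={p_full:.4f} -> p without top gene={p_reduced:.4f}")
```

In this toy setup, a set whose only overlap with the hit list is the multifunctional gene loses all of its support once that gene is removed, while a set supported by several hits is affected less. A large shift from removing a single gene is the kind of fragile significance the abstract warns about.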

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8fb8/5389513/3baa6e7f4490/gkw957fig1.jpg
