XGSA: A statistical method for cross-species gene set analysis.

Author information

Djordjevic Djordje, Kusumi Kenro, Ho Joshua W K

Affiliations

Victor Chang Cardiac Research Institute, Darlinghurst, NSW 2010, Australia; St Vincent's Clinical School, University of New South Wales Australia, Darlinghurst, NSW 2010, Australia.

School of Life Sciences, Arizona State University, Tempe, AZ 85287, USA.

Publication information

Bioinformatics. 2016 Sep 1;32(17):i620-i628. doi: 10.1093/bioinformatics/btw428.

Abstract

MOTIVATION

Gene set analysis is a powerful tool for determining whether an experimentally derived set of genes is statistically significantly enriched for genes in other pre-defined gene sets, such as known pathways, gene ontology terms, or other experimentally derived gene sets. Current gene set analysis methods do not facilitate comparing gene sets across different organisms, as they do not explicitly deal with homology mapping between species. A systematic investigation of the effect of complex gene homology on cross-species gene set analysis is lacking.

RESULTS

In this study, we show that not accounting for the complex homology structure when comparing gene sets in two species can lead to false positive discoveries, especially when comparing gene sets that have complex gene homology relationships. To overcome this bias, we propose a straightforward statistical approach, called XGSA, that explicitly takes the cross-species homology mapping into consideration when doing gene set analysis. Simulation experiments confirm that XGSA can avoid false positive discoveries, while maintaining good statistical power compared to other ad hoc approaches for cross-species gene set analysis. We further demonstrate the effectiveness of XGSA with two real-life case studies that aim to discover conserved or species-specific molecular pathways involved in social challenge and vertebrate appendage regeneration.
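The abstract does not spell out how XGSA handles the homology mapping, so the following is only a minimal illustrative sketch of the general idea, not the authors' implementation. It assumes one plausible strategy: collapse genes from both species into homology groups (connected components of the homology graph) and run a hypergeometric test on groups rather than on individual genes. All gene names, the function names `homology_groups`, `hypergeom_sf` and `cross_species_enrichment`, and the toy homology table are hypothetical.

```python
from math import comb


def homology_groups(pairs):
    """Map each gene to a homology group id (connected component).

    `pairs` is an iterable of (gene_species_a, gene_species_b) homology
    links; many-to-many links end up in the same group (union-find).
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in pairs:
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[ra] = rb
    return {gene: find(gene) for gene in parent}


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items from N, of which K are marked."""
    upper = min(K, n)
    total = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return total / comb(N, n)


def cross_species_enrichment(set_a, set_b, pairs):
    """Overlap test on homology groups instead of individual genes."""
    group_of = homology_groups(pairs)
    universe = set(group_of.values())
    groups_a = {group_of[g] for g in set_a if g in group_of}
    groups_b = {group_of[g] for g in set_b if g in group_of}
    overlap = len(groups_a & groups_b)
    p = hypergeom_sf(overlap, len(universe), len(groups_a), len(groups_b))
    return overlap, p


# Toy homology table (hypothetical): one gene in species A has three
# homologs in species B, the other three genes map one-to-one.
pairs = [("hA", "zA1"), ("hA", "zA2"), ("hA", "zA3"),
         ("hB", "zB"), ("hC", "zC"), ("hD", "zD")]

overlap, p = cross_species_enrichment({"hA"}, {"zA1", "zA2", "zA3"}, pairs)
print(overlap, p)  # prints: 1 0.25
```

In this toy case, naively expanding the single species-A gene to its three homologs would count an overlap of three genes, whereas counting homology groups gives an overlap of one and an unremarkable p-value. This is the kind of inflation from complex homology that the abstract warns about.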

AVAILABILITY AND IMPLEMENTATION

The R source code for XGSA is available under a GNU General Public License at http://github.com/VCCRI/XGSA.

CONTACT

jho@victorchang.edu.au
