

Exploiting Statistical Methodologies and Controlled Vocabularies for Prioritized Functional Analysis of Genomic Experiments: the StRAnGER Web Application.

Author information

Chatziioannou Aristotelis A, Moulos Panagiotis

Affiliation

Institute of Biological Research and Biotechnology, National Hellenic Research Foundation Athens, Greece.

Publication information

Front Neurosci. 2011 Jan 26;5:8. doi: 10.3389/fnins.2011.00008. eCollection 2011.

Abstract

StRAnGER is a web application for the automated statistical analysis of annotated gene profiling experiments that exploits controlled biological vocabularies such as the Gene Ontology or KEGG pathway terms. Starting from annotated lists of differentially expressed genes and the gene enrichment scores of the terms of each vocabulary, StRAnGER repartitions and reorders the initial distribution of terms to define a new distribution of elements. Each element pools the terms that hold the same enrichment score. The new distribution is sorted in decreasing order, from left to right, by the observation score of the elements; elements with the same observation score are then sorted in decreasing order of their enrichment scores. Bootstrapping techniques yield a corrected measure of the statistical significance of these elements, which enables the selection of the terms mapped to them that are unambiguously associated with the respective significant gene sets. The selected terms are protected against a bias that infiltrates statistical enrichment analyses and produces technically very high statistical scores because of the finite nature of the data population. Besides a high statistical score, a further selection criterion for the terms is the number of their members, which introduces a biological prioritization in line with a Systems Biology context. The output is a detailed ranked list of significant terms, which constitutes a starting point for further functional analysis.
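
The pooling and reordering steps described in the abstract can be illustrated with a minimal Python sketch. It is not StRAnGER's implementation. The abstract does not define the "observation score", so the sketch assumes it is the number of terms pooled into an element. The function names, the toy term identifiers, and the bootstrap statistic (the mean resampled count per element) are all hypothetical placeholders, and the statistic is not the corrected significance measure the application derives.

```python
import random
from collections import defaultdict


def pool_terms(term_scores):
    """Group terms that share the same enrichment score into one element."""
    elements = defaultdict(list)
    for term, score in term_scores.items():
        elements[score].append(term)
    return dict(elements)


def order_elements(elements):
    """Sort elements by observation score (ASSUMED: number of pooled terms),
    descending; ties are broken by enrichment score, also descending."""
    return sorted(elements.items(),
                  key=lambda kv: (len(kv[1]), kv[0]),
                  reverse=True)


def bootstrap_mean_counts(term_scores, n_resamples=1000, seed=0):
    """Resample the term scores with replacement and return, per element,
    the mean number of times its score was drawn (placeholder statistic)."""
    rng = random.Random(seed)
    scores = list(term_scores.values())
    totals = defaultdict(int)
    for _ in range(n_resamples):
        for s in rng.choices(scores, k=len(scores)):
            totals[s] += 1
    return {s: totals[s] / n_resamples for s in set(scores)}


if __name__ == "__main__":
    # Toy input: hypothetical term identifiers mapped to enrichment scores.
    toy = {"T1": 0.01, "T2": 0.01, "T3": 0.05, "T4": 0.2,
           "T5": 0.05, "T6": 0.01, "T7": 0.3}
    for score, terms in order_elements(pool_terms(toy)):
        print(score, len(terms), terms)
    print(bootstrap_mean_counts(toy))
```

In this sketch the two-level sort key mirrors the ordering rule in the abstract: observation score first, enrichment score second, both decreasing.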


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6581/3032379/30519d08442c/fnins-05-00008-g001.jpg
