Statistical Test of Expression Pattern (STEPath): a new strategy to integrate gene expression data with genomic information in individual and meta-analysis studies.

Affiliation

CRIBI Biotechnology Centre, Department of Biology, University of Padova, via U. Bassi 58/B, 35121 Padova, Italy.

Publication information

BMC Bioinformatics. 2011 Apr 11;12:92. doi: 10.1186/1471-2105-12-92.

Abstract

BACKGROUND

Over the past decades, microarray technology has become widespread, leading to a dramatic increase in publicly available datasets. The first statistical tools focused on identifying significantly differentially expressed genes. Later, researchers moved toward the systematic integration of gene expression profiles with additional biological information, such as chromosomal location, ontological annotations or sequence features. Analysing gene expression in relation to the physical location of genes on chromosomes allows the identification of transcriptionally imbalanced regions, while gene set analysis focuses on detecting coordinated changes in transcriptional levels among sets of biologically related genes. In this field, meta-analysis makes it possible to compare different studies that address the same biological question, and thus to fully exploit public gene expression datasets.

RESULTS

We describe STEPath, a method that starts from gene expression profiles and integrates the analysis of imbalanced regions as an a priori step before performing gene set analysis. Applied to individual studies, STEPath produces gene set scores weighted by chromosomal activation. As a final step, we propose a way to compare these scores across different studies (meta-analysis) on related biological issues. One complication with meta-analysis is batch effects, which occur because molecular measurements are affected by laboratory conditions, reagent lots and personnel differences. Major problems arise when batch effects are correlated with an outcome of interest and lead to incorrect conclusions. We evaluated the power of combining chromosome mapping and gene set enrichment analysis by analysing a leukaemia dataset (as an example of an individual study) and a dataset of skeletal muscle diseases (as a meta-analysis approach). In leukaemia, we identified the Hox gene set, a gene set closely related to the pathology that other gene set analysis algorithms do not identify, while the meta-analysis approach on muscular diseases discriminates between related pathologies and correlates similar ones from different studies.
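The idea of weighting gene set scores by chromosomal activation can be illustrated with a minimal sketch. This is not the STEPath algorithm, whose statistics are not given in the abstract; it assumes a hypothetical scheme in which each gene's weight is the mean absolute statistic of its chromosomal neighbourhood, and a gene set score is the weighted mean of its members' statistics. The function names, the `window` parameter and the weighting rule are all assumptions made for illustration.

```python
from statistics import mean


def region_weights(gene_stats, positions, window=1):
    """Hypothetical regional weight for each gene (illustration only).

    The weight is the mean absolute statistic of the gene and of up to
    `window` neighbours on each side along a single chromosome.
    """
    ordered = sorted(gene_stats, key=lambda g: positions[g])
    weights = {}
    for i, gene in enumerate(ordered):
        lo = max(0, i - window)
        hi = min(len(ordered), i + window + 1)
        weights[gene] = mean(abs(gene_stats[g]) for g in ordered[lo:hi])
    return weights


def gene_set_score(gene_set, gene_stats, weights):
    """Weighted mean of the gene-level statistics of one gene set."""
    members = [g for g in gene_set if g in gene_stats]
    total_weight = sum(weights[g] for g in members)
    if not members or total_weight == 0:
        return 0.0
    return sum(weights[g] * gene_stats[g] for g in members) / total_weight


# Toy data (invented): per-gene statistics and positions on one chromosome.
toy_stats = {"A": 2.0, "B": 2.0, "C": 0.0, "D": -1.0}
toy_positions = {"A": 1, "B": 2, "C": 3, "D": 4}

toy_weights = region_weights(toy_stats, toy_positions, window=1)
toy_score = gene_set_score(["A", "D"], toy_stats, toy_weights)
```

In this toy case, gene A sits in a coherently activated neighbourhood and therefore contributes more to the set score than gene D, whose neighbours show little signal.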

CONCLUSIONS

STEPath is a new method that integrates gene expression profiles, co-expressed genomic regions and information about the biological function of genes. Using STEPath-computed gene set scores overcomes batch effects in meta-analysis approaches, allowing the direct comparison of different pathologies and different studies at the level of gene set activation.
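Comparing studies at the gene set level can be sketched as a rank correlation between per-study score vectors. The abstract does not say how the authors compare scores, so the Spearman correlation used here is an assumption, chosen because ranks are unaffected by a study-wide shift or rescaling of the scores. Apart from the Hox gene set, the gene set names and all score values are invented.

```python
def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(scores_a, scores_b):
    """Spearman correlation over the gene sets scored in both studies."""
    shared = sorted(set(scores_a) & set(scores_b))
    return pearson(
        ranks([scores_a[s] for s in shared]),
        ranks([scores_b[s] for s in shared]),
    )


# Invented gene set scores for three hypothetical studies.
study_1 = {"HOX": 2.0, "MYO": -1.0, "CYC": 0.5}
study_2 = {"HOX": 20.0, "MYO": -3.0, "CYC": 1.0}   # same ordering, other scale
study_3 = {"HOX": -5.0, "MYO": 4.0, "CYC": 0.0}    # reversed ordering

similar = spearman(study_1, study_2)
opposite = spearman(study_1, study_3)
```

Here `study_1` and `study_2` correlate perfectly despite their different scales, while `study_3` is anticorrelated with both. A pairwise matrix of such correlations could then feed a clustering of pathologies, although whether that matches the published procedure is not stated in the abstract.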
