BABAR: an R package to simplify the normalisation of common reference design microarray-based transcriptomic datasets.

Author information

Foodborne Bacterial Pathogens, Institute of Food Research, Norwich Research Park, Norwich, NR4 7UA, UK.

Publication information

BMC Bioinformatics. 2010 Feb 3;11:73. doi: 10.1186/1471-2105-11-73.

Abstract

BACKGROUND

The development of DNA microarrays has facilitated the generation of hundreds of thousands of transcriptomic datasets. The use of a common reference microarray design allows existing transcriptomic data to be readily compared and re-analysed in the light of new data, and the combination of this design with large datasets is ideal for 'systems'-level analyses. One issue is that these datasets are typically collected over many years and may be heterogeneous: they can contain different microarray file formats, gene array layouts and dye-swaps, and the scale of the log2-ratios of expression can vary between microarrays. Excellent software exists for the normalisation and analysis of microarray data, but many data have yet to be analysed because existing methods struggle with heterogeneous datasets; the available options include normalising microarrays on an individual or experimental-group basis. Our solution was to develop the Batch Anti-Banana Algorithm in R (BABAR), an algorithm and software package that uses cyclic loess to normalise across the complete dataset. We have already used BABAR to analyse the function of Salmonella genes involved in the infection of mammalian cells.
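As a rough illustration of the cyclic loess idea named above, the following Python sketch normalises a set of arrays pairwise. For each pair it fits a smooth curve to the difference (M) against the average (A) of the log2 values, removes half of that curve from each array, and cycles over all pairs a few times. This is a generic procedure written for illustration, not BABAR's R implementation: the function names `loess_fit` and `cyclic_loess`, the `span` and `n_iter` defaults, and the simulated data are all assumptions.

```python
import random


def loess_fit(x, y, span=0.7):
    """Local linear regression with tricube weights, evaluated at every x.

    A simplified stand-in for a full loess smoother (no robustness iterations).
    """
    n = len(x)
    k = min(n, max(3, int(span * n + 0.5)))  # neighbours used per local fit
    fitted = []
    for x0 in x:
        dist = [abs(xi - x0) for xi in x]
        h = sorted(dist)[k - 1]  # bandwidth: distance to the k-th nearest point
        if h > 0:
            w = [max(0.0, 1 - (d / h) ** 3) ** 3 for d in dist]
        else:
            w = [1.0 if d == 0 else 0.0 for d in dist]
        sw = sum(w)
        xm = sum(wi * xi for wi, xi in zip(w, x)) / sw
        ym = sum(wi * yi for wi, yi in zip(w, y)) / sw
        sxx = sum(wi * (xi - xm) ** 2 for wi, xi in zip(w, x))
        sxy = sum(wi * (xi - xm) * (yi - ym) for wi, xi, yi in zip(w, x, y))
        slope = sxy / sxx if sxx > 0 else 0.0
        fitted.append(ym + slope * (x0 - xm))
    return fitted


def cyclic_loess(arrays, span=0.7, n_iter=2):
    """Pairwise cyclic loess over a list of arrays (one list of log2 values each)."""
    x = [list(col) for col in arrays]  # work on a copy
    for _ in range(n_iter):
        for i in range(len(x) - 1):
            for j in range(i + 1, len(x)):
                m = [p - q for p, q in zip(x[i], x[j])]        # difference
                a = [(p + q) / 2 for p, q in zip(x[i], x[j])]  # average
                curve = loess_fit(a, m, span)
                # Split the fitted trend evenly between the two arrays.
                x[i] = [p - c / 2 for p, c in zip(x[i], curve)]
                x[j] = [q + c / 2 for q, c in zip(x[j], curve)]
    return x


# Simulated example (hypothetical data): three arrays measuring the same
# genes, one with a curved, intensity-dependent bias and one with an offset.
random.seed(0)
base = [random.uniform(6, 14) for _ in range(150)]
raw = [
    [b + random.gauss(0, 0.1) for b in base],
    [b + 0.5 + 0.05 * (b - 10) ** 2 + random.gauss(0, 0.1) for b in base],
    [b - 0.3 + random.gauss(0, 0.1) for b in base],
]
normalised = cyclic_loess(raw)
```

Each pairwise step leaves the per-gene sum across arrays unchanged, so the procedure only redistributes intensity-dependent differences between arrays rather than shifting the dataset as a whole.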

RESULTS

The only input required by BABAR is unprocessed GenePix or BlueFuse microarray data files. BABAR provides a combination of 'within' and 'between' microarray normalisation steps and diagnostic boxplots. When applied to a real heterogeneous dataset, BABAR normalised the dataset to produce a comparable scaling between the microarrays, with the microarray data in excellent agreement with RT-PCR analysis. When applied to a real non-heterogeneous dataset and a simulated dataset, BABAR's performance in identifying differentially expressed genes showed some benefits over standard techniques.
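To make the terms in this paragraph concrete, the sketch below shows, in Python, how a log2-ratio can be computed per spot, how a dye-swapped array can be brought into the same orientation, and how the five-number summary behind a diagnostic boxplot can be obtained. It is an illustration only and does not reproduce BABAR's processing: the helper names are hypothetical, the convention that the sample is in the red channel and the common reference in green (unless dye-swapped) is an assumption, and median-centring is used merely as a simple stand-in for a 'within' microarray normalisation step.

```python
import math
import statistics


def log_ratios(red, green, dye_swap=False):
    """log2(sample / reference) for each spot.

    Assumption: sample in red, reference in green, unless the array is
    dye-swapped, in which case the sign of the ratio is flipped.
    """
    sign = -1.0 if dye_swap else 1.0
    return [sign * math.log2(r / g) for r, g in zip(red, green)]


def median_centre(m):
    """Shift an array's log-ratios so that their median is zero."""
    med = statistics.median(m)
    return [v - med for v in m]


def boxplot_summary(m):
    """Five-number summary of the kind a diagnostic boxplot displays."""
    q1, q2, q3 = statistics.quantiles(m, n=4, method="inclusive")
    return {"min": min(m), "q1": q1, "median": q2, "q3": q3, "max": max(m)}


# Hypothetical intensities for four spots on a standard and a dye-swapped array.
sample = [200, 400, 800, 1600]
reference = [100, 400, 400, 400]
array_a = log_ratios(sample, reference)                 # sample in red
array_b = log_ratios(reference, sample, dye_swap=True)  # sample in green
summary = boxplot_summary(median_centre(array_a))
```

Comparing such summaries across all microarrays before and after normalisation is one simple way to check whether the arrays have reached a comparable scale.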

CONCLUSIONS

BABAR is an easy-to-use software tool that simplifies the simultaneous normalisation of heterogeneous two-colour common reference design cDNA microarray-based transcriptomic datasets. We show that BABAR transforms real and simulated datasets to allow for the correct interpretation of these data, and that it is an ideal tool to facilitate the identification of differentially expressed genes or network inference analysis from transcriptomic datasets.
