BMC Bioinformatics. 2013;14 Suppl 14(Suppl 14):S4. doi: 10.1186/1471-2105-14-S14-S4. Epub 2013 Oct 9.
Transcriptome analysis by microarrays has produced important advances in biomedicine. In multiple myeloma (MM), for instance, microarray approaches led to an effective disease subtyping scheme based on cluster assignment and to a 70-gene risk score. Both have improved the molecular understanding of MM and have provided prognostic information for clinical management. Many researchers are now transitioning to next-generation sequencing (NGS) approaches, and RNA-seq in particular, because of its discovery-based nature and its improved sensitivity and dynamic range. RNA-seq also allows the analysis of gene isoforms, splice variants, and novel gene fusions. Given the large volume of historical microarray data, there is now a need to associate and integrate microarray and RNA-seq data via advanced bioinformatic approaches.
Custom software was developed following a model-view-controller (MVC) approach to integrate Affymetrix probe set IDs and gene annotation information from a variety of sources. The tool employs several strategies to integrate, cross-reference, and associate microarray and RNA-seq datasets.
Output from a variety of transcriptome reconstruction and quantitation tools (e.g., Cufflinks) can be directly integrated and/or associated with Affymetrix probe set data, as well as with the necessary gene identifiers and/or symbols from diverse sources. Strategies are employed to maximize annotation and cross-referencing. Custom gene sets (e.g., the MM 70-gene risk score, GEP-70) can be specified, and the tool can be incorporated directly into an RNA-seq pipeline.
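The association step described above can be illustrated with a minimal Python sketch. It is not the published tool: the function names, the two-column probe annotation layout (`probe_set_id`, `gene_symbol`), and the choice to join on upper-cased gene symbols are all assumptions made for illustration. The only detail taken from a real format is that a Cufflinks `genes.fpkm_tracking` file is tab-delimited and carries `gene_short_name` and `FPKM` columns.

```python
import csv
from collections import defaultdict


def load_probe_map(handle):
    """Build gene symbol -> [probe set IDs] from a tab-delimited annotation.

    Assumed (hypothetical) columns: probe_set_id, gene_symbol.
    """
    symbol_to_probes = defaultdict(list)
    for row in csv.DictReader(handle, delimiter="\t"):
        # Upper-case symbols so the join is case-insensitive (an assumption).
        symbol_to_probes[row["gene_symbol"].upper()].append(row["probe_set_id"])
    return symbol_to_probes


def associate(fpkm_handle, symbol_to_probes, gene_set=None):
    """Attach probe set IDs to each gene in a Cufflinks-style FPKM table.

    If gene_set is given (e.g. a custom risk-score gene list), only those
    genes are kept. Genes with no matching probe set get an empty list.
    """
    wanted = {g.upper() for g in gene_set} if gene_set else None
    records = []
    for row in csv.DictReader(fpkm_handle, delimiter="\t"):
        symbol = row["gene_short_name"].upper()
        if wanted is not None and symbol not in wanted:
            continue
        records.append({
            "gene_symbol": symbol,
            "fpkm": float(row["FPKM"]),
            "probe_set_ids": sorted(symbol_to_probes.get(symbol, [])),
        })
    return records
```

A usage example with placeholder file names and placeholder gene symbols (neither comes from the paper):

```python
with open("probe_annotation.tsv") as ann, open("genes.fpkm_tracking") as fpkm:
    probe_map = load_probe_map(ann)
    for rec in associate(fpkm, probe_map, gene_set=["GENE_A", "GENE_B"]):
        print(rec["gene_symbol"], rec["fpkm"], ",".join(rec["probe_set_ids"]))
```

Joining on gene symbol alone is the simplest possible strategy; symbols can be ambiguous or outdated, which is presumably why the tool draws identifiers from several annotation sources.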
A novel bioinformatic approach that facilitates both the annotation of historical microarray data and its association with richer RNA-seq data is now assisting the study of MM cancer biology.