Computational Systems Biology Laboratory, Institute of Biomedicine and Genome-Scale Biology Research Program, University of Helsinki, Haartmaninkatu 8, Helsinki, FIN-00014, Finland.
Genome Med. 2010 Sep 7;2(9):65. doi: 10.1186/gm186.
Coordinated efforts to collect large-scale data sets provide a basis for systems level understanding of complex diseases. In order to translate these fragmented and heterogeneous data sets into knowledge and medical benefits, advanced computational methods for data analysis, integration and visualization are needed.
We introduce a novel data integration framework, Anduril, for translating fragmented large-scale data into testable predictions. The Anduril framework allows rapid integration of heterogeneous data with state-of-the-art computational methods and existing knowledge in bio-databases. Anduril automatically generates thorough summary reports and a website that shows the most relevant features of each gene at a glance, allows sorting of data based on different parameters, and provides direct links to more detailed data on genes, transcripts or genomic regions. Anduril is open-source; all methods and documentation are freely available.
We used Anduril to integrate multidimensional molecular and clinical data from 338 subjects with glioblastoma multiforme, one of the deadliest and most poorly understood cancers. The central objective of our approach is to identify genetic loci and genes that have a significant effect on survival. Our results suggest several novel genetic alterations linked to glioblastoma multiforme progression and, more specifically, reveal Moesin as a novel glioblastoma multiforme-associated gene that has a strong survival effect and whose depletion in vitro significantly inhibited cell proliferation. All analysis results are available as a comprehensive website.
Our results demonstrate that integrated analysis and visualization of multidimensional and heterogeneous data with Anduril makes it possible to draw conclusions about the functional consequences of large-scale molecular data. Many of the identified genetic loci and genes with a significant survival effect have not previously been reported in the context of glioblastoma multiforme. Thus, in addition to a generally applicable novel methodology, our results provide several glioblastoma multiforme candidate genes for further study.
Anduril is available at http://csbi.ltdk.helsinki.fi/anduril/
The glioblastoma multiforme analysis results are available at http://csbi.ltdk.helsinki.fi/anduril/tcga-gbm/