a Centre for Convergence of Nano-, Bio-, Information and Cognitive Sciences and Technologies, National Research Centre "Kurchatov Institute", Moscow, Russia.
b Department of R&D, First Oncology Research and Advisory Center, Moscow, Russia.
Cell Cycle. 2017 Oct 2;16(19):1810-1823. doi: 10.1080/15384101.2017.1361068. Epub 2017 Aug 21.
High-throughput technologies opened a new era in biomedicine by enabling massive analysis of gene expression at both the RNA and protein levels. Unfortunately, expression data obtained in different experiments are often poorly compatible, even for the same biological samples. Here, using experimental and bioinformatic investigation of major experimental platforms, we show that aggregating gene expression data at the level of molecular pathways helps to diminish the cross- and intra-platform bias that is otherwise clearly seen at the level of individual genes. We created a mathematical model of cumulative suppression of data variation that predicts the ideal parameters and the optimal size of a molecular pathway. We compared 5 alternative methods by their ability to aggregate experimental molecular data and by their capacity to retain meaningful features of biological samples. The bioinformatic method OncoFinder showed optimal performance in both tests and should be very useful for future cross-platform data analyses.
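The idea that pathway-level aggregation suppresses platform-related variation can be illustrated with a toy simulation. The sketch below is not the authors' mathematical model and not OncoFinder. It assumes, purely for illustration, that each gene carries independent Gaussian measurement noise on each of two platforms and that a pathway score is the plain mean of its genes; the function name and all parameters are hypothetical.

```python
import random
import statistics


def simulate_cross_platform_sd(pathway_size, n_pathways=2000, noise_sd=1.0, seed=0):
    """Return (gene_sd, pathway_sd): spread of the platform A minus platform B
    difference at the gene level and at the pathway level.

    Assumptions (illustrative only): independent Gaussian noise per gene and
    per platform, and mean aggregation over the genes of a pathway.
    """
    rng = random.Random(seed)
    gene_diffs = []
    pathway_diffs = []
    for _ in range(n_pathways):
        diffs = []
        for _ in range(pathway_size):
            true_value = rng.gauss(0.0, 1.0)  # shared biological signal
            platform_a = true_value + rng.gauss(0.0, noise_sd)
            platform_b = true_value + rng.gauss(0.0, noise_sd)
            diffs.append(platform_a - platform_b)
        gene_diffs.extend(diffs)
        # Difference of the two pathway means equals the mean of the gene differences.
        pathway_diffs.append(statistics.fmean(diffs))
    return statistics.pstdev(gene_diffs), statistics.pstdev(pathway_diffs)


if __name__ == "__main__":
    for size in (1, 4, 16, 64):
        gene_sd, pathway_sd = simulate_cross_platform_sd(size)
        print(f"pathway size {size:3d}: gene-level SD {gene_sd:.2f}, "
              f"pathway-level SD {pathway_sd:.2f}")
```

Under these simplified assumptions the cross-platform spread of the pathway score shrinks roughly with the square root of the pathway size. The toy setup has no term that would penalize large pathways, so it does not reproduce the optimal pathway size that the abstract attributes to the authors' model.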