Department of Chemical, Biochemical, and Environmental Engineering, University of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore, MD, 21250, USA.
Department of Chemical and Biomolecular Engineering, University of Connecticut, Storrs, CT, USA.
BMC Bioinformatics. 2024 Sep 27;25(1):312. doi: 10.1186/s12859-024-05938-9.
Derivative profiling is a novel approach for identifying differential signals in dynamic omics data sets. The approach applies variable step-size differentiation to time-dynamic omics data. This work assumes that a general omics derivative exists and that it is a useful, descriptive feature of dynamic omics experiments. We assert that this omics derivative, or omics flux, is a valuable descriptor that can be used instead of, or alongside, fold change calculations.
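As a rough illustration of what variable step-size differentiation can mean for a single signal trajectory, the sketch below computes the slope between each pair of consecutive time points when the sampling interval is uneven, plus one overall slope from the first to the last point. It is a minimal sketch under stated assumptions and not the DPoP implementation: the function names `stepwise_derivative` and `overall_slope`, the choice of a simple forward difference, and the sample values are all hypothetical.

```python
def stepwise_derivative(times, values):
    """Slope between each pair of consecutive samples (uneven spacing allowed).

    Assumption: a plain forward difference; the published method may use a
    different finite-difference scheme.
    """
    if len(times) != len(values):
        raise ValueError("times and values must have the same length")
    if len(times) < 2:
        raise ValueError("at least two time points are required")
    slopes = []
    for i in range(len(times) - 1):
        dt = times[i + 1] - times[i]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        slopes.append((values[i + 1] - values[i]) / dt)
    return slopes


def overall_slope(times, values):
    """Net rate of change from the first to the last sample."""
    return (values[-1] - values[0]) / (times[-1] - times[0])


# Hypothetical expression values for one gene, sampled at uneven times (minutes).
times = [0.0, 10.0, 30.0, 60.0, 120.0]
values = [1.0, 2.0, 2.0, 5.0, 2.0]

slopes = stepwise_derivative(times, values)   # one slope per interval
net = overall_slope(times, values)            # single first-to-last slope
```

Here the per-interval slopes expose the rise and the later fall of the signal, while the net slope is close to zero; this is the kind of difference between instantaneous and overall behavior that a derivative-based descriptor can capture.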
The results of derivative profiling are compared with those of established methods, namely Multivariate Adaptive Regression Splines, significance versus fold change (Volcano) analysis, and an adjusted ratio over intensity (M/A) analysis, and the similarity between the results is found to be statistically significant. This comparison is repeated for transcriptomic and phosphoproteomic expression profiles previously characterized in Aspergillus nidulans. The method has been packaged in an open-source, GUI-based MATLAB app, the Derivative Profiling omics Package (DPoP). Gene Ontology (GO) term enrichment is included in the app so that a user can automatically or programmatically describe the over- and under-represented GO terms in the derivative profiling results, using the domain-specific knowledge found in their organism's GO database file. The advantages of the DPoP analysis are that it is computationally inexpensive, it does not require fold change calculations, it describes both instantaneous and overall behavior, and it achieves statistical confidence with signal trajectories of a single bio-replicate over four or more time points.
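To make the idea of over- and under-represented GO terms concrete, the sketch below scores one GO term with a hypergeometric test, a common choice for enrichment analysis. This is an assumption for illustration only: the abstract does not state which statistical test DPoP uses, and the function names and counts are hypothetical.

```python
from math import comb


def hypergeom_pmf(k, population, annotated, selected):
    """P(exactly k annotated genes among the selected genes)."""
    return (comb(annotated, k)
            * comb(population - annotated, selected - k)
            / comb(population, selected))


def enrichment_p_values(k, population, annotated, selected):
    """Return (p_over, p_under) for observing k annotated genes in the selection.

    p_over  = P(X >= k), small when the term is over-represented.
    p_under = P(X <= k), small when the term is under-represented.
    """
    lowest = max(0, selected - (population - annotated))
    highest = min(annotated, selected)
    if not lowest <= k <= highest:
        raise ValueError("k is outside the range of possible overlaps")
    p_over = sum(hypergeom_pmf(i, population, annotated, selected)
                 for i in range(k, highest + 1))
    p_under = sum(hypergeom_pmf(i, population, annotated, selected)
                  for i in range(lowest, k + 1))
    return p_over, p_under


# Hypothetical counts: 20 genes measured, 5 carry the GO term,
# 4 genes were flagged by the analysis, and 3 of those carry the term.
p_over, p_under = enrichment_p_values(k=3, population=20, annotated=5, selected=4)
```

In practice the p-values for all tested GO terms would also need a multiple-testing correction; that step is omitted here to keep the sketch short.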
While we apply this method to time-dynamic transcriptomic and phosphoproteomic datasets, it is a numerically generalizable technique that can be applied to any organism and to any field interested in time series data analysis. The app described in this work enables omics researchers with no computer science background to apply derivative profiling to their data sets, while also allowing multidisciplinary users to build on the nascent idea of profiling derivatives in omics.