School of Mathematics and Statistics, Henan University of Science and Technology, Luoyang, 471000, China.
Longmen Laboratory, Luoyang, 471003, Henan, China.
BMC Bioinformatics. 2024 Jun 15;25(1):215. doi: 10.1186/s12859-024-05836-0.
Complex biological processes undergo a critical transition, or tipping point, which is usually accompanied by catastrophic consequences. Identifying the tipping point or critical state is therefore important for preventing or delaying those consequences. However, predicting the critical state from high-dimensional, small-sample data is difficult, especially for single-cell expression data.
In this study, we propose the comprehensive neighbourhood-based perturbed mutual information (CPMI) method to detect the critical states of complex biological processes. The CPMI method takes into account the relationship between each gene and its neighbours, which reduces noise and enhances robustness. We apply the method to one simulated dataset and six real datasets: an influenza dataset, two single-cell expression datasets and three bulk datasets. The method not only detects the tipping points but also identifies their dynamic network biomarkers (DNBs). In addition, the discovery of transcription factors (TFs) that can regulate DNB genes and nondifferential 'dark genes' validates the effectiveness of our method. Numerical simulation verifies that the CPMI method is robust under different noise strengths and is superior to existing methods in identifying critical states.
In conclusion, we propose a robust computational method, CPMI, that is applicable to both bulk and single-cell datasets. The CPMI method holds great potential for providing early warning signals for complex biological processes and enabling early disease diagnosis.
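The abstract does not give the CPMI formula, so the following is only a minimal sketch of the general idea it names: measuring how much the mutual information between a gene and its neighbours changes when a single new sample perturbs a set of reference samples. The histogram (plug-in) estimator, the function names `mutual_information` and `perturbed_mi_score`, the `neighbours` dictionary, and the bin count are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def mutual_information(x, y, bins=4):
    """Histogram (plug-in) estimate of mutual information, in nats.

    Assumption: equal-width binning; the paper may use another estimator.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)   # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)   # marginal of y, shape (1, bins)
    nz = pxy > 0                          # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


def perturbed_mi_score(reference, sample, neighbours, bins=4):
    """Mean absolute change in gene-neighbour MI caused by adding one sample.

    reference  : array of shape (n_reference_samples, n_genes)
    sample     : array of shape (n_genes,), the sample under test
    neighbours : dict mapping a gene index to a list of neighbour gene
                 indices (hypothetical input; e.g. from a gene network)
    """
    perturbed = np.vstack([reference, sample])
    changes = []
    for gene, nbrs in neighbours.items():
        for nbr in nbrs:
            before = mutual_information(reference[:, gene], reference[:, nbr], bins)
            after = mutual_information(perturbed[:, gene], perturbed[:, nbr], bins)
            changes.append(abs(after - before))
    return float(np.mean(changes)) if changes else 0.0
```

Under these assumptions, a score computed for each time point or stage could be compared across the series, with a sharp rise read as a candidate early warning signal; how CPMI actually aggregates and thresholds such scores is described in the full paper, not in this sketch.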