Park Heewon, Miyano Satoru
School of Mathematics, Statistics and Data Science, Sungshin Women's University, Seoul, Republic of Korea.
M&D Data Science Center, Tokyo Medical and Dental University, 1-5-45 Yushima, Bunkyo-ku, Tokyo, Japan.
PLoS One. 2025 May 8;20(5):e0321549. doi: 10.1371/journal.pone.0321549. eCollection 2025.
Unraveling the gene regulatory networks that underlie diseases is essential for understanding the intricate mechanisms of these conditions. Although various computational strategies have been developed, existing approaches to network-based prediction and classification rely on pre-estimated gene networks. A pre-estimated gene network, however, fails to yield biologically meaningful explanations for classifying cell lines into particular clinical states, because no information about the clinical status of the cell lines is used during network estimation. To achieve effective and biologically valid cell line classification, we develop a computational strategy for network-based multi-class classification, referred to as GRN-multiClassifier. The GRN-multiClassifier estimates a gene network by simultaneously minimizing the network estimation error and the negative log-likelihood function of multinomial logistic regression. That is, our strategy estimates a gene network optimized for the multi-class classification of cell lines into specific clinical conditions. Monte Carlo simulations demonstrate the efficacy of the GRN-multiClassifier. We applied our strategy to the network-based classification of acute leukemia cell lines into three distinct categories of acute leukemia, where it shows outstanding classification performance. The identified acute leukemia markers are strongly supported by the existing literature. Our findings suggest that potential pathways involving the inhibition of ACTB and the molecular interactions between "HBA1&HBB," "HBB&HBA1," "IGKV1-5&IGHV4-31," "IGHV4-31&IGKV1-5," "HLA-DRA&CD74" and "ACTB&ACTB" could offer significant insights into the underlying mechanism of acute leukemia.
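To illustrate the idea of estimating a network and a classifier under one objective, the following is a minimal sketch and not the authors' implementation. It assumes, purely for illustration, that the network is a linear model in which each gene's expression is predicted from the others through a weight matrix `B`, that the classifier takes the network-predicted expression `X @ B` as features, and that the two terms are balanced by a hypothetical weight `lam`. The function name `joint_loss`, the feature construction, and the omission of any sparsity penalty or optimizer are all assumptions; the abstract specifies only that the network estimation error and the multinomial logistic negative log-likelihood are minimized together.

```python
import numpy as np


def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def joint_loss(X, y, B, W, lam=1.0):
    """Illustrative joint objective (assumed form, not taken from the paper).

    X   : (n, p) expression matrix, n cell lines by p genes
    y   : (n,) integer class labels in 0..K-1
    B   : (p, p) network weights; column j predicts gene j from all genes
    W   : (p, K) multinomial logistic regression coefficients
    lam : hypothetical trade-off between the two terms
    """
    n = X.shape[0]

    # Network estimation error: squared error of predicting X from X @ B.
    network_error = 0.5 * np.sum((X - X @ B) ** 2) / n

    # Multinomial logistic negative log-likelihood on network-based features.
    probs = softmax((X @ B) @ W)
    nll = -np.mean(np.log(probs[np.arange(n), y] + 1e-12))

    return network_error + lam * nll, network_error, nll
```

Because both terms depend on `B`, a gradient step on the total loss pulls the network toward edges that reconstruct expression and also separate the classes, which is the coupling the abstract describes. A practical version would add a regularizer on `B` and `W` and an optimization routine, neither of which is shown here.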