Chen Bolin, Zhang Jinlei, Shao Ci, Bian Jun, Kang Ruiming, Shang Xuequn
School of Computer Science, Northwestern Polytechnical University, Xi'an, 710012, China.
Key Laboratory of Big Data Storage and Management, Northwestern Polytechnical University, Ministry of Industry and Information Technology, Xi'an, 710012, China.
BioData Min. 2024 Sep 4;17(1):30. doi: 10.1186/s13040-024-00386-w.
Identifying critical genes is important for understanding the pathogenesis of complex diseases. Traditional studies typically compare changes in biomolecules between normal and disease samples, or detect important vertices in a single static biomolecular network, and therefore often overlook the dynamic changes that occur between disease stages. Investigating temporal changes in biomolecular networks, and identifying critical genes from them, is essential for understanding the occurrence and development of diseases.
This study proposes a novel method called Quantifying Importance of Genes with Tensor Decomposition (QIGTD). It first constructs a time series network by integrating both intra-temporal and inter-temporal network information, preserving connections between networks at adjacent stages according to their local similarities. A tensor is used to describe the connections of this time series network, and a third-order tensor decomposition method is proposed to capture both the topological information of each network snapshot and the time series characteristics of the whole network. QIGTD is also a learning-free and efficient method that can be applied to datasets with a small number of samples.
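As a rough illustration of the idea of scoring genes from a third-order tensor, the following minimal sketch stacks per-stage adjacency matrices into a gene x gene x stage tensor and extracts a rank-1 CP approximation by alternating power iteration. This is not the QIGTD algorithm: the inter-temporal links between adjacent stages are omitted, the rank-1 decomposition is a swapped-in generic technique, and the function names `build_tensor` and `gene_scores` are hypothetical.

```python
import numpy as np

def build_tensor(snapshots):
    """Stack T adjacency matrices (each N x N) into an N x N x T tensor.

    Assumption: only intra-stage edges are used; the inter-stage
    connections described in the abstract are not modelled here.
    """
    return np.stack(snapshots, axis=2).astype(float)

def gene_scores(tensor, n_iter=500, tol=1e-12):
    """Rank-1 CP approximation via alternating power iteration.

    Returns (scores, stage_weights). scores averages the two node-mode
    factors, which coincide in the limit for symmetric, connected,
    non-bipartite snapshots.
    """
    if not np.any(tensor):
        raise ValueError("tensor has no edges")
    n, m, t = tensor.shape
    a = np.full(n, 1.0 / np.sqrt(n))  # factor for node mode 1
    b = np.full(m, 1.0 / np.sqrt(m))  # factor for node mode 2
    c = np.full(t, 1.0 / np.sqrt(t))  # factor for the stage mode
    for _ in range(n_iter):
        # Update each factor by contracting the tensor with the other two.
        a_new = np.einsum("ijk,j,k->i", tensor, b, c)
        a_new /= np.linalg.norm(a_new)
        b_new = np.einsum("ijk,i,k->j", tensor, a_new, c)
        b_new /= np.linalg.norm(b_new)
        c_new = np.einsum("ijk,i,j->k", tensor, a_new, b_new)
        c_new /= np.linalg.norm(c_new)
        delta = (np.linalg.norm(a_new - a) + np.linalg.norm(b_new - b)
                 + np.linalg.norm(c_new - c))
        a, b, c = a_new, b_new, c_new
        if delta < tol:
            break
    return (a + b) / 2.0, c

def adjacency(n, edges):
    """Symmetric 0/1 adjacency matrix from an undirected edge list."""
    mat = np.zeros((n, n))
    for i, j in edges:
        mat[i, j] = mat[j, i] = 1.0
    return mat

# Toy example: 4 genes, 2 stages; gene 0 is linked to every other gene
# in both stages.
stage1 = adjacency(4, [(0, 1), (0, 2), (1, 2), (0, 3)])
stage2 = adjacency(4, [(0, 1), (0, 2), (0, 3), (2, 3)])
scores, stage_weights = gene_scores(build_tensor([stage1, stage2]))
ranking = np.argsort(-scores)  # gene indices, most important first
```

Because the sketch needs no training data, only the network snapshots, it reflects the learning-free character the abstract describes; how QIGTD itself turns its decomposition into gene scores is not stated here.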
The effectiveness of QIGTD was evaluated on lung adenocarcinoma (LUAD) datasets, with three state-of-the-art methods, T-degree, T-closeness, and T-betweenness, used as benchmarks. Numerical experiments demonstrate that QIGTD outperforms these methods in terms of both precision and mean average precision (mAP). Notably, of the top 50 genes, 29 have been verified to be highly related to LUAD according to the DisGeNET database, and 36 are significantly enriched in LUAD-related Gene Ontology (GO) terms, including nuclear division, mitotic nuclear division, chromosome segregation, organelle fission, and mitotic sister chromatid segregation.
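For readers unfamiliar with the ranking metrics, the sketch below shows one common way to compute precision at a cutoff and average precision for a ranked gene list against a reference set. The gene identifiers are placeholders, and the normalisation of average precision by the number of hits in the list is an assumption; the abstract does not state exactly how precision and mAP were computed.

```python
def precision_at_k(ranked, relevant, k):
    """Fraction of the top-k ranked items that are in the reference set."""
    top = ranked[:k]
    return sum(1 for gene in top if gene in relevant) / k

def average_precision(ranked, relevant):
    """Mean of precision values taken at each rank where a hit occurs.

    Assumption: normalised by the number of hits found in `ranked`,
    not by the size of the full reference set.
    """
    hits = 0
    total = 0.0
    for rank, gene in enumerate(ranked, start=1):
        if gene in relevant:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0

# Placeholder identifiers, not real genes or results.
ranked_genes = ["G1", "G2", "G3", "G4", "G5"]
reference = {"G1", "G3", "G6"}

p_at_2 = precision_at_k(ranked_genes, reference, 2)  # 1 hit in top 2
ap = average_precision(ranked_genes, reference)      # hits at ranks 1 and 3
```

mAP is then the mean of such average precision values over whatever set of rankings or cutoffs the evaluation defines. If the DisGeNET associations are taken as the reference set, the reported 29 verified genes among the top 50 would correspond to a precision of 29/50 = 0.58 at that cutoff.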
In conclusion, QIGTD effectively captures the temporal changes in gene networks and identifies critical genes. It provides a valuable tool for studying temporal dynamics in biological networks and can aid in understanding the underlying mechanisms of diseases such as LUAD.