Shrestha Bishal, Siciliano Andrew Jordan, Zhu Hao, Liu Tong, Wang Zheng
Department of Computer Science, University of Miami, Coral Gables, FL 33146, United States.
Department of Computer Science, Florida Memorial University, Miami Gardens, FL 33054, United States.
NAR Genom Bioinform. 2025 Jan 27;7(1):lqaf002. doi: 10.1093/nargab/lqaf002. eCollection 2025 Mar.
HiRES, a recently developed biochemistry experiment, simultaneously captures the chromosomal conformation and gene expression levels of individual cells. However, far fewer HiRES datasets are available to the scientific community than single-cell Hi-C datasets. A computational tool is therefore needed that can predict gene expression levels in individual cells from the single-cell Hi-C data of the same cells. We trained a graph transformer, scHiGex, that predicts gene expression levels from single-cell Hi-C data. In our benchmark, scHiGex achieved an average absolute error of 0.07. Furthermore, the predicted gene expression levels grouped cells into their distinct cell types exactly (adjusted Rand index of 1), indicating that the model captures the heterogeneity between cell types. scHiGex is freely available at https://github.com/zwang-bioinformatics/scHiGex.
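The two evaluation measures named in the abstract, average absolute error and the adjusted Rand index, can be illustrated with a minimal standard-library sketch. This is not the scHiGex evaluation code: the function names and toy values below are hypothetical, and the abstract does not state how expression values were normalized or how cells were grouped before scoring.

```python
from collections import Counter
from math import comb


def mean_absolute_error(y_true, y_pred):
    """Average of |true - predicted| over paired expression values."""
    if len(y_true) != len(y_pred):
        raise ValueError("inputs must have the same length")
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def adjusted_rand_index(labels_a, labels_b):
    """Adjusted Rand index between two labelings of the same cells."""
    if len(labels_a) != len(labels_b):
        raise ValueError("inputs must have the same length")
    total_pairs = comb(len(labels_a), 2)
    if total_pairs == 0:
        return 1.0  # fewer than two cells: the labelings trivially agree
    # Pairs of cells that share a group in both labelings.
    together = sum(comb(c, 2) for c in Counter(zip(labels_a, labels_b)).values())
    pairs_a = sum(comb(c, 2) for c in Counter(labels_a).values())
    pairs_b = sum(comb(c, 2) for c in Counter(labels_b).values())
    expected = pairs_a * pairs_b / total_pairs  # agreement expected by chance
    maximum = (pairs_a + pairs_b) / 2
    if maximum == expected:
        return 1.0  # degenerate case, e.g. both labelings are one group
    return (together - expected) / (maximum - expected)


# Toy values for illustration only; they are not taken from the paper.
true_expr = [0.10, 0.50, 0.90, 0.30]
pred_expr = [0.15, 0.45, 1.00, 0.20]
mae = mean_absolute_error(true_expr, pred_expr)

cell_types = ["A", "A", "B", "B", "C"]  # hypothetical annotated cell types
groups = [1, 1, 0, 0, 2]                # hypothetical grouping from predictions
ari = adjusted_rand_index(cell_types, groups)
```

The adjusted Rand index compares partitions, not label names, so a grouping that matches the annotated cell types up to relabeling scores 1, while a grouping no better than chance scores near 0.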