THGB: predicting ligand-receptor interactions by combining tree boosting and histogram-based gradient boosting.

Affiliations

School of Computer Science, Hunan University of Technology, Zhuzhou, 412007, Hunan, China.

School of Computer Science and Engineering, Hunan Institute of Technology, Hengyang, 421002, Hunan, China.

Publication information

Sci Rep. 2024 Nov 28;14(1):29604. doi: 10.1038/s41598-024-78954-7.

Abstract

Ligand-receptor interaction (LRI) prediction is important in biological and medical research and facilitates the inference and analysis of cell-to-cell communication. However, wet-lab experiments for discovering new LRIs are costly and time-consuming. Here, we propose a computational model called THGB to uncover new LRIs. THGB first extracts feature information for ligand-receptor (LR) pairs using iFeature. Next, it adopts a tree boosting model to obtain representative LR features. Finally, it devises a histogram-based gradient boosting model to capture high-quality LRIs. To assess THGB's performance, we compared it with three recent LRI prediction models (i.e., CellEnBoost, CellGiQ, and CellComNet) and one classical protein-protein interaction inference model, PIPR. The results demonstrated that THGB achieved the best overall predictions in terms of six evaluation indicators (i.e., precision, recall, accuracy, F1-score, AUC, and AUPR). To measure the effect of LR feature selection on prediction, THGB was compared with four feature selection methods (i.e., PCA, NMF, LLE, and TSVD). The results showed that the tree boosting model was better suited to selecting representative LR features and improving LRI prediction. We also conducted an ablation study and found that THGB with feature selection outperformed THGB without feature selection. We hope that THGB will be a useful tool for finding new LRIs and further inferring cell-to-cell communication.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/072e/11604971/023c0119704e/41598_2024_78954_Fig1_HTML.jpg
