Identifying potential ligand-receptor interactions based on gradient boosted neural network and interpretable boosting machine for intercellular communication analysis.

Affiliations

College of Life Science and Chemistry, Hunan University of Technology, Zhuzhou, 412007, Hunan, China.

School of Computer Science and Engineering, Hunan Institute of Technology, Hengyang, 421002, Hunan, China.

Publication information

Comput Biol Med. 2024 Mar;171:108110. doi: 10.1016/j.compbiomed.2024.108110. Epub 2024 Feb 6.

Abstract

Cell-cell communication is essential to many key biological processes and is generally mediated by ligand-receptor interactions (LRIs). A comprehensive, high-quality LRI resource can therefore significantly improve intercellular communication analysis. Meanwhile, because no "gold standard" dataset exists, evaluating LRI-mediated intercellular communication results remains a challenge. Here, we introduce CellGiQ, a high-confidence LRI prediction framework for intercellular communication analysis. High-confidence LRIs are first inferred through LRI feature extraction with BioTriangle, LRI selection using LightGBM, and LRI classification based on an ensemble of a gradient boosted neural network and an interpretable boosting machine. Subsequently, the known and newly identified high-confidence LRIs are filtered using single-cell RNA sequencing (scRNA-seq) data and applied to intercellular communication inference through a quartile scoring strategy. To validate the predictions, CellGiQ used several evaluation strategies: in terms of AUC and AUPR, it surpassed six competing LRI prediction models on four LRI datasets; through Venn diagrams and molecular docking, its predicted LRIs were validated by five other popular intercellular communication inference methods; based on overlapping LRIs, it achieved a high Jaccard index with six other state-of-the-art intercellular communication prediction tools in human HNSCC tissues; and by comparison with classical models and literature retrieval, its inferred HNSCC-related intercellular communication results were further validated. The novelty of this study lies in identifying high-confidence LRIs with machine learning and in designing several LRI validation approaches, providing a reference for computational LRI prediction. CellGiQ provides an open-source and useful tool to decompose LRI-mediated intercellular communication at single-cell resolution. CellGiQ is freely available at https://github.com/plhhnu/CellGiQ.
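The Jaccard index mentioned in the abstract measures how much two sets of LRIs overlap: the size of their intersection divided by the size of their union. The following is a minimal sketch of that comparison, not code from the CellGiQ repository. The function name `jaccard_index`, the `LRI` type alias, and the placeholder ligand and receptor labels are all assumptions made for illustration; the lists are not results from the paper.

```python
from typing import Iterable, Tuple

# An LRI is represented here as a (ligand, receptor) pair.
# This representation is an assumption for illustration.
LRI = Tuple[str, str]


def jaccard_index(a: Iterable[LRI], b: Iterable[LRI]) -> float:
    """Return |A ∩ B| / |A ∪ B| for two collections of LRIs.

    Returns 0.0 when both collections are empty, to avoid dividing by zero.
    """
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


# Hypothetical LRI lists from two tools (placeholder labels, not real data).
tool_a = [("L1", "R1"), ("L2", "R2"), ("L3", "R3")]
tool_b = [("L2", "R2"), ("L3", "R3"), ("L4", "R4")]

# Two shared pairs out of four distinct pairs overall.
score = jaccard_index(tool_a, tool_b)
print(f"Jaccard index: {score:.2f}")
```

A value close to 1 would indicate that two tools report nearly the same LRIs, while a value close to 0 would indicate little overlap.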
