Cell-to-cell distance that combines gene expression and gene embeddings.

Author information

Guo Fangfang, Gan Dailin, Li Jun

Affiliation

Department of Applied and Computational Mathematics and Statistics, University of Notre Dame, Notre Dame, IN 46556, USA.

Publication information

Comput Struct Biotechnol J. 2024 Nov 4;23:3929-3937. doi: 10.1016/j.csbj.2024.10.044. eCollection 2024 Dec.

Abstract

The application of large-language models (LLMs) to single-cell gene-expression data has introduced a new type of data that includes a gene-embedding matrix, in addition to the experimentally obtained gene-expression matrix. This paper addresses a fundamental problem in analyzing such data: how to effectively combine the information from both matrices to better define cell-to-cell distance. We identify a computationally feasible solution that demonstrates superior ability to cluster cells of the same type across all six real datasets we tested, underscoring its advantage as a measure of cell-to-cell distance.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbd2/11584677/d2c9138a65e5/gr001.jpg
