QOT: Quantized Optimal Transport for sample-level distance matrix in single-cell omics.

Author information

Wang Zexuan, Zhan Qipeng, Yang Shu, Mu Shizhuo, Chen Jiong, Garai Sumita, Orzechowski Patryk, Wagenaar Joost, Shen Li

Affiliations

Graduate Group in Applied Mathematics and Computational Science, University of Pennsylvania, Philadelphia, PA 19104, United States.

Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA 19104, United States.

Publication information

Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae713.

Abstract

Single-cell technologies have enabled the high-dimensional characterization of cell populations at an unprecedented scale. The innate complexity and increasing volume of data pose significant computational and analytical challenges, especially in comparative studies delineating cellular architectures across various biological conditions (i.e. the generation of sample-level distance matrices). Optimal Transport (OT) is a mathematical tool that captures the intrinsic structure of data geometrically and has been applied to many bioinformatics tasks. In this paper, we propose QOT (Quantized Optimal Transport), a new method that enables efficient computation of sample-level distance matrices from large-scale single-cell omics data through a quantization step. We apply our algorithm to real-world single-cell genomics and pathomics datasets, aiming to extrapolate cell-level insights to inform sample-level categorizations. Our empirical study shows that QOT outperforms two existing OT-based algorithms in accuracy and robustness when obtaining a sample-level distance matrix from high-throughput single-cell measurements. Moreover, the sample-level distance matrix can be used in downstream analysis (e.g. to uncover the trajectory of disease progression), highlighting its utility in biomedical informatics and data science.
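The abstract does not say how the quantization step is carried out, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes each sample is an array of cells by features, summarizes each sample by a small set of weighted centroids using plain k-means (an assumed stand-in for the quantizer), and compares the summaries with an entropy-regularized (Sinkhorn) transport cost. The function names `quantize`, `sinkhorn_cost`, and `sample_distance_matrix`, the parameters `k` and `reg`, and the synthetic data are all hypothetical.

```python
import numpy as np


def quantize(cells, k, n_iter=20, seed=0):
    """Summarize a (n_cells, n_features) array as weighted centroids (Lloyd's k-means)."""
    rng = np.random.default_rng(seed)
    centroids = cells[rng.choice(len(cells), size=k, replace=False)].astype(float)
    for _ in range(n_iter):
        # Assign every cell to its nearest centroid (squared Euclidean distance).
        d = ((cells[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = d.argmin(1)
        for j in range(k):
            members = cells[labels == j]
            if len(members):
                centroids[j] = members.mean(0)
    # Final assignment gives each centroid's share of the cells.
    d = ((cells[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
    weights = np.bincount(d.argmin(1), minlength=k) / len(cells)
    keep = weights > 0  # drop empty clusters so every support point has mass
    return centroids[keep], weights[keep]


def sinkhorn_cost(xa, wa, xb, wb, reg=0.1, n_iter=500):
    """Entropy-regularized OT cost between two weighted point sets."""
    cost = ((xa[:, None, :] - xb[None, :, :]) ** 2).sum(-1)
    scale = cost.max()
    if scale == 0:
        return 0.0
    kernel = np.exp(-cost / (reg * scale))  # regularization relative to the cost scale
    u = np.ones_like(wa)
    for _ in range(n_iter):
        v = wb / (kernel.T @ u)
        u = wa / (kernel @ v)
    plan = u[:, None] * kernel * v[None, :]
    return float((plan * cost).sum())


def sample_distance_matrix(samples, k=8):
    """Pairwise transport costs between quantized samples (diagonal left at zero)."""
    summaries = [quantize(s, k) for s in samples]
    n = len(samples)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            dist[i, j] = dist[j, i] = sinkhorn_cost(*summaries[i], *summaries[j])
    return dist


# Synthetic demo: samples 0 and 1 share a distribution, sample 2 is shifted.
rng = np.random.default_rng(0)
samples = [
    rng.normal(0.0, 1.0, (300, 5)),
    rng.normal(0.0, 1.0, (300, 5)),
    rng.normal(3.0, 1.0, (300, 5)),
]
D = sample_distance_matrix(samples, k=8)
print(np.round(D, 2))
```

The efficiency argument is that each pairwise transport problem is solved between k centroids rather than between all cells, so its cost depends on k instead of on the number of cells per sample. The resulting matrix can be passed to any downstream method that accepts precomputed distances, such as hierarchical clustering or an embedding for trajectory analysis.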

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fbe2/11962597/588196da08e4/bbae713f1.jpg
