Guiding Clustering and Annotation in Single-Cell RNA Sequencing Using the Average Overlap Metric

Authors

Thai Christopher, Singh Amartya, Herranz Daniel, Khiabanian Hossein

Affiliations

Rutgers Cancer Institute, Rutgers University, New Brunswick, NJ 08901, USA.

Center for Systems and Computational Biology, Rutgers Cancer Institute, Rutgers University, New Brunswick, NJ 08901, USA.

Publication

bioRxiv. 2025 May 10:2025.05.06.652497. doi: 10.1101/2025.05.06.652497.

DOI:10.1101/2025.05.06.652497
PMID:40654835
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12247988/
Abstract

Defining cell types using unsupervised clustering algorithms based on transcriptional similarity is a powerful application of single-cell RNA sequencing. A single clustering resolution may not yield clusters that represent both broad, well-defined populations and smaller subpopulations simultaneously. Therefore, when cell identities are not known prior to sequencing, robust comparison and annotation of inferred clusters remains a challenge. In this work, we define the distance between single-cell clusters by proposing the use of the average overlap metric to compare ranked lists of differentially expressed genes in a top-weighted manner. We first benchmark our approach in a truth-known dataset comprised of highly similar yet distinct T-cell populations and show that evaluating clusters with average overlap results in a consistent, precise, and biologically meaningful recapitulation of true cell identities. We then apply our approach to data of unsorted mouse thymocytes and characterize stages of T-cell development in the thymus, including minor populations of double-negative (CD4-CD8-) T-cells that are notoriously difficult to confidently detect in unsorted single-cell data. We demonstrate that measuring cluster similarity with average overlap of marker gene rankings enables robust, reproducible characterization of single cells and clarifies biological interpretation of their underlying identities in highly homogeneous populations.
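The abstract does not spell out how average overlap is computed, so the following is only a minimal sketch under stated assumptions. It uses the usual definition from the ranked-list comparison literature: for two rankings S and T evaluated to depth k, AO = (1/k) Σ_{d=1..k} |S[:d] ∩ T[:d]| / d, which gives agreement near the top of the lists more weight. The function names, the example gene lists, and the use of 1 − AO as a cluster-to-cluster distance are illustrative assumptions, not the authors' implementation.

```python
from typing import Hashable, Optional, Sequence


def average_overlap(
    ranked_a: Sequence[Hashable],
    ranked_b: Sequence[Hashable],
    depth: Optional[int] = None,
) -> float:
    """Average overlap of two ranked lists, evaluated to `depth`.

    Assumes each list contains no repeated items (e.g. each gene
    appears once in a cluster's marker ranking).
    """
    max_depth = min(len(ranked_a), len(ranked_b))
    if depth is None:
        depth = max_depth
    if depth <= 0 or depth > max_depth:
        raise ValueError("depth must be between 1 and the shorter list length")

    seen_a, seen_b = set(), set()
    overlap = 0          # size of the intersection of the two prefixes
    running_sum = 0.0    # sum over d of overlap(d) / d

    for d in range(1, depth + 1):
        item_a, item_b = ranked_a[d - 1], ranked_b[d - 1]
        if item_a == item_b:
            overlap += 1
        else:
            # A new item counts once it has appeared in the other prefix.
            if item_a in seen_b:
                overlap += 1
            if item_b in seen_a:
                overlap += 1
        seen_a.add(item_a)
        seen_b.add(item_b)
        running_sum += overlap / d

    return running_sum / depth


def average_overlap_distance(ranked_a, ranked_b, depth=None) -> float:
    """Hypothetical distance: 1 - AO (an assumption, not taken from the paper)."""
    return 1.0 - average_overlap(ranked_a, ranked_b, depth)


if __name__ == "__main__":
    # Made-up marker rankings for two hypothetical clusters.
    cluster_1 = ["Cd4", "Cd8a", "Rag1", "Il7r"]
    cluster_2 = ["Cd4", "Rag1", "Cd8a", "Ptcra"]
    print(average_overlap(cluster_1, cluster_2))
    print(average_overlap_distance(cluster_1, cluster_2))
```

Because each depth's overlap is divided by d and then averaged, a shared gene at rank 1 contributes to every term of the sum, while a shared gene at rank k contributes only to the last one. This is the top-weighted behavior the abstract refers to.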


Figures
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/497d/12247988/f2be253a2edf/nihpp-2025.05.06.652497v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/497d/12247988/1e07bb53d0b2/nihpp-2025.05.06.652497v1-f0002.jpg

