Binned multinomial logistic regression for integrative cell-type annotation.

Authors

Motwani Keshav, Bacher Rhonda, Molstad Aaron J

Affiliations

Department of Biostatistics, University of Washington.

Department of Biostatistics, University of Florida.

Publication information

Ann Appl Stat. 2023 Dec;17(4):3426-3449. doi: 10.1214/23-aoas1769.

DOI: 10.1214/23-aoas1769
PMID: 40206429
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11981643/

Abstract

Categorizing individual cells into one of many known cell type categories, also known as cell type annotation, is a critical step in the analysis of single-cell genomics data. The current process of annotation is time-intensive and subjective, which has led to different studies describing cell types with labels of varying degrees of resolution. While supervised learning approaches have provided automated solutions to annotation, there remains a significant challenge in fitting a unified model for multiple datasets with inconsistent labels. In this article, we propose a new multinomial logistic regression estimator which can be used to model cell type probabilities by integrating multiple datasets with labels of varying resolution. To compute our estimator, we solve a nonconvex optimization problem using a blockwise proximal gradient descent algorithm. We show through simulation studies that our approach estimates cell type probabilities more accurately than competitors in a wide variety of scenarios. We apply our method to ten single-cell RNA-seq datasets and demonstrate its utility in predicting fine resolution cell type labels on unlabeled data as well as refining cell type labels on data with existing coarse resolution annotations. Finally, we demonstrate that our method can lead to novel scientific insights in the context of a differential expression analysis comparing peripheral blood gene expression before and after treatment with interferon-β. An R package implementing the method is available at https://github.com/keshav-motwani/IBMR and the collection of datasets we analyze is available at https://github.com/keshav-motwani/AnnotatedPBMC.
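
To make the idea of "labels of varying resolution" concrete, one way to read a binned multinomial likelihood is that a coarse label is a bin of fine categories, and its probability is the sum of the fine-category probabilities in that bin. The sketch below illustrates this under that assumption. The function names, the 0/1 mask encoding, and the toy cell types are hypothetical and are not taken from the IBMR package. It also omits any penalty or dataset-specific terms the full estimator may include.

```python
import numpy as np


def softmax(scores):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = scores - scores.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)


def binned_nll(X, B, bins):
    """Average negative log-likelihood when each label is a bin of fine categories.

    X    : (n, p) feature matrix, one row per cell.
    B    : (p, K) coefficient matrix, one column per fine category.
    bins : (n, K) 0/1 mask; bins[i, k] = 1 if fine category k is
           compatible with the (possibly coarse) label observed for cell i.
    """
    probs = softmax(X @ B)                  # fine-category probabilities
    bin_probs = (probs * bins).sum(axis=1)  # probability of the observed bin
    return -np.mean(np.log(bin_probs))


# Toy example (hypothetical): fine categories are CD4 T, CD8 T, B cell.
X = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
B = np.zeros((2, 3))  # all-zero coefficients give uniform probabilities
bins = np.array([
    [1, 0, 0],  # cell 0: fine label "CD4 T"
    [1, 1, 0],  # cell 1: coarse label "T cell" = {CD4 T, CD8 T}
    [0, 0, 1],  # cell 2: fine label "B cell"
])
nll = binned_nll(X, B, bins)
print(nll)
```

A cell with a fine label contributes the usual multinomial log-likelihood term, while a cell with a coarse label only constrains the total probability assigned to its bin. This is what allows datasets annotated at different resolutions to share one coefficient matrix.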

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbda/11981643/06457d3c1a08/nihms-2069953-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbda/11981643/dbccd3b05c28/nihms-2069953-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbda/11981643/f9b9d2745ef9/nihms-2069953-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbda/11981643/99f1bf5c4d38/nihms-2069953-f0004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbda/11981643/0b2104f18f21/nihms-2069953-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbda/11981643/ac55f4e4a588/nihms-2069953-f0006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/dbda/11981643/d19a20e19445/nihms-2069953-f0007.jpg

Similar articles

1
Binned multinomial logistic regression for integrative cell-type annotation.
Ann Appl Stat. 2023 Dec;17(4):3426-3449. doi: 10.1214/23-aoas1769.
2
scGAD: a new task and end-to-end framework for generalized cell type annotation and discovery.
Brief Bioinform. 2023 Mar 19;24(2). doi: 10.1093/bib/bbad045.
3
Multiresolution categorical regression for interpretable cell-type annotation.
Biometrics. 2023 Dec;79(4):3485-3496. doi: 10.1111/biom.13926. Epub 2023 Oct 5.
4
CALLR: a semi-supervised cell-type annotation method for single-cell RNA sequencing data.
Bioinformatics. 2021 Jul 12;37(Suppl_1):i51-i58. doi: 10.1093/bioinformatics/btab286.
5
scPLAN: a hierarchical computational framework for single transcriptomics data annotation, integration and cell-type label refinement.
Brief Bioinform. 2024 May 23;25(4). doi: 10.1093/bib/bbae305.
6
Continually adapting pre-trained language model to universal annotation of single-cell RNA-seq data.
Brief Bioinform. 2024 Jan 22;25(2). doi: 10.1093/bib/bbae047.
7
A machine learning-based method for automatically identifying novel cells in annotating single-cell RNA-seq data.
Bioinformatics. 2022 Oct 31;38(21):4885-4892. doi: 10.1093/bioinformatics/btac617.
8
CIForm as a Transformer-based model for cell-type annotation of large-scale single-cell RNA-seq data.
Brief Bioinform. 2023 Jul 20;24(4). doi: 10.1093/bib/bbad195.
9
TransAnno-Net: A Deep Learning Framework for Accurate Cell Type Annotation of Mouse Lung Tissue Using Self-supervised Pretraining.
Comput Methods Programs Biomed. 2025 Jul;267:108809. doi: 10.1016/j.cmpb.2025.108809. Epub 2025 Apr 24.
10
scRGCL: a cell type annotation method for single-cell RNA-seq data using residual graph convolutional neural network with contrastive learning.
Brief Bioinform. 2024 Nov 22;26(1). doi: 10.1093/bib/bbae662.

Cited by

1
Mapping Cell Identity from scRNA-seq: A primer on computational methods.
Comput Struct Biotechnol J. 2025 Apr 2;27:1559-1569. doi: 10.1016/j.csbj.2025.03.051. eCollection 2025.