MarcoPolo: a method to discover differentially expressed genes in single-cell RNA-seq data without depending on prior clustering.

Affiliations

Department of Electrical and Computer Engineering, Seoul National University, Seoul, Republic of Korea.

Paul G. Allen School of Computer Science and Engineering, University of Washington, Seattle, WA, USA.

Publication information

Nucleic Acids Res. 2022 Jul 8;50(12):e71. doi: 10.1093/nar/gkac216.

PMID: 35420135
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9262626/

Abstract

The standard analysis pipeline for single-cell RNA-seq data consists of sequential steps initiated by clustering the cells. An innate limitation of this pipeline is that an imperfect clustering result can irreversibly affect the succeeding steps. For example, some cell types are not well distinguished by clustering because they largely share the global structure, such as the anterior primitive streak and mid primitive streak cells. If one searches for differentially expressed genes (DEGs) solely based on clustering, marker genes for distinguishing these types will be missed. Moreover, clustering depends on many parameters and is often subject to manual decisions. To overcome these limitations, we propose MarcoPolo, a method that identifies informative DEGs independently of prior clustering. MarcoPolo ranks genes by evaluating whether their expression distributions are bimodal, whether similar expression patterns are observed in other genes, and whether the expressing cells are proximal in a low-dimensional space. Using real datasets with FACS-purified cell labels, we demonstrate that MarcoPolo recovers marker genes better than competing methods. Notably, MarcoPolo finds key genes that can distinguish cell types that are not distinguishable by the standard clustering. MarcoPolo is built in a convenient software package that provides analysis results in an HTML file.
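
The bimodality criterion mentioned in the abstract can be illustrated with a small sketch. The abstract does not specify the statistical model MarcoPolo uses, so everything below is an assumption made for illustration: the expression values are treated as roughly Gaussian (for example, after a log transform), a one-component fit is compared against a two-component mixture fitted by EM, and the difference in BIC serves as the score. The function name `bimodality_score` and its parameters are hypothetical and are not part of the MarcoPolo package.

```python
import numpy as np


def gaussian_loglik(x, mean, var):
    """Element-wise log density of a Gaussian with the given mean and variance."""
    return -0.5 * (np.log(2 * np.pi * var) + (x - mean) ** 2 / var)


def bimodality_score(x, n_iter=200, min_var=1e-3):
    """Return BIC(one Gaussian) - BIC(two-Gaussian mixture) for one gene.

    Larger values mean the two-component model is preferred, i.e. the
    expression of this gene looks more bimodal across cells.
    Illustrative assumption only: this is not MarcoPolo's actual model.
    """
    x = np.asarray(x, dtype=float)
    n = x.size

    # One-component fit has a closed form (2 free parameters).
    var1 = max(x.var(), min_var)
    ll1 = gaussian_loglik(x, x.mean(), var1).sum()
    bic1 = 2 * np.log(n) - 2 * ll1

    # Two-component fit by EM (5 free parameters), initialised at the extremes.
    w = np.array([0.5, 0.5])
    mu = np.array([x.min(), x.max()])
    var = np.array([var1, var1])
    for _ in range(n_iter):
        # E-step: responsibilities computed in log space for stability.
        log_p = np.log(w) + gaussian_loglik(x[:, None], mu, var)
        log_norm = np.logaddexp(log_p[:, 0], log_p[:, 1])
        resp = np.exp(log_p - log_norm[:, None])
        # M-step: update weights, means and variances.
        nk = resp.sum(axis=0) + 1e-12
        w = nk / n
        mu = (resp * x[:, None]).sum(axis=0) / nk
        var = np.maximum((resp * (x[:, None] - mu) ** 2).sum(axis=0) / nk, min_var)

    log_p = np.log(w) + gaussian_loglik(x[:, None], mu, var)
    ll2 = np.logaddexp(log_p[:, 0], log_p[:, 1]).sum()
    bic2 = 5 * np.log(n) - 2 * ll2
    return bic1 - bic2


# Synthetic example: one gene with two cell populations, one without.
rng = np.random.default_rng(0)
marker_like = np.concatenate([rng.normal(0, 1, 300), rng.normal(6, 1, 200)])
uniform_like = rng.normal(3, 1, 500)
print("marker-like gene:", bimodality_score(marker_like))
print("uniform-like gene:", bimodality_score(uniform_like))
```

Ranking genes by such a score needs no cluster labels, which is the property the abstract emphasises. A count-based mixture would likely suit raw scRNA-seq counts better than the Gaussian assumption used here, and the other two criteria in the abstract (shared patterns across genes, proximity in a low-dimensional space) are not covered by this sketch.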


https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ebd/9262626/ca6f4ae0b616/gkac216fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ebd/9262626/0698296d8e50/gkac216fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ebd/9262626/b87ed8e0a55c/gkac216fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ebd/9262626/3a8126b7e3a9/gkac216fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ebd/9262626/6a53ed9ae653/gkac216fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ebd/9262626/88d77cb3198c/gkac216fig6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ebd/9262626/d13209d272e6/gkac216fig7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/3ebd/9262626/7df2f00f5044/gkac216fig8.jpg
