mastR: an R package for automated identification of tissue-specific gene signatures in multi-group differential expression analysis.

Authors

Chen Jinjin, Mohamed Ahmed, Bhuva Dharmesh D, Davis Melissa J, Tan Chin Wee

Affiliations

Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC 3052, Australia.

Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, VIC 3010, Australia.

Publication information

Bioinformatics. 2025 Mar 4;41(3). doi: 10.1093/bioinformatics/btaf114.

DOI:10.1093/bioinformatics/btaf114
PMID:40098239
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11937977/
Abstract

MOTIVATION

Biomarker discovery is important and offers insight into potential underlying mechanisms of disease. While existing biomarker identification methods primarily focus on single-cell RNA sequencing (scRNA-seq) data, there remains a need for automated methods designed for labeled bulk RNA-seq data from sorted cell populations or experiments. Current methods require curation of results or statistical thresholds and may not account for tissue background expression. Here we address these limitations with an automated marker identification method for labeled bulk RNA-seq data that explicitly considers background expression.

RESULTS

We developed mastR, a novel tool for accurate marker identification using transcriptomic data. It leverages robust statistical pipelines like edgeR and limma to perform pairwise comparisons between groups, and aggregates results using a rank-product-based permutation test. A signal-to-noise ratio approach is implemented to minimize background signals. We assessed the performance of mastR-derived NK cell signatures against published curated signatures and found that the mastR-derived signature performs as well as, if not better than, the published signatures. We further demonstrated the utility of mastR on simulated scRNA-seq data and in comparison with Seurat in terms of marker selection performance.
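
The aggregation step described above can be illustrated with a minimal Python sketch of a rank-product permutation test. This is not mastR's implementation (mastR is an R package built on edgeR and limma); the function names, the toy scores, and the null scheme (shuffling ranks independently within each comparison and pooling the permuted rank products across genes) are all assumptions made for illustration. Each inner list holds one pairwise comparison's per-gene scores, for example log fold changes of the target group against one other group.

```python
import bisect
import math
import random


def rank_descending(scores):
    """Return 1-based ranks; the largest score gets rank 1 (ties keep input order)."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    ranks = [0] * len(scores)
    for position, gene_index in enumerate(order, start=1):
        ranks[gene_index] = position
    return ranks


def rank_product(rank_lists):
    """Multiply each gene's ranks across comparisons (exact integer product)."""
    n_genes = len(rank_lists[0])
    return [math.prod(ranks[g] for ranks in rank_lists) for g in range(n_genes)]


def rank_product_pvalues(score_lists, n_perm=1000, seed=0):
    """Hypothetical helper: permutation p-values for small rank products.

    Assumed null: ranks are shuffled independently within each comparison,
    and all permuted rank products are pooled across genes.
    """
    rng = random.Random(seed)
    rank_lists = [rank_descending(scores) for scores in score_lists]
    observed = rank_product(rank_lists)
    n_genes = len(observed)
    as_extreme = [0] * n_genes  # null values <= observed, per gene

    for _ in range(n_perm):
        permuted = []
        for ranks in rank_lists:
            shuffled = ranks[:]
            rng.shuffle(shuffled)
            permuted.append(shuffled)
        null_sorted = sorted(rank_product(permuted))
        for g in range(n_genes):
            as_extreme[g] += bisect.bisect_right(null_sorted, observed[g])

    total = n_perm * n_genes
    # The +1 correction keeps every p-value strictly positive.
    pvalues = [(count + 1) / (total + 1) for count in as_extreme]
    return observed, pvalues


if __name__ == "__main__":
    # Toy scores (illustrative only): 3 pairwise comparisons x 6 genes.
    toy_scores = [
        [3.1, 0.2, -1.0, 0.5, 1.2, -0.3],
        [2.8, 1.5, 0.1, -0.7, 0.4, -1.1],
        [3.5, -0.2, 0.9, 0.3, -0.5, 1.0],
    ]
    products, pvalues = rank_product_pvalues(toy_scores, n_perm=500, seed=1)
    for gene, (rp, p) in enumerate(zip(products, pvalues)):
        print(f"gene {gene}: rank product = {rp}, p = {p:.4f}")
```

A gene that ranks first in every comparison gets the smallest possible rank product and therefore the smallest p-value. The sketch uses integer products rather than geometric means so that comparisons against the null are exact; the ordering of genes is the same either way.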

AVAILABILITY AND IMPLEMENTATION

mastR is freely available from https://bioconductor.org/packages/release/bioc/html/mastR.html. A vignette and guide are available at https://davislaboratory.github.io/mastR. All statistical analyses were carried out using R (version ≥4.3.0) and Bioconductor (version ≥3.17).
