mastR: an R package for automated identification of tissue-specific gene signatures in multi-group differential expression analysis.

Author information

Chen Jinjin, Mohamed Ahmed, Bhuva Dharmesh D, Davis Melissa J, Tan Chin Wee

Affiliations

Bioinformatics Division, Walter and Eliza Hall Institute of Medical Research, Melbourne, VIC 3052, Australia.

Department of Medical Biology, Faculty of Medicine, Dentistry and Health Sciences, University of Melbourne, Parkville, VIC 3010, Australia.

Publication information

Bioinformatics. 2025 Mar 4;41(3). doi: 10.1093/bioinformatics/btaf114.

Abstract

MOTIVATION

Biomarker discovery is important and offers insight into potential underlying mechanisms of disease. While existing biomarker identification methods primarily focus on single-cell RNA sequencing (scRNA-seq) data, there remains a need for automated methods designed for labeled bulk RNA-seq data from sorted cell populations or experiments. Current methods require manual curation of results or of statistical thresholds and may not account for tissue background expression. Here we address these limitations with an automated marker identification method for labeled bulk RNA-seq data that explicitly considers background expression.

RESULTS

We developed mastR, a novel tool for accurate marker identification using transcriptomic data. It leverages robust statistical pipelines such as edgeR and limma to perform pairwise comparisons between groups, and aggregates the results using a rank-product-based permutation test. A signal-to-noise ratio approach is implemented to minimize background signals. We assessed the performance of mastR-derived NK cell signatures against published curated signatures and found that the mastR-derived signature performs as well as, if not better than, the published signatures. We further demonstrated the utility of mastR on simulated scRNA-seq data and compared its marker selection performance with that of Seurat.
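
The aggregation step described above can be illustrated with a minimal sketch of a generic rank-product permutation test. This is not mastR's implementation or API (mastR is an R package). The function names, the toy statistics matrix, and the pooled-null p-value estimate are all assumptions made for illustration. The input stands in for per-gene statistics from pairwise comparisons of one target group against each other group, such as moderated t-statistics from limma.

```python
import numpy as np

def rank_matrix(stats):
    """Rank genes within each comparison; rank 1 = strongest up-regulation."""
    n_genes, n_comp = stats.shape
    ranks = np.empty((n_genes, n_comp), dtype=float)
    for j in range(n_comp):
        order = np.argsort(-stats[:, j])          # descending by statistic
        ranks[order, j] = np.arange(1, n_genes + 1)
    return ranks

def rank_product(ranks):
    """Geometric mean of each gene's ranks across comparisons."""
    return np.exp(np.log(ranks).mean(axis=1))

def permutation_pvalues(ranks, n_perm=1000, seed=0):
    """Estimate p-values by shuffling ranks independently in each comparison.

    Null rank products are pooled over all genes and permutations
    (an assumption of this sketch, not a statement about mastR).
    """
    rng = np.random.default_rng(seed)
    observed = rank_product(ranks)
    n_genes = ranks.shape[0]
    hits = np.zeros(n_genes)
    for _ in range(n_perm):
        shuffled = np.column_stack([rng.permutation(col) for col in ranks.T])
        null_rp = rank_product(shuffled)
        # count null values at least as extreme (small) as each observed value
        hits += (null_rp[None, :] <= observed[:, None]).sum(axis=1)
    return (hits + 1) / (n_perm * n_genes + 1)

# Hypothetical statistics: rows are genes, columns are pairwise comparisons
# of the target group against three other groups (toy values, not real data).
stats = np.array([
    [5.1, 4.8, 6.0],
    [0.2, 3.9, -0.5],
    [2.2, 0.1, 1.0],
    [-1.0, 0.5, 0.3],
    [3.0, -0.7, 2.5],
    [0.9, 1.5, -1.2],
])

ranks = rank_matrix(stats)
rp = rank_product(ranks)
pvals = permutation_pvalues(ranks)
```

In this toy input the first gene ranks first in every comparison, so it has the smallest rank product and the smallest estimated p-value. A gene that is strongly up-regulated against only one of the other groups gets a larger rank product, which is why this kind of aggregation favors markers that are consistent across all pairwise comparisons.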

AVAILABILITY AND IMPLEMENTATION

mastR is freely available from https://bioconductor.org/packages/release/bioc/html/mastR.html. A vignette and guide are available at https://davislaboratory.github.io/mastR. All statistical analyses were carried out using R (version ≥4.3.0) and Bioconductor (version ≥3.17).

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a13/11937977/3182158ff4d1/btaf114f1.jpg
