Addressing the mean-variance relationship in spatially resolved transcriptomics data with spoon.

Authors

Shah Kinnary, Guo Boyi, Hicks Stephanie C

Affiliations

Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 N Wolfe Street, Baltimore, MD 21205, United States.

Department of Biomedical Engineering, Johns Hopkins School of Medicine, 733 N Broadway, Baltimore, MD 21205, United States.

Publication information

Biostatistics. 2024 Dec 31;26(1). doi: 10.1093/biostatistics/kxaf012.

DOI:10.1093/biostatistics/kxaf012

PMID:40515599

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC12166475/

Abstract

An important task in the analysis of spatially resolved transcriptomics (SRT) data is to identify spatially variable genes (SVGs), or genes that vary in a 2D space. Current approaches rank SVGs based on either $P$-values or an effect size, such as the proportion of spatial variance. However, previous work in the analysis of RNA-sequencing data identified a technical bias with log-transformation, violating the "mean-variance relationship" of gene counts, where highly expressed genes are more likely to have a higher variance in counts but lower variance after log-transformation. Here, we demonstrate the mean-variance relationship in SRT data. Furthermore, we propose spoon, a statistical framework using empirical Bayes techniques to remove this bias, leading to more accurate prioritization of SVGs. We demonstrate the performance of spoon in both simulated and real SRT data. A software implementation of our method is available at https://bioconductor.org/packages/spoon.
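
The mean-variance relationship described in the abstract can be illustrated with a small simulation. The sketch below is not the spoon method and does not use the spoon package. It assumes negative binomial counts, a dispersion of 0.1, 2000 spots, and a log2 transform with a pseudocount of 1, all chosen here for illustration only. It compares the variance of raw counts with the variance of log-transformed counts for hypothetical genes of increasing mean expression.

```python
import numpy as np


def simulate_mean_variance(means, dispersion=0.1, n_spots=2000, seed=0):
    """Return rows of (mean, variance of raw counts, variance of log2 counts).

    All parameter values are illustrative assumptions, not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    size = 1.0 / dispersion  # negative binomial "size" parameter
    rows = []
    for mu in means:
        # Parameterize so that the expected count equals mu.
        p = size / (size + mu)
        counts = rng.negative_binomial(size, p, size=n_spots)
        # Log-transform with a pseudocount of 1 (an assumed convention).
        logcounts = np.log2(counts + 1.0)
        rows.append((mu, counts.var(ddof=1), logcounts.var(ddof=1)))
    return np.array(rows)


# Hypothetical genes with low to high mean expression.
result = simulate_mean_variance([5, 20, 100, 500])
for mu, raw_var, log_var in result:
    print(f"mean={mu:6.0f}  var(counts)={raw_var:10.1f}  var(log2)={log_var:.3f}")
```

Under these assumptions, the raw-count variance grows with the mean, while the variance after log-transformation is smaller for the most highly expressed gene than for the least expressed one. This is the pattern the abstract describes as a source of bias when ranking SVGs.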
