Shah Kinnary, Guo Boyi, Hicks Stephanie C
Department of Biostatistics, Johns Hopkins Bloomberg School of Public Health, 615 N Wolfe Street, Baltimore, MD 21205, United States.
Department of Biomedical Engineering, Johns Hopkins School of Medicine, 733 N Broadway, Baltimore, MD 21205, United States.
Biostatistics. 2024 Dec 31;26(1). doi: 10.1093/biostatistics/kxaf012.
An important task in the analysis of spatially resolved transcriptomics (SRT) data is to identify spatially variable genes (SVGs), that is, genes whose expression varies across a 2D space. Current approaches rank SVGs based on either $P$-values or an effect size, such as the proportion of spatial variance. However, previous work in the analysis of RNA-sequencing data identified a technical bias associated with log-transformation, known as the "mean-variance relationship" of gene counts: highly expressed genes are more likely to have a higher variance in counts but a lower variance after log-transformation. Here, we demonstrate that the mean-variance relationship is also present in SRT data. Furthermore, we propose spoon, a statistical framework that uses empirical Bayes techniques to remove this bias, leading to more accurate prioritization of SVGs. We demonstrate the performance of spoon in both simulated and real SRT data. A software implementation of our method is available at https://bioconductor.org/packages/spoon.
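The mean-variance relationship described above can be illustrated with a small simulation. The sketch below is not the spoon method and does not use the spoon package. It assumes gene counts follow a gamma-Poisson (negative binomial) model with a fixed dispersion, and all function names, parameter values, and the `log2(count + 1)` transform are hypothetical choices made for illustration only.

```python
import math
import random
import statistics


def sample_poisson(rng, lam):
    """Draw one Poisson(lam) value by counting unit-rate arrivals before time lam."""
    count = 0
    elapsed = rng.expovariate(1.0)
    while elapsed <= lam:
        count += 1
        elapsed += rng.expovariate(1.0)
    return count


def sample_count(rng, mean, dispersion):
    """Draw one gamma-Poisson (negative binomial) count with the given mean.

    Assumed model: variance = mean + dispersion * mean**2.
    """
    shape = 1.0 / dispersion
    # Gamma with shape 1/dispersion and scale mean*dispersion has expectation `mean`.
    rate = rng.gammavariate(shape, mean * dispersion)
    return sample_poisson(rng, rate)


def simulate_mean_variance(means, n_spots=3000, dispersion=0.1, seed=0):
    """Return (mean, variance of counts, variance of log2(counts + 1)) per simulated gene."""
    rng = random.Random(seed)
    rows = []
    for mean in means:
        counts = [sample_count(rng, mean, dispersion) for _ in range(n_spots)]
        logged = [math.log2(c + 1) for c in counts]
        rows.append((mean, statistics.variance(counts), statistics.variance(logged)))
    return rows


if __name__ == "__main__":
    # Three hypothetical genes with low, medium, and high expression.
    for mean, var_raw, var_log in simulate_mean_variance([2, 20, 200]):
        print(f"mean={mean:>4}  var(counts)={var_raw:10.2f}  "
              f"var(log2(counts+1))={var_log:.3f}")
```

Under these assumptions, the variance of the raw counts grows with the mean while the variance of the log-transformed counts shrinks, which is the pattern that can bias SVG rankings toward or away from genes purely because of their expression level.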