The Dyson equalizer: adaptive noise stabilization for low-rank signal detection and recovery.

Authors

Landa Boris, Kluger Yuval

Affiliations

Department of Electrical Engineering, Yale University, New Haven, CT 06520, US.

Program in Applied Mathematics, Yale University, New Haven, CT 06520, US.

Publication information

Inf Inference. 2025 Jan 16;14(1):iaae036. doi: 10.1093/imaiai/iaae036. eCollection 2025 Mar.

Abstract

Detecting and recovering a low-rank signal in a noisy data matrix is a fundamental task in data analysis. Typically, this task is addressed by inspecting and manipulating the spectrum of the observed data, e.g. thresholding the singular values of the data matrix at a certain critical level. This approach is well established in the case of homoskedastic noise, where the noise variance is identical across the entries. However, in numerous applications, the noise can be heteroskedastic, where the noise characteristics may vary considerably across the rows and columns of the data. In this scenario, the spectral behaviour of the noise can differ significantly from the homoskedastic case, posing various challenges for signal detection and recovery. To address these challenges, we develop an adaptive normalization procedure that equalizes the average noise variance across the rows and columns of a given data matrix. Our proposed procedure is data-driven and fully automatic, supporting a broad range of noise distributions, variance patterns and signal structures. Our approach relies on random matrix theory results that describe the resolvent of the noise via the so-called Dyson equation. By leveraging this relation, we can accurately infer the noise level in each row and each column directly from the resolvent of the data. We establish that in many cases, our normalization enforces the standard spectral behaviour of homoskedastic noise, namely the Marchenko-Pastur (MP) law, which allows for simple and reliable detection of signal components. Furthermore, we demonstrate that our approach can substantially improve signal recovery in heteroskedastic settings by manipulating the spectrum after normalization. Lastly, we apply our method to single-cell RNA sequencing and spatial transcriptomics data, showcasing accurate fits to the MP law after normalization.
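
The idea of thresholding singular values after variance normalization can be illustrated with a minimal sketch. It is not the Dyson equalizer itself: the row and column noise factors `x` and `y` are assumed to be known exactly (an oracle), whereas the abstract describes estimating the noise levels from the resolvent of the data. The noise variance pattern `x[i] * y[j]`, the helper names `mp_edge` and `count_above_edge`, the 5% slack factor and all sizes and signal strengths are assumptions chosen purely for illustration.

```python
import numpy as np

def mp_edge(m, n, sigma=1.0):
    """Approximate largest singular value of an m x n matrix of
    i.i.d. zero-mean noise with variance sigma**2 (MP bulk edge)."""
    return sigma * (np.sqrt(m) + np.sqrt(n))

def count_above_edge(Y, sigma=1.0, slack=1.05):
    """Count singular values of Y exceeding the bulk edge.
    The slack factor is a hypothetical safety margin, not from the paper."""
    s = np.linalg.svd(Y, compute_uv=False)
    return int(np.sum(s > slack * mp_edge(Y.shape[0], Y.shape[1], sigma)))

rng = np.random.default_rng(0)
m, n, r = 300, 500, 3

# Rank-3 signal with orthonormal factors and illustrative strengths.
U, _ = np.linalg.qr(rng.standard_normal((m, r)))
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
signal = U @ np.diag([1000.0, 800.0, 600.0]) @ V.T

# Heteroskedastic noise: entry (i, j) has variance x[i] * y[j] (assumed pattern).
x = rng.uniform(0.5, 5.0, size=m)
y = rng.uniform(0.5, 5.0, size=n)
scale = np.sqrt(np.outer(x, y))
noise = scale * rng.standard_normal((m, n))
Y = signal + noise

# Oracle normalization: divide each entry by its noise standard deviation,
# so the normalized noise has unit variance in every entry.
Y_normalized = Y / scale
noise_top = np.linalg.svd(noise / scale, compute_uv=False)[0]
rank_normalized = count_above_edge(Y_normalized)

print("MP edge:", mp_edge(m, n))
print("largest normalized noise singular value:", noise_top)
print("singular values above the edge after normalization:", rank_normalized)
```

After normalization the noise is homoskedastic with unit variance, so its largest singular value sits close to `sqrt(m) + sqrt(n)` and the singular values above that edge can be attributed to the signal. In practice the factors `x` and `y` are unknown, which is the gap the data-driven procedure in the abstract is meant to close.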
