The C-SHIFT Algorithm for Normalizing Covariances.

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2023 Jan-Feb;20(1):720-730. doi: 10.1109/TCBB.2022.3151840. Epub 2023 Feb 3.

Abstract

Omics technologies are powerful tools for analyzing patterns in gene expression data for thousands of genes. Because of systematic variation in experiments, raw gene expression data are often obscured by undesirable technical noise. Various normalization techniques have been designed to remove these non-biological errors before any statistical analysis. One reason for normalizing data is the need to recover the covariance matrix used in gene network analysis. In this paper, we introduce a novel normalization technique, called the covariance shift (C-SHIFT) method. This normalization algorithm uses optimization techniques together with the blessing-of-dimensionality philosophy and an energy minimization hypothesis to recover the covariance matrix under additive noise (known in biology as the bias). Thus, it is well suited to the analysis of logarithmic gene expression data. Numerical experiments on synthetic data demonstrate the method's advantage over classical normalization techniques, namely Rank, Quantile, cyclic LOESS (locally estimated scatterplot smoothing), and MAD (median absolute deviation) normalization. We also evaluate the performance of the C-SHIFT algorithm on real biological data.
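The abstract does not spell out the noise model, so the following is only a minimal sketch of why additive bias matters for covariance recovery, not the C-SHIFT algorithm itself. It assumes one additive bias per sample, shared by all genes in that sample and independent of the true log-expression values. Under that assumption every entry of the gene-gene covariance matrix is shifted by roughly the variance of the bias. The function name `simulate_additive_bias` and all parameter values are hypothetical and chosen for illustration.

```python
import numpy as np


def simulate_additive_bias(n_genes=50, n_samples=20000, bias_sd=0.5, seed=0):
    """Simulate log-expression data with a per-sample additive bias.

    Assumption (not from the paper): the bias is a single scalar per
    sample, added to every gene, and independent of the true signal.
    """
    rng = np.random.default_rng(seed)
    # Hypothetical "true" log-expression: rows are genes, columns are samples.
    mixing = rng.normal(size=(n_genes, n_genes)) / np.sqrt(n_genes)
    true_log = mixing @ rng.normal(size=(n_genes, n_samples))
    # One bias value per sample, broadcast across all genes.
    bias = rng.normal(scale=bias_sd, size=n_samples)
    observed_log = true_log + bias[np.newaxis, :]
    return true_log, observed_log, bias


true_log, observed_log, bias = simulate_additive_bias()

# np.cov treats each row (gene) as a variable by default.
cov_true = np.cov(true_log)
cov_obs = np.cov(observed_log)

# Entry-wise difference between the observed and true covariance matrices.
shift = cov_obs - cov_true
bias_var = bias.var(ddof=1)

print("mean covariance shift:", shift.mean())
print("sample variance of bias:", bias_var)
```

In this toy setting the two printed values should be close, because the bias adds the same random term to every gene and therefore adds its variance to every pairwise covariance. A normalization method aimed at covariance recovery has to undo that shift without knowing the bias.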
