MIXnorm: normalizing RNA-seq data from formalin-fixed paraffin-embedded samples.

Author information

Department of Statistical Science, Southern Methodist University, Dallas, TX 75275-0332, USA.

Department of Population and Data Sciences, Quantitative Biomedical Research Center, The University of Texas Southwestern Medical Center, Dallas, TX 75390, USA.

Publication information

Bioinformatics. 2020 Jun 1;36(11):3401-3408. doi: 10.1093/bioinformatics/btaa153.

Abstract

MOTIVATION

Recent studies have shown that RNA-sequencing (RNA-seq) can be used to measure mRNA of sufficient quality extracted from formalin-fixed paraffin-embedded (FFPE) tissues to provide whole-genome transcriptome analysis. However, little attention has been given to the normalization of FFPE RNA-seq data, a key step that adjusts for unwanted biological and technical effects that can bias the signal of interest. Existing methods, developed based on fresh-frozen or similar-type samples, may cause suboptimal performance.

RESULTS

We proposed a new normalization method, labeled MIXnorm, for FFPE RNA-seq data. MIXnorm relies on a two-component mixture model, which models non-expressed genes by zero-inflated Poisson distributions and models expressed genes by truncated normal distributions. To obtain maximum likelihood estimates, we developed a nested EM algorithm, in which closed-form updates are available in each iteration. By eliminating the need for numerical optimization in the M-step, the algorithm is easy to implement and computationally efficient. We evaluated MIXnorm through simulations and cancer studies. MIXnorm makes a significant improvement over commonly used methods for RNA-seq expression data.
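As a rough illustration of the two-component idea, the sketch below computes the posterior probability that a gene belongs to the "expressed" component, which is the kind of quantity an E-step evaluates. It is not the authors' implementation (their R code is at the repository listed under Availability). Everything specific here is an assumption made for illustration: the function names, the parameter names (w, pi0, lam, mu, sigma), the use of log2(count + 1), truncation at zero, parameters shared across samples, and treating the two component likelihoods as directly comparable.

```python
import math

def zip_logpmf(y, pi0, lam):
    """Log pmf of a zero-inflated Poisson: extra mass pi0 at zero, Poisson(lam) otherwise."""
    if y == 0:
        return math.log(pi0 + (1.0 - pi0) * math.exp(-lam))
    return math.log(1.0 - pi0) - lam + y * math.log(lam) - math.lgamma(y + 1)

def truncnorm_logpdf(x, mu, sigma, lower=0.0):
    """Log density of Normal(mu, sigma^2) truncated to x >= lower."""
    if x < lower:
        return float("-inf")
    z = (x - mu) / sigma
    # P(X >= lower) under the untruncated normal, used to renormalize the density
    tail = 0.5 * math.erfc((lower - mu) / (sigma * math.sqrt(2.0)))
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi) - math.log(tail)

def prob_expressed(counts, w, pi0, lam, mu, sigma):
    """Posterior probability that a gene is expressed, given its counts across samples.

    w is the assumed prior proportion of expressed genes (hypothetical parameter).
    """
    log_non = sum(zip_logpmf(y, pi0, lam) for y in counts)
    log_exp = sum(truncnorm_logpdf(math.log2(y + 1), mu, sigma) for y in counts)
    a = math.log(w) + log_exp
    b = math.log(1.0 - w) + log_non
    m = max(a, b)  # subtract the max before exponentiating for numerical stability
    return math.exp(a - m) / (math.exp(a - m) + math.exp(b - m))

# Illustrative parameter values only; they are not estimates from the paper.
params = dict(w=0.5, pi0=0.6, lam=0.5, mu=8.0, sigma=2.0)
p_high = prob_expressed([500, 800, 650], **params)
p_low = prob_expressed([0, 0, 1], **params)
```

In a full EM fit, probabilities like these would be recomputed in each E-step and used as weights when updating the parameters in the M-step; the abstract states that those updates are available in closed form, which this sketch does not attempt to reproduce.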

AVAILABILITY AND IMPLEMENTATION

R code available at https://github.com/S-YIN/MIXnorm.

CONTACT

swang@smu.edu.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
