Effect of RNA-Seq data normalization on protein interactome mapping for Alzheimer's disease.

Authors

Düz Elif, Çakır Tunahan

Affiliations

Department of Bioengineering, Gebze Technical University, Gebze, Kocaeli, 41400, Turkey.

Publication information

Comput Biol Chem. 2024 Apr;109:108028. doi: 10.1016/j.compbiolchem.2024.108028. Epub 2024 Feb 8.

Abstract

High-throughput RNA sequencing brings a new perspective to the elucidation of the molecular mechanisms of diseases. Normalization is the first and most important step in RNA-Seq data analysis, and the appropriate method depends on the purpose of the analysis. Within-sample normalization methods (e.g., TPM) are preferred when genes within a sample are compared with each other, whereas between-sample normalization methods (e.g., DESeq2, TMM, voom) are used when the samples in a dataset are compared. Because normalization approaches rescale the data, they affect the results of the analysis. Here, we selected two of the most commonly used Alzheimer's disease RNA-Seq datasets, from the ROSMAP and Mayo Clinic cohorts, and mapped the differentially expressed genes onto the human protein interactome to discover disease-specific subnetworks. To this end, the raw count data were first processed with four commonly used RNA-Seq normalization methods (DESeq2, TMM, voom and TPM). Covariate adjustment for gender, age at death and post-mortem interval was then applied to the normalized data. Each normalized dataset was separately mapped onto the human protein-protein interaction network in either covariate-adjusted or non-adjusted form. The normalization methods were compared by how well the discovered subnetworks captured known Alzheimer's disease genes and genes associated with disease-related functional terms. Our results indicate that covariate adjustment has a positive effect on normalization by removing confounder effects. Covariate-adjusted TMM and covariate-adjusted DESeq2 performed better in both transcriptome datasets.
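
The distinction between within-sample and between-sample normalization can be illustrated with a minimal sketch. The code below is not the authors' pipeline and does not call the DESeq2, edgeR or limma packages. It assumes a toy count matrix, hypothetical function names, and a simplified median-of-ratios size factor in the spirit of DESeq2, shown next to a plain TPM calculation.

```python
import math
from statistics import median


def tpm(counts, lengths_kb):
    """Within-sample normalization for ONE sample.

    counts: raw read counts per gene.
    lengths_kb: gene lengths in kilobases (same order as counts).
    Returns transcripts per million, which always sum to 1e6.
    """
    rates = [c / l for c, l in zip(counts, lengths_kb)]
    total = sum(rates)
    return [r / total * 1e6 for r in rates]


def median_of_ratios_size_factors(count_matrix):
    """Between-sample normalization (simplified median-of-ratios).

    count_matrix: list of rows, one row per gene, one column per sample.
    Returns one size factor per sample; dividing a sample's counts by its
    factor puts the samples on a common scale.
    Assumption: genes with a zero count in any sample are skipped.
    """
    n_samples = len(count_matrix[0])
    usable = [row for row in count_matrix if all(c > 0 for c in row)]
    # Log of the geometric mean of each usable gene across samples.
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


if __name__ == "__main__":
    # Toy data (hypothetical): 3 genes, 2 samples; sample 2 was sequenced twice as deeply.
    toy_counts = [[10, 20], [100, 200], [50, 100]]
    print(median_of_ratios_size_factors(toy_counts))
    # Toy TPM input for one sample: counts proportional to gene length.
    print(tpm([10, 20, 30], [1.0, 2.0, 3.0]))
```

TPM rescales each sample independently so that values are comparable across genes within it, while the size factors are estimated jointly across samples so that the same gene can be compared between samples. Covariate adjustment, as described in the abstract, would be a separate step applied to the normalized values.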
