Robustness of differential gene expression analysis of RNA-seq.

Authors

Stupnikov A, McInerney C E, Savage K I, McIntosh S A, Emmert-Streib F, Kennedy R, Salto-Tellez M, Prise K M, McArt D G

Affiliations

Department of Biological and Medical Physics, Moscow Institute of Physics and Technology, Dolgoprudny, Russian Federation.

Patrick G. Johnson Centre for Cancer Research, Queen's University, Belfast, Northern Ireland, UK.

Publication information

Comput Struct Biotechnol J. 2021 May 26;19:3470-3481. doi: 10.1016/j.csbj.2021.05.040. eCollection 2021.

Abstract

RNA-sequencing (RNA-seq) is a relatively new technology that lacks standardisation. RNA-seq can be used for Differential Gene Expression (DGE) analysis; however, no consensus exists as to which methodology ensures robust and reproducible results. Indeed, it is broadly acknowledged that DGE methods provide disparate results. Despite these obstacles, RNA-seq assays are in advanced development for clinical use, but further optimisation will be needed. Herein, five DGE models (DESeq2, voom + limma, edgeR, EBSeq, NOISeq) for gene-level detection were investigated for robustness to sequencing alterations using a controlled analysis of fixed count matrices. Two breast cancer datasets were analysed with full and reduced sample sizes. DGE model robustness was compared between filtering regimes and for different expression levels (high, low) using unbiased metrics. Test sensitivity estimated as relative False Discovery Rate (FDR), concordance between model outputs, and comparisons of a 'population' of slopes of relative FDRs across different library sizes, generated using linear regressions, were examined. Patterns of relative DGE model robustness proved dataset-agnostic and reliable for drawing conclusions when sample sizes were sufficiently large. Overall, the non-parametric method NOISeq was the most robust, followed by edgeR, voom, EBSeq and DESeq2. Our rigorous appraisal provides information for method selection for molecular diagnostics. Metrics may prove useful towards improving the standardisation of RNA-seq for precision medicine.
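
The slope metric mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not define relative FDR, so the definition used here (the fraction of genes called at a reduced library size that are absent from the full-library call set) is an assumption made for illustration only. The function names and the toy gene sets are hypothetical and are not taken from the paper.

```python
import numpy as np


def relative_fdr(reference_calls, subsampled_calls):
    """Fraction of subsampled-run calls absent from the reference call set.

    Assumed definition for illustration; the abstract does not spell it out.
    """
    subsampled = set(subsampled_calls)
    if not subsampled:
        return 0.0
    return len(subsampled - set(reference_calls)) / len(subsampled)


def fdr_slope(library_fractions, fdr_values):
    """Least-squares slope of relative FDR against library size."""
    slope, _intercept = np.polyfit(library_fractions, fdr_values, deg=1)
    return float(slope)


# Hypothetical toy gene sets, not data from the paper.
reference = {"g1", "g2", "g3", "g4"}
calls_by_fraction = {
    1.0: {"g1", "g2", "g3", "g4"},
    0.5: {"g1", "g2", "g3", "g5"},
    0.25: {"g1", "g2", "g6", "g7"},
}

fractions = sorted(calls_by_fraction)
fdrs = [relative_fdr(reference, calls_by_fraction[f]) for f in fractions]
slope = fdr_slope(fractions, fdrs)
print(fractions, fdrs, slope)
```

Under this assumed definition, a slope close to zero would suggest that a model's calls change little as library size shrinks, while a strongly negative slope would suggest more spurious calls at lower depth. Repeating the fit over many subsampling runs would yield the kind of 'population' of slopes the abstract refers to.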

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/907c/8214188/6f8645051d33/ga1.jpg
