Why weight? Modelling sample and observational level variability improves power in RNA-seq analyses.

Authors

Liu Ruijie, Holik Aliaksei Z, Su Shian, Jansz Natasha, Chen Kelan, Leong Huei San, Blewitt Marnie E, Asselin-Labat Marie-Liesse, Smyth Gordon K, Ritchie Matthew E

Affiliations

Molecular Medicine Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3052, Australia.

Stem Cells and Cancer Division, The Walter and Eliza Hall Institute of Medical Research, 1G Royal Parade, Parkville, Victoria 3052, Australia; Department of Medical Biology, The University of Melbourne, Parkville, Victoria 3010, Australia.

Publication details

Nucleic Acids Res. 2015 Sep 3;43(15):e97. doi: 10.1093/nar/gkv412. Epub 2015 Apr 29.

DOI:10.1093/nar/gkv412
PMID:25925576
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC4551905/
Abstract

Variations in sample quality are frequently encountered in small RNA-sequencing experiments, and pose a major challenge in a differential expression analysis. Removal of high variation samples reduces noise, but at a cost of reducing power, thus limiting our ability to detect biologically meaningful changes. Similarly, retaining these samples in the analysis may not reveal any statistically significant changes due to the higher noise level. A compromise is to use all available data, but to down-weight the observations from more variable samples. We describe a statistical approach that facilitates this by modelling heterogeneity at both the sample and observational levels as part of the differential expression analysis. At the sample level this is achieved by fitting a log-linear variance model that includes common sample-specific or group-specific parameters that are shared between genes. The estimated sample variance factors are then converted to weights and combined with observational level weights obtained from the mean-variance relationship of the log-counts-per-million using 'voom'. A comprehensive analysis involving both simulations and experimental RNA-sequencing data demonstrates that this strategy leads to a universally more powerful analysis and fewer false discoveries when compared to conventional approaches. This methodology has wide application and is implemented in the open-source 'limma' package.
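
The weighting idea in the abstract can be illustrated with a small NumPy sketch. This is not the limma implementation: the abstract describes a log-linear variance model, whereas the hypothetical helper `sample_variance_factors` below uses a much cruder stand-in (leverage-adjusted squared residuals averaged over genes, with the log factors constrained to sum to zero). The observation-level weights are a placeholder matrix of ones rather than real voom weights, and combining the two by multiplication is an assumption of this sketch. In limma itself the method is available through R functions such as `voomWithQualityWeights`.

```python
import numpy as np

def sample_variance_factors(y, design):
    """Crude per-sample variance factors from OLS residuals (illustrative only).

    y is genes x samples (log-expression); design is samples x coefficients.
    """
    hat = design @ np.linalg.pinv(design)           # hat (projection) matrix
    leverage = np.diag(hat)
    resid = y - y @ hat.T                           # gene-wise OLS residuals
    df = design.shape[0] - np.linalg.matrix_rank(design)
    s2_gene = (resid ** 2).sum(axis=1) / df         # gene-wise residual variance
    # Squared residuals scaled by gene variance and adjusted for leverage
    scaled = resid ** 2 / s2_gene[:, None] / (1.0 - leverage)[None, :]
    log_factor = np.log(scaled.mean(axis=0))
    # Constrain the log factors to sum to zero so that only relative
    # sample quality is estimated
    return np.exp(log_factor - log_factor.mean())

def weighted_fit(y, design, weights):
    """Gene-wise weighted least squares; weights is genes x samples."""
    coefs = np.empty((y.shape[0], design.shape[1]))
    for g in range(y.shape[0]):
        sw = np.sqrt(weights[g])
        coefs[g], *_ = np.linalg.lstsq(design * sw[:, None], y[g] * sw, rcond=None)
    return coefs

# Simulated data (assumption): two groups of three samples, no true group
# effect, and the third sample three times noisier than the others.
rng = np.random.default_rng(0)
n_genes = 2000
group = np.array([0, 0, 0, 1, 1, 1])
design = np.column_stack([np.ones(6), group])
true_sd = 0.3 * np.array([1.0, 1.0, 3.0, 1.0, 1.0, 1.0])
y = rng.normal(5.0, 1.0, size=(n_genes, 1)) + rng.normal(size=(n_genes, 6)) * true_sd

obs_weights = np.ones_like(y)                       # placeholder for voom-style weights
factors = sample_variance_factors(y, design)
sample_weights = 1.0 / factors                      # variance factors -> weights
combined = obs_weights * sample_weights[None, :]    # sample x observation weights

coefs_weighted = weighted_fit(y, design, combined)
coefs_unweighted = weighted_fit(y, design, obs_weights)
```

In this simulation the noisy third sample should receive the smallest weight, and the weighted estimates of the (null) group effect should scatter less around zero than the unweighted ones. The crude estimator lets some of the noisy sample's variance leak into the factors of its group-mates, which is one reason a properly fitted variance model is preferable in practice.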

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cdf5/4551905/daa6a42575b3/gkv412fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cdf5/4551905/4b3b3aebc67c/gkv412fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cdf5/4551905/692113ceea3c/gkv412fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cdf5/4551905/ec7ac3f32e4c/gkv412fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cdf5/4551905/07f962e981a9/gkv412fig5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cdf5/4551905/b3e08774970d/gkv412fig6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cdf5/4551905/ed007eccb71d/gkv412fig7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cdf5/4551905/37ded5f15fa6/gkv412fig8.jpg
