Can Zipf's law be adapted to normalize microarrays?

Authors

Lu Tim, Costello Christine M, Croucher Peter J P, Häsler Robert, Deuschl Günther, Schreiber Stefan

Affiliation

Department of Medicine, Christian-Albrechts-University, Kiel, Germany.

Publication

BMC Bioinformatics. 2005 Feb 23;6:37. doi: 10.1186/1471-2105-6-37.

DOI:10.1186/1471-2105-6-37
PMID:15727680
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC555536/
Abstract

BACKGROUND

Normalization is the process of removing non-biological sources of variation between array experiments. Recent investigations of data in gene expression databases for varying organisms and tissues have shown that the majority of expressed genes exhibit a power-law distribution with an exponent close to -1 (i.e. obey Zipf's law). Based on the observation that our single channel and two channel microarray data sets also followed a power-law distribution, we were motivated to develop a normalization method based on this law, and examine how it compares with existing published techniques. A computationally simple and intuitively appealing technique based on this observation is presented.
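
The power-law observation above can be checked on any intensity vector by regressing log intensity on log rank. The following is a minimal sketch, not the authors' procedure: it assumes the rank-intensity form of Zipf's law (intensity proportional to 1/rank), and the function name `zipf_exponent` and the synthetic data are hypothetical.

```python
import numpy as np

def zipf_exponent(intensities):
    """Estimate the slope of log10(intensity) versus log10(rank).

    Assumption: Zipf's law is read in its rank-intensity form, so a
    slope near -1 indicates Zipf-like behaviour.
    """
    x = np.asarray(intensities, dtype=float)
    x = np.sort(x[x > 0])[::-1]              # keep positive values, brightest first
    ranks = np.arange(1, x.size + 1)
    slope, intercept = np.polyfit(np.log10(ranks), np.log10(x), 1)
    return slope, intercept

# Hypothetical synthetic intensities that follow intensity = C / rank exactly.
synthetic = 50000.0 / np.arange(1, 1001)
slope, intercept = zipf_exponent(synthetic)
print(slope, intercept)
```

On real arrays the fit would normally be restricted to the range of ranks where the log-log plot is roughly linear; that range is a choice the abstract does not specify.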

RESULTS

In pairwise comparisons on MA plots (log ratio vs. log intensity), we compared this novel method with previously published normalization techniques, namely global normalization to the mean, the quantile method, and a variation on the loess normalization method designed specifically for boutique microarrays. Results indicated that, for single channel microarrays, the quantile method was superior with regard to eliminating intensity-dependent effects (banana curves), but Zipf's law normalization does minimize this effect by rotating the data distribution such that the maximal number of data points lie on the zero of the log ratio axis. For two channel boutique microarrays, the Zipf's law normalizations performed as well as, or better than, existing techniques.
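
For readers unfamiliar with the comparison tools named in the results, the sketch below shows the standard textbook forms of MA values, global normalization to the mean, and quantile normalization. It is an illustration under stated assumptions, not the code used in the paper: the function names and the synthetic two-array data set are hypothetical, and ties are ignored in the quantile step. The Zipf's law method itself is not reproduced because the abstract does not give its details.

```python
import numpy as np

def ma_values(x, y):
    """Return (M, A): log2 ratio and mean log2 intensity of two arrays."""
    lx, ly = np.log2(x), np.log2(y)
    return lx - ly, 0.5 * (lx + ly)

def global_mean_normalize(arrays):
    """Scale each column (one array per column) to a common mean intensity."""
    target = arrays.mean()
    return arrays * (target / arrays.mean(axis=0))

def quantile_normalize(arrays):
    """Give every column the same empirical distribution (ties ignored)."""
    order = np.argsort(arrays, axis=0)
    ranks = np.argsort(order, axis=0)                 # rank of each value in its column
    reference = np.sort(arrays, axis=0).mean(axis=1)  # mean of the k-th smallest values
    return reference[ranks]

# Hypothetical data: the second array is uniformly twice as bright as the first.
rng = np.random.default_rng(0)
base = rng.lognormal(mean=7.0, sigma=1.0, size=500)
arrays = np.column_stack([base, 2.0 * base])

M_raw, A_raw = ma_values(arrays[:, 0], arrays[:, 1])
scaled = global_mean_normalize(arrays)
M_scaled, _ = ma_values(scaled[:, 0], scaled[:, 1])
qn = quantile_normalize(arrays)
```

In this toy case the brightness difference is a constant factor, so global scaling alone moves every point onto M = 0. An intensity-dependent bias (a banana curve) would not be removed by a single scale factor, which is the situation the quantile and loess approaches address.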

CONCLUSION

Zipf's law normalization is a useful tool where the quantile method cannot be applied, as is the case with microarrays containing functionally specific gene sets (boutique arrays).

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07bf/555536/740329870c83/1471-2105-6-37-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07bf/555536/0628ed0cfb9c/1471-2105-6-37-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07bf/555536/59089d308f84/1471-2105-6-37-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07bf/555536/46519fa4c84b/1471-2105-6-37-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07bf/555536/599e419bca3a/1471-2105-6-37-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/07bf/555536/42a747dc71be/1471-2105-6-37-6.jpg

Similar articles

1
Can Zipf's law be adapted to normalize microarrays?
BMC Bioinformatics. 2005 Feb 23;6:37. doi: 10.1186/1471-2105-6-37.
2
Comparison of seven methods for producing Affymetrix expression scores based on False Discovery Rates in disease profiling data.
BMC Bioinformatics. 2005 Feb 10;6:26. doi: 10.1186/1471-2105-6-26.
3
The influence of missing value imputation on detection of differentially expressed genes from microarray data.
Bioinformatics. 2005 Dec 1;21(23):4272-9. doi: 10.1093/bioinformatics/bti708. Epub 2005 Oct 10.
4
A non-transformation method for identifying differentially expressed genes from cDNA microarrays.
Yi Chuan Xue Bao. 2006 Jan;33(1):80-8. doi: 10.1016/S0379-4172(06)60012-7.
5
Selection and validation of normalization methods for c-DNA microarrays using within-array replications.
Bioinformatics. 2007 Sep 15;23(18):2391-8. doi: 10.1093/bioinformatics/btm361. Epub 2007 Jul 27.
6
A new outlier removal approach for cDNA microarray normalization.
Biotechniques. 2009 Aug;47(2):691-2, 694-700. doi: 10.2144/000113195.
7
A modified LOESS normalization applied to microRNA arrays: a comparative evaluation.
Bioinformatics. 2009 Oct 15;25(20):2685-91. doi: 10.1093/bioinformatics/btp443. Epub 2009 Jul 23.
8
Assessing the need for sequence-based normalization in tiling microarray experiments.
Bioinformatics. 2007 Apr 15;23(8):988-97. doi: 10.1093/bioinformatics/btm052. Epub 2007 Mar 25.
9
A normalization strategy applied to HiCEP (an AFLP-based expression profiling) analysis: toward the strict alignment of valid fragments across electrophoretic patterns.
BMC Bioinformatics. 2005 Mar 6;6:43. doi: 10.1186/1471-2105-6-43.
10
PQN and DQN: algorithms for expression microarrays.
J Theor Biol. 2006 Nov 21;243(2):273-8. doi: 10.1016/j.jtbi.2006.06.017. Epub 2006 Jun 30.

Cited by

1
Neural network learning defines glioblastoma features to be of neural crest perivascular or radial glia lineages.
Sci Adv. 2022 Jun 10;8(23):eabm6340. doi: 10.1126/sciadv.abm6340. Epub 2022 Jun 8.
2
Combination of novel and public RNA-seq datasets to generate an mRNA expression atlas for the domestic chicken.
BMC Genomics. 2018 Aug 7;19(1):594. doi: 10.1186/s12864-018-4972-7.
3
Response to letter of correspondence - Bastiaens et al.
Nat Biotechnol. 2015 Apr;33(4):339-42. doi: 10.1038/nbt.3184.
4
Fungal gene expression levels do not display a common mode of distribution.
BMC Res Notes. 2013 Dec 28;6:559. doi: 10.1186/1756-0500-6-559.
5
Universality in network dynamics.
Nat Phys. 2013;9:673-81. doi: 10.1038/nphys2741.
6
Methods for analyzing deep sequencing expression data: constructing the human and mouse promoterome with deepCAGE data.
Genome Biol. 2009;10(7):R79. doi: 10.1186/gb-2009-10-7-r79. Epub 2009 Jul 22.
7
"Hook"-calibration of GeneChip-microarrays: theory and algorithm.
Algorithms Mol Biol. 2008 Aug 29;3:12. doi: 10.1186/1748-7188-3-12.
8
Using generalized procrustes analysis (GPA) for normalization of cDNA microarray data.
BMC Bioinformatics. 2008 Jan 16;9:25. doi: 10.1186/1471-2105-9-25.
9
Porcine transcriptome analysis based on 97 non-normalized cDNA libraries and assembly of 1,021,891 expressed sequence tags.
Genome Biol. 2007;8(4):R45. doi: 10.1186/gb-2007-8-4-r45.
10
A simple method to combine multiple molecular biomarkers for dichotomous diagnostic classification.
BMC Bioinformatics. 2006 Oct 10;7:442. doi: 10.1186/1471-2105-7-442.

References

1
Universality and flexibility in gene expression from bacteria to human.
Proc Natl Acad Sci U S A. 2004 Mar 16;101(11):3765-9. doi: 10.1073/pnas.0306244101. Epub 2004 Mar 3.
2
Zipf's law and human transcriptomes: an explanation with an evolutionary model.
C R Biol. 2003 Oct-Nov;326(10-11):1097-101. doi: 10.1016/j.crvi.2003.09.031.
3
New normalization methods for cDNA microarray data.
Bioinformatics. 2003 Jul 22;19(11):1325-32. doi: 10.1093/bioinformatics/btg146.
4
Zipf's law in gene expression.
Phys Rev Lett. 2003 Feb 28;90(8):088102. doi: 10.1103/PhysRevLett.90.088102. Epub 2003 Feb 26.
5
A comparison of normalization methods for high density oligonucleotide array data based on variance and bias.
Bioinformatics. 2003 Jan 22;19(2):185-93. doi: 10.1093/bioinformatics/19.2.185.
6
General statistics of stochastic process of gene expression in eukaryotic cells.
Genetics. 2002 Jul;161(3):1321-32. doi: 10.1093/genetics/161.3.1321.
7
Ranking: a closer look on globalisation methods for normalisation of gene expression arrays.
Nucleic Acids Res. 2002 Jun 1;30(11):e50. doi: 10.1093/nar/30.11.e50.
8
Making sense of microarray data distributions.
Bioinformatics. 2002 Apr;18(4):576-84. doi: 10.1093/bioinformatics/18.4.576.
9
Adjustments and measures of differential expression for microarray data.
Bioinformatics. 2002 Feb;18(2):251-60. doi: 10.1093/bioinformatics/18.2.251.
10
Feature extraction and normalization algorithms for high-density oligonucleotide gene expression array data.
J Cell Biochem Suppl. 2001;Suppl 37:120-5. doi: 10.1002/jcb.10073.