Basic properties and information theory of Audic-Claverie statistic for analyzing cDNA arrays.

Author information

School of Computer Science, The University of Birmingham, Birmingham, B15 2TT, UK.

Publication information

BMC Bioinformatics. 2009 Sep 23;10:310. doi: 10.1186/1471-2105-10-310.

DOI:10.1186/1471-2105-10-310

PMID:19775462

Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC2761412/

Abstract

BACKGROUND

The Audic-Claverie method [1] has long been, and remains, a popular approach for detecting differentially expressed genes in the SAGE framework. The method assumes that, under the null hypothesis, the tag counts of the same gene in two libraries come from the same but unknown Poisson distribution. The difficulty is that each SAGE library represents only a single measurement. We ask: given that tag count samples from SAGE libraries are extremely limited, how useful is the Audic-Claverie methodology in practice? We rigorously analyze the A-C statistic, which forms the backbone of the methodology and represents our knowledge of the underlying tag-generating process based on one observation.
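For orientation, the statistic discussed above can be evaluated directly. The sketch below assumes the closed form usually attributed to Audic and Claverie, P(y | x) = (N2/N1)^y (x+y)! / (x! y! (1 + N2/N1)^(x+y+1)), which this abstract does not spell out. The function name `ac_statistic` and the `ratio` parameter (library size ratio N2/N1, defaulting to equal sizes) are illustrative choices, not names from the paper.

```python
from math import exp, lgamma, log


def ac_statistic(y: int, x: int, ratio: float = 1.0) -> float:
    """Probability of seeing y tags in library 2 given x tags in library 1.

    Assumed closed form (not quoted in the abstract):
        P(y | x) = ratio**y * (x + y)! / (x! * y! * (1 + ratio)**(x + y + 1))
    where ratio = N2 / N1 is the hypothetical library size ratio.
    """
    # Work in log space so that large counts do not overflow the factorials.
    log_p = (
        y * log(ratio)
        + lgamma(x + y + 1)      # log (x + y)!
        - lgamma(x + 1)          # log x!
        - lgamma(y + 1)          # log y!
        - (x + y + 1) * log(1.0 + ratio)
    )
    return exp(log_p)


# Illustrative check for an observed count x = 3 in equal-sized libraries.
x_observed = 3
probabilities = [ac_statistic(y, x_observed) for y in range(200)]
total_mass = sum(probabilities)        # should be very close to 1
p_at_x = ac_statistic(3, x_observed)   # 20 / 128
p_below_x = ac_statistic(2, x_observed)  # 10 / 64, the same value
```

With equal library sizes the two values `p_at_x` and `p_below_x` coincide, so the statistic conditioned on x has tied modes at x - 1 and x, just as a Poisson distribution with integer mean x does.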

RESULTS

We show that the A-C statistic and the underlying Poisson distribution of the tag counts share the same mode structure. Moreover, the K-L divergence from the true unknown Poisson distribution to the A-C statistic is minimized when the A-C statistic is conditioned on the mode of the Poisson distribution. Most importantly, the expectation of this K-L divergence never exceeds 1/2 bit.
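The statement about the minimizing condition can be explored numerically. The following is a minimal sketch, assuming equal library sizes and the closed form P(y | x) = (x+y)! / (x! y! 2^(x+y+1)) for the A-C statistic (the formula is not given in this abstract). It evaluates the K-L divergence, in bits, from a Poisson distribution with a hypothetical rate `lam` to the A-C statistic conditioned on each candidate count x. The rate 5.5, the truncation point `y_max`, and all function names are illustrative assumptions.

```python
from math import exp, lgamma, log


def log_poisson(y: int, lam: float) -> float:
    """Natural log of the Poisson probability of count y at rate lam."""
    return y * log(lam) - lam - lgamma(y + 1)


def log_ac(y: int, x: int) -> float:
    """Natural log of the assumed A-C statistic P(y | x), equal library sizes."""
    return (
        lgamma(x + y + 1) - lgamma(x + 1) - lgamma(y + 1)
        - (x + y + 1) * log(2.0)
    )


def kl_bits(lam: float, x: int, y_max: int = 400) -> float:
    """K-L divergence (bits) from Poisson(lam) to the A-C statistic given x.

    The infinite sum over y is truncated at y_max, which is ample for
    small rates because the Poisson tail mass beyond it is negligible.
    """
    total = 0.0
    for y in range(y_max + 1):
        lp = log_poisson(y, lam)
        total += exp(lp) * (lp - log_ac(y, x))
    return total / log(2.0)  # convert nats to bits


lam = 5.5  # hypothetical true rate; its Poisson mode is floor(5.5) = 5
divergences = {x: kl_bits(lam, x) for x in range(21)}
best_x = min(divergences, key=divergences.get)
```

Under these assumptions, `best_x` can be compared against the mode of the Poisson distribution; trying several non-integer rates is a quick way to see the pattern described in the results.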

CONCLUSION

A rigorous underpinning of the Audic-Claverie methodology has been missing. Our results constitute a rigorous argument supporting the use of the Audic-Claverie method even though SAGE libraries represent very sparse samples.
