Utilization of two sample t-test statistics from redundant probe sets to evaluate different probe set algorithms in GeneChip studies.

Authors

Hu Zihua, Willsky Gail R

Affiliation

Center for Computational Research, Department of Biostatistics, University at Buffalo, Buffalo, NY 14260, USA.

Publication

BMC Bioinformatics. 2006 Jan 10;7:12. doi: 10.1186/1471-2105-7-12.

DOI:10.1186/1471-2105-7-12
PMID:16403228
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC1361777/
Abstract

BACKGROUND

The choice of probe set algorithms for expression summary in a GeneChip study has a great impact on subsequent gene expression data analysis. Spiked-in cRNAs with known concentration are often used to assess the relative performance of probe set algorithms. Given the fact that the spiked-in cRNAs do not represent endogenously expressed genes in experiments, it becomes increasingly important to have methods to study whether a particular probe set algorithm is more appropriate for a specific dataset, without using such external reference data.

RESULTS

We propose using the probe set redundancy feature to evaluate the performance of probe set algorithms, and present three approaches for analyzing data variance and result bias using two-sample t-test statistics from redundant probe sets: 1) analyzing redundant probe set variance based on t-statistic rank order, 2) computing the correlation of t-statistics between redundant probe sets, and 3) analyzing the co-occurrence of replicate redundant probe sets representing differentially expressed genes. We applied these approaches to expression summary data generated from three datasets using the individual probe set algorithms MAS5.0, dChip, or RMA, as well as combinations of options from the three algorithms. Results from the three approaches were similar within each individual expression summary dataset and were in good agreement with findings previously reported by others. We also demonstrate the validity of our findings by independent experimental methods.
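The second approach (correlation of t-statistics between redundant probe sets) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not say which t-test variant was used, so the Welch (unequal-variance) form is assumed here, and the gene names, expression values, and the helper names `welch_t`, `pearson`, and `redundant_t_pairs` are all hypothetical.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Two-sample t-statistic for mean(a) - mean(b), Welch form (an assumption)."""
    se = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / se


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical log-scale expression summaries.
# gene -> two redundant probe sets, each given as (control, treated) replicates.
expression = {
    "geneA": [([7.1, 7.3, 7.0], [8.9, 9.1, 9.0]), ([6.8, 7.0, 6.9], [8.5, 8.8, 8.6])],
    "geneB": [([5.0, 5.2, 5.1], [5.1, 5.0, 5.3]), ([4.9, 5.1, 5.0], [5.2, 5.0, 5.1])],
    "geneC": [([9.0, 9.2, 9.1], [7.9, 8.1, 8.0]), ([8.8, 9.1, 9.0], [7.7, 8.0, 7.8])],
}


def redundant_t_pairs(data):
    """Return the t-statistics of the first and second probe set of each gene."""
    first, second = [], []
    for probe_sets in data.values():
        (c1, t1), (c2, t2) = probe_sets[:2]
        first.append(welch_t(t1, c1))
        second.append(welch_t(t2, c2))
    return first, second


t_first, t_second = redundant_t_pairs(expression)
r = pearson(t_first, t_second)
print(f"correlation of t-statistics between redundant probe sets: {r:.3f}")
```

Under this reading, a probe set algorithm that yields a higher correlation between redundant probe sets on the same dataset would be showing less variance between probe sets that target the same gene.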

CONCLUSION

All three proposed approaches allowed us to assess the performance of probe set algorithms using the probe set redundancy feature. The analyses of redundant probe set variance based on t-statistic rank order and correlation of t-statistics between redundant probe sets provide useful tools for data variance analysis, and the co-occurrence of replicate redundant probe sets representing differentially expressed genes allows estimation of result bias. The results also suggest that individual probe set algorithms have dataset-specific performance.
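The co-occurrence idea can be sketched in the same spirit. The abstract does not define the exact measure, so the version below is an assumed one: among genes where at least one of two redundant probe sets passes a t-statistic cutoff, it reports the fraction where both pass with the same sign. The function name `cooccurrence_rate`, the cutoff, and the t-statistic pairs are hypothetical.

```python
def cooccurrence_rate(t_pairs, threshold):
    """Fraction of called genes whose two redundant probe sets agree.

    A gene is "called" when at least one probe set has |t| >= threshold.
    It "co-occurs" when both probe sets pass the cutoff in the same direction.
    """
    called_any = 0
    called_both = 0
    for t1, t2 in t_pairs:
        hit1 = abs(t1) >= threshold
        hit2 = abs(t2) >= threshold
        if hit1 or hit2:
            called_any += 1
            if hit1 and hit2 and (t1 > 0) == (t2 > 0):
                called_both += 1
    return called_both / called_any if called_any else float("nan")


# Hypothetical t-statistics for five genes, two redundant probe sets each.
pairs = [(6.2, 5.8), (4.9, 1.1), (-5.5, -4.7), (0.3, -0.8), (4.4, -4.6)]
rate = cooccurrence_rate(pairs, threshold=4.0)
print(f"co-occurrence rate: {rate:.2f}")
```

A low rate under such a measure would suggest that calls of differential expression depend on which probe set was used, which is one way to read the "result bias" the conclusion refers to.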

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b4c/1361777/8d0c8b6948b0/1471-2105-7-12-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b4c/1361777/50e474d4b9de/1471-2105-7-12-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b4c/1361777/49ee7e7ae107/1471-2105-7-12-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b4c/1361777/5a8c6829a6ce/1471-2105-7-12-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b4c/1361777/e7ddd2468d68/1471-2105-7-12-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b4c/1361777/aa231a16bbd3/1471-2105-7-12-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b4c/1361777/d7b1d6f5a89a/1471-2105-7-12-7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8b4c/1361777/3f3bd3ac4806/1471-2105-7-12-8.jpg
