A comprehensive sensitivity analysis of microarray breast cancer classification under feature variability.

Affiliation

Molecular Diagnostics Department, Eindhoven, the Netherlands.

Publication information

BMC Bioinformatics. 2009 Nov 26;10:389. doi: 10.1186/1471-2105-10-389.

Abstract

BACKGROUND

Large discrepancies in signature composition and outcome concordance have been observed between different microarray breast cancer expression profiling studies. This is often ascribed to differences in array platform as well as biological variability. We conjecture that other reasons for the observed discrepancies are the measurement error associated with each feature and the choice of preprocessing method. Microarray data are known to be subject to technical variation and the confidence intervals around individual point estimates of expression levels can be wide. Furthermore, the estimated expression values also vary depending on the selected preprocessing scheme. In microarray breast cancer classification studies, however, these two forms of feature variability are almost always ignored and hence their exact role is unclear.

RESULTS

We have performed a comprehensive sensitivity analysis of microarray breast cancer classification under the two types of feature variability mentioned above. We used data from six state-of-the-art preprocessing methods, applied to a compendium of eight different datasets involving 1131 hybridizations and containing data from both one- and two-color array technology. For a wide range of classifiers, we performed a joint study on performance, concordance and stability. In the stability analysis we explicitly tested classifiers for their noise tolerance by using perturbed expression profiles that are based on uncertainty information directly related to the preprocessing methods. Our results indicate that signature composition is strongly influenced by feature variability, even if the array platform and the stratification of patient samples are identical. In addition, we show that there is often a high level of discordance between individual class assignments for signatures constructed on data coming from different preprocessing schemes, even if the actual signature composition is identical.
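The stability and concordance measurements described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the classifiers or the noise model, so the nearest-centroid classifier, the Gaussian perturbation, and the function names `fit_centroids`, `predict`, `perturbation_stability` and `concordance` are all assumptions made for illustration. The per-measurement standard deviations `sd` stand in for the uncertainty information that a preprocessing method might report.

```python
import random

def fit_centroids(X, y):
    """Mean expression profile (centroid) per class label."""
    centroids = {}
    for label in sorted(set(y)):
        rows = [x for x, lab in zip(X, y) if lab == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return centroids

def predict(centroids, X):
    """Assign each profile to the class with the nearest centroid."""
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(centroids, key=lambda c: sq_dist(x, centroids[c])) for x in X]

def perturbation_stability(centroids, X, sd, n_rounds=100, seed=0):
    """Per sample: fraction of perturbed profiles that keep the original label.

    sd has the same shape as X and holds one standard deviation per
    measurement (assumed here to be Gaussian and independent).
    """
    rng = random.Random(seed)
    base = predict(centroids, X)
    agree = [0] * len(X)
    for _ in range(n_rounds):
        noisy = [[v + rng.gauss(0.0, s) for v, s in zip(x, sx)]
                 for x, sx in zip(X, sd)]
        for i, (p, b) in enumerate(zip(predict(centroids, noisy), base)):
            agree[i] += (p == b)
    return [a / n_rounds for a in agree]

def concordance(labels_a, labels_b):
    """Fraction of samples given the same class by two classifiers."""
    return sum(a == b for a, b in zip(labels_a, labels_b)) / len(labels_a)
```

In this sketch, a sample whose stability is well below 1 lies close to the decision boundary relative to its measurement error, and `concordance` could be applied to the labels produced by the same signature trained on two differently preprocessed versions of the same data.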

CONCLUSION

Feature variability can have a strong impact on breast cancer signature composition, as well as on the classification of individual patient samples. We therefore strongly recommend that feature variability be considered when analyzing data from microarray breast cancer expression profiling experiments.
