Karn Thomas, Rody Achim, Müller Volkmar, Schmidt Marcus, Becker Sven, Holtrich Uwe, Pusztai Lajos
Department of Gynecology, Goethe-University Frankfurt, Frankfurt am Main, Germany.
Department of Obstetrics and Gynecology, University Hospital Lübeck, Germany.
Genom Data. 2014 Oct 23;2:354-6. doi: 10.1016/j.gdata.2014.09.014. eCollection 2014 Dec.
Heterogeneous subtypes of breast cancer need to be analyzed separately. Pooling datasets can provide reasonable sample sizes, but dataset bias is an important concern. We assembled a combined dataset of 579 Affymetrix microarrays from triple negative breast cancer (TNBC) in Gene Expression Omnibus (GEO) series GSE31519. We developed a method for selecting comparable datasets and for controlling the amount of dataset bias in individual probesets.
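The abstract does not specify how dataset bias is measured for each probeset. As a minimal sketch only, and not necessarily the authors' method, one common way to quantify it is the fraction of a probeset's expression variance explained by dataset membership (eta squared from a one-way ANOVA decomposition). The function name `dataset_bias_eta_squared`, the toy matrix, and the dataset labels below are hypothetical and chosen purely for illustration.

```python
import numpy as np


def dataset_bias_eta_squared(expr, dataset_labels):
    """Per-probeset fraction of variance explained by dataset membership.

    Illustrative assumption, not the published method.
    expr: array of shape (n_probesets, n_samples)
    dataset_labels: sequence of length n_samples naming each sample's dataset
    """
    expr = np.asarray(expr, dtype=float)
    labels = np.asarray(dataset_labels)

    grand_mean = expr.mean(axis=1, keepdims=True)
    ss_total = ((expr - grand_mean) ** 2).sum(axis=1)

    # Between-dataset sum of squares, accumulated one dataset at a time.
    ss_between = np.zeros(expr.shape[0])
    for label in np.unique(labels):
        block = expr[:, labels == label]
        ss_between += block.shape[1] * (block.mean(axis=1) - grand_mean[:, 0]) ** 2

    # Probesets with zero total variance are reported as having no bias.
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(ss_total > 0, ss_between / ss_total, 0.0)


# Toy example: probeset 0 differs only between datasets, probeset 1 only within.
toy_expr = [[1.0, 1.0, 5.0, 5.0],
            [1.0, 2.0, 1.0, 2.0]]
toy_labels = ["A", "A", "B", "B"]
bias = dataset_bias_eta_squared(toy_expr, toy_labels)
print(bias)  # → [1. 0.]
```

Under this assumed measure, values near 1 flag probesets whose variation is dominated by the source dataset, and values near 0 indicate little dataset effect. Any cutoff for excluding probesets would be a further assumption not given in the abstract.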