Department of Electrical and Electronics, Polytechnic of Bari, Via E. Orabona, 4, 70125 Bari, Italy.
BMC Bioinformatics. 2012 May 8;13 Suppl 7(Suppl 7):S9. doi: 10.1186/1471-2105-13-S7-S9.
DNA microarray data are used to identify genes that could serve as prognostic markers. However, because each study has a limited sample size, the resulting signatures are unstable in their gene composition and may be limited in performance. It is therefore of great interest to integrate different studies and thereby increase the sample size.
Several earlier studies explored microarray data merging, but the arrival of new techniques and a focus on SVM-based classification called for further investigation. We applied distant metastasis prediction, based on SVM attribute selection and classification, to three breast cancer data sets.
The results showed that breast cancer classification does not benefit from data merging, confirming the results found by other studies with different techniques.
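The comparison described above (training on one study versus training on merged studies, then classifying an independent study) can be sketched in a few lines. This is only an illustrative sketch under several assumptions not taken from the abstract: the data are synthetic, the linear SVM is trained with a Pegasos-style subgradient method and has no bias term, and attribute selection is approximated by ranking genes on the absolute value of the SVM weights. All function names and parameters (`make_study`, `batch_shift`, `k`, and so on) are hypothetical.

```python
import random

def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Pegasos-style stochastic subgradient descent; labels must be -1 or +1."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    t = 0
    for _ in range(epochs):
        order = list(range(len(X)))
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)
            margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
            # Shrink the weights (regularization step).
            w = [(1.0 - eta * lam) * wj for wj in w]
            # Hinge-loss step, only for samples inside the margin.
            if margin < 1:
                w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w

def select_features(X, y, k):
    """Assumed selection rule: keep the k genes with the largest |weight|."""
    w = train_linear_svm(X, y)
    ranked = sorted(range(len(w)), key=lambda j: abs(w[j]), reverse=True)
    return sorted(ranked[:k])

def project(X, cols):
    return [[row[j] for j in cols] for row in X]

def predict(w, X):
    return [1 if sum(wj * xj for wj, xj in zip(w, row)) >= 0 else -1 for row in X]

def accuracy(y_true, y_pred):
    return sum(a == b for a, b in zip(y_true, y_pred)) / len(y_true)

def make_study(n, n_genes=20, informative=(0, 1, 2), batch_shift=0.0, seed=0):
    """Synthetic 'study': Gaussian expression values plus a study-wide offset."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.choice([-1, 1])
        row = [rng.gauss(batch_shift, 1.0) for _ in range(n_genes)]
        for j in informative:
            row[j] += 1.5 * label  # informative genes track the outcome
        X.append(row)
        y.append(label)
    return X, y

def evaluate(train_sets, test_set, k=5):
    """Select genes and train on the pooled training sets, test on a held-out study."""
    X_train = [row for X, _ in train_sets for row in X]
    y_train = [lab for _, y in train_sets for lab in y]
    cols = select_features(X_train, y_train, k)
    w = train_linear_svm(project(X_train, cols), y_train)
    X_test, y_test = test_set
    return cols, accuracy(y_test, predict(w, project(X_test, cols)))

study_a = make_study(40, seed=1)
study_b = make_study(40, batch_shift=0.5, seed=2)  # simulated batch effect
study_c = make_study(40, seed=3)                   # independent test study

cols_single, acc_single = evaluate([study_a], study_c)
cols_merged, acc_merged = evaluate([study_a, study_b], study_c)
print("single study  :", cols_single, acc_single)
print("merged studies:", cols_merged, acc_merged)
```

Comparing `acc_single` with `acc_merged` on the held-out study mirrors the question the abstract asks. The numbers produced by this toy setup say nothing about real breast cancer data; they only show the shape of the experiment.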