Phan John H, Young Andrew N, Wang May D
Dept. of Biomed. Eng., Georgia Inst. of Technol., Atlanta, GA 30332, USA.
Conf Proc IEEE Eng Med Biol Soc. 2006;2006:3317-20. doi: 10.1109/IEMBS.2006.259746.
The challenge of biomarker identification for bionanotechnology is that fewer than ten potential biomarkers must be selected from high-throughput data for quantum dot synthesis and imaging to be effective. Within the extensive body of biomarker research, the novelty of our work is to reduce the number of biomarkers by studying the efficacy of several classifiers and error estimation methods. Specifically, we use renal cancer expression data. The dataset consists of 31 microarray samples divided into four classes: clear cell, oncocytoma/chromophobe, papillary, and angiomyolipoma. Each class is compared to all other classes using error estimation methods for support vector machines (SVM), Fisher's discriminant (FD), and the signed distance function (SDF). Prior knowledge of significant biomarkers from a previous study is used to score how effectively each classifier identifies these biomarkers. The result is intelligent model selection for biomarker identification that keeps the total number of nano-imaging targets small.
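As a rough illustration of the one-versus-rest comparison described in the abstract, the sketch below ranks genes by a leave-one-out error estimate for a single target class. It is not the authors' method. The classifier is a deliberately simple stand-in (a one-dimensional rule that thresholds at the midpoint of the two class means) rather than the SVM, FD, or SDF classifiers studied in the paper. The function names, the per-class sample counts, and the toy expression values are all assumptions made for illustration; only the total of 31 samples and the four class names come from the abstract.

```python
def midpoint_classifier(train_x, train_y):
    """Fit a 1-D two-class rule: threshold at the midpoint of the class means.

    This is a simple stand-in classifier, not SVM, FD, or SDF.
    """
    pos = [x for x, y in zip(train_x, train_y) if y == 1]
    neg = [x for x, y in zip(train_x, train_y) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes need at least one training sample")
    mean_pos = sum(pos) / len(pos)
    mean_neg = sum(neg) / len(neg)
    threshold = (mean_pos + mean_neg) / 2.0
    # Orientation: which side of the threshold the target class lies on.
    sign = 1 if mean_pos >= mean_neg else -1
    return lambda x: 1 if sign * (x - threshold) > 0 else 0


def loo_error(values, labels):
    """Leave-one-out error estimate for one gene under one binary labeling."""
    n = len(values)
    errors = 0
    for i in range(n):
        # Train on every sample except i, then test on sample i.
        predict = midpoint_classifier(values[:i] + values[i + 1:],
                                      labels[:i] + labels[i + 1:])
        if predict(values[i]) != labels[i]:
            errors += 1
    return errors / n


def rank_genes_one_vs_rest(expr, classes, target, top_k=10):
    """Rank genes by LOO error for `target` versus all other classes."""
    labels = [1 if c == target else 0 for c in classes]
    scored = sorted((loo_error(vals, labels), gene)
                    for gene, vals in expr.items())
    return scored[:top_k]


def make_toy_data(target):
    """Build deterministic toy data; class sizes and values are made up."""
    classes = (["clear_cell"] * 13 + ["oncocytoma_chromophobe"] * 8
               + ["papillary"] * 5 + ["angiomyolipoma"] * 5)
    # One gene that cleanly separates the target class from the rest.
    expr = {"marker_gene": [(5.0 if c == target else 0.0) + 0.1 * (i % 3)
                            for i, c in enumerate(classes)]}
    # A few genes whose values ignore the class labels entirely.
    for j in range(5):
        expr["noise_%02d" % j] = [((i * 7 + j * 3) % 11) / 10.0
                                  for i in range(len(classes))]
    return expr, classes


expr, classes = make_toy_data("papillary")
for error, gene in rank_genes_one_vs_rest(expr, classes, "papillary", top_k=3):
    print(gene, round(error, 3))
```

Repeating the ranking with each of the four classes as the target, and with each classifier and error estimator of interest in place of the stand-in, gives one candidate list per configuration. These lists could then be scored against a set of previously known biomarkers, as the abstract describes.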