University of Western Australia, School of Physics, Crawley 6009, Australia.
J Biomed Opt. 2012 Jan;17(1):016005. doi: 10.1117/1.JBO.17.1.016005.
We investigate the efficacy of data reduction techniques as an aid to classifying terahertz (THz) pulse data obtained from tumor and normal breast tissue. Fifty-one samples were studied from patients undergoing breast surgery at Addenbrooke's Hospital in Cambridge and Guy's Hospital in London. Three methods of data reduction were used: ten heuristic parameters, principal components of the pulses, and principal components of the ten-parameter space. Classification was performed using a support vector machine (SVM) with a radial basis function kernel. When all ten components were used, the principal components of the pulses and the principal components of the parameter space gave the best classification accuracy, 92%. When fewer than ten components were used, the principal components of the parameter space outperformed the other methods. As a visual demonstration of the classification technique, we apply the data reduction and classification to several example images and show that, aside from some interpatient variability and edge effects, the algorithm classifies terahertz data from breast tissue well. The results indicate that, under controlled conditions, data reduction and SVM classification can be used with good accuracy to distinguish tumor from normal breast tissue.
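One of the pipelines described above (principal components of the pulses, followed by an SVM with a radial basis function kernel) can be sketched as follows. This is a minimal illustration and not the authors' implementation: the pulses are synthetic stand-ins, the pulse shape, noise level, sample counts, and the helper names `make_pulses` and `build_classifier` are assumptions made for the example, and scikit-learn is assumed as the library.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for THz time-domain pulses (NOT the study data).
n_per_class, n_points = 60, 128
t = np.linspace(-1.0, 1.0, n_points)


def make_pulses(width, n):
    """Hypothetical helper: n noisy pulses of a given temporal width."""
    base = -t * np.exp(-((t / width) ** 2))
    return base + 0.05 * rng.standard_normal((n, n_points))


# Two classes that differ only in pulse width (an assumption for illustration).
X = np.vstack([make_pulses(0.20, n_per_class), make_pulses(0.25, n_per_class)])
y = np.array([0] * n_per_class + [1] * n_per_class)


def build_classifier(n_components):
    """Hypothetical helper: standardize, reduce with PCA, classify with an RBF SVM."""
    return make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components),
        SVC(kernel="rbf", gamma="scale"),
    )


# Five-fold cross-validated accuracy using ten principal components.
scores = cross_val_score(build_classifier(10), X, y, cv=5)
print(f"mean accuracy with 10 components: {scores.mean():.2f}")
```

Fitting the PCA inside the pipeline means the components are re-estimated on each training fold, so the held-out pulses do not influence the data reduction. Calling `build_classifier` with a smaller `n_components` gives a rough analogue of the fewer-than-ten-components comparison, although the numbers obtained on synthetic pulses say nothing about the reported 92%.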