Corredor Germán, Whitney Jon, Arias Viviana, Madabhushi Anant, Romero Eduardo
Universidad Nacional de Colombia, Computer Imaging and Medical Applications Lab, Department of Medical Imaging, Bogota, Colombia; Case Western Reserve University, Center of Computational Imaging and Personalized Diagnostics, Department of Biomedical Engineering, Cleveland, Ohio, United States.
J Med Imaging (Bellingham). 2017 Apr;4(2):021105. doi: 10.1117/1.JMI.4.2.021105. Epub 2017 Mar 11.
Computational histomorphometric approaches typically use low-level image features to build machine learning classifiers. However, these approaches usually ignore high-level expert knowledge. A computational model (M_im) combines low-, mid-, and high-level image information to predict the likelihood of cancer in whole slide images. Handcrafted low- and mid-level features are computed from area, color, and spatial nuclei distributions. High-level information is implicitly captured from the recorded navigations of pathologists while exploring whole slide images during diagnostic tasks. The model was validated by predicting the presence of cancer in a set of unseen fields of view. The available database comprised 24 cases of basal-cell carcinoma, of which 17 served to estimate the model parameters and the remaining 7 formed the evaluation set. A total of 274 fields of view of size [Formula: see text] were extracted from the evaluation set. Then 176 patches from this set were used to train a support vector machine classifier to predict the presence of cancer on a patch-by-patch basis, while the remaining 98 image patches were used for independent testing, ensuring that the training and test sets did not contain patches from the same patient. A baseline model (M_ex) estimated the cancer likelihood for each of the image patches. M_ex uses the same visual features as M_im, but its weights are estimated from nuclei manually labeled as cancerous or noncancerous by a pathologist. M_im achieved an accuracy of 74.49% and an F-measure of 80.31%, while M_ex yielded corresponding accuracy and F-measure values of 73.47% and 77.97%, respectively.
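Two methodological points in the abstract lend themselves to a short illustration: the patient-wise separation of training and test patches, and the accuracy/F-measure evaluation metrics. The following sketch is not the authors' code; the patch records, patient IDs, and labels are hypothetical, and it only demonstrates the splitting constraint and the standard metric definitions.

```python
# Illustrative sketch (not the published implementation): patient-wise
# splitting of image patches, and accuracy / F-measure computation for
# binary cancer predictions (1 = cancer, 0 = non-cancer).

def split_by_patient(patches, train_patients):
    """Assign each patch to the train or test set by its patient ID,
    so the two sets never share patches from the same patient."""
    train, test = [], []
    for patch in patches:
        (train if patch["patient"] in train_patients else test).append(patch)
    return train, test

def accuracy_and_f_measure(y_true, y_pred):
    """Accuracy and F-measure (harmonic mean of precision and recall)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    acc = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f = (2 * precision * recall / (precision + recall)
         if (precision + recall) else 0.0)
    return acc, f

# Hypothetical usage with toy data.
patches = [{"patient": "p1"}, {"patient": "p2"}, {"patient": "p1"}]
train, test = split_by_patient(patches, train_patients={"p1"})
acc, f = accuracy_and_f_measure([1, 0, 1, 1], [1, 0, 0, 1])
```

In the reported experiments the split was at the patient level (17 cases for model estimation, 7 for evaluation), which this helper enforces by construction; any per-patch random split would risk leaking appearance cues between sets.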