Goh Jia Yin, Khang Tsung Fei
Institute of Mathematical Sciences, Faculty of Science, Universiti Malaya, Kuala Lumpur, Malaysia.
Universiti Malaya Centre for Data Analytics, Universiti Malaya, Kuala Lumpur, Malaysia.
PeerJ Comput Sci. 2021 Sep 9;7:e698. doi: 10.7717/peerj-cs.698. eCollection 2021.
In image analysis, orthogonal moments are useful mathematical transformations for creating new features from digital images. Moreover, orthogonal moment invariants produce image features that are resistant to translation, rotation, and scaling operations. Here, we present the results of a case study in biological image analysis to help researchers judge the potential efficacy of image features derived from orthogonal moments in a machine learning context. In taxonomic classification of forensically important flies from the Sarcophagidae and Calliphoridae families (n = 74), we found that the GUIDE random forests model correctly classified all samples from 15 different species based on Krawtchouk moment invariant features generated from fly wing images, with zero out-of-bag error probability. For the more challenging problem of classifying breast masses based solely on digital mammograms from the CBIS-DDSM database (n = 1,151), we found that image features generated from the generalized pseudo-Zernike moments and the Krawtchouk moments enabled the GUIDE kernel model to achieve only modest classification performance. However, using the predicted probability of malignancy from GUIDE as a feature together with five expert features resulted in a reasonably good model with a mean sensitivity of 85%, a mean specificity of 61%, and a mean accuracy of 70%. We conclude that orthogonal moments have high potential as informative image features in taxonomic classification problems where the patterns of biological variation are not overly complex. For more complicated and heterogeneous patterns of biological variation, such as those present in medical images, relying on orthogonal moments alone to reach strong classification performance is unrealistic, but integrating predictions based on them with carefully selected expert features may still produce reasonably good prediction models.
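To make the notion of a moment invariant concrete, the following is a minimal sketch, not the method used in the study. It swaps in the classical geometric (Hu) moment invariant, which is simpler than the Krawtchouk or generalized pseudo-Zernike invariants named above, and checks that the resulting feature is unchanged when a small shape is translated or rotated by 90 degrees. The function names `phi1`, `place`, and `rot90`, and the toy L-shaped image, are assumptions introduced only for illustration.

```python
def phi1(img):
    """First Hu invariant (eta20 + eta02) of a 2-D list of pixel intensities.

    Illustrative stand-in: geometric moments, not the orthogonal
    (Krawtchouk / pseudo-Zernike) moments discussed in the abstract.
    """
    m00 = sum(v for row in img for v in row)  # total intensity
    # Intensity-weighted centroid; subtracting it gives translation invariance.
    xc = sum(x * v for row in img for x, v in enumerate(row)) / m00
    yc = sum(y * v for y, row in enumerate(img) for v in row) / m00
    mu20 = sum((x - xc) ** 2 * v for row in img for x, v in enumerate(row))
    mu02 = sum((y - yc) ** 2 * v for y, row in enumerate(img) for v in row)
    # Dividing by m00**2 normalizes second-order moments for scale.
    return (mu20 + mu02) / m00 ** 2


def place(shape, rows, cols, top, left):
    """Embed a small shape in an all-zero canvas at offset (top, left)."""
    canvas = [[0] * cols for _ in range(rows)]
    for r, row in enumerate(shape):
        for c, v in enumerate(row):
            canvas[top + r][left + c] = v
    return canvas


def rot90(img):
    """Rotate an image clockwise by 90 degrees."""
    return [list(col) for col in zip(*img[::-1])]


# Hypothetical toy image: an L-shaped blob.
shape = [[1, 1, 1],
         [1, 0, 0],
         [1, 0, 0]]

original = place(shape, 10, 10, 1, 1)
shifted = place(shape, 10, 10, 4, 5)   # same shape, translated
rotated = rot90(original)              # same shape, rotated

print(phi1(original), phi1(shifted), phi1(rotated))
```

The three printed values should agree up to floating-point error, which is the property that makes such features attractive when specimens are imaged at varying positions and orientations. Orthogonal moment invariants aim for the same kind of robustness while reducing the information redundancy of geometric moments.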