Malek Aminah Abdul, Alias Mohd Almie, Razak Fatimah Abdul, Noorani Mohd Salmi Md, Mahmud Rozi, Zulkepli Nur Fariha Syaqina
Department of Mathematical Sciences, Faculty of Science & Technology, Universiti Kebangsaan Malaysia (UKM), Bangi 43600, Selangor, Malaysia.
Mathematical Sciences Studies, College of Computing, Informatics and Media, Universiti Teknologi MARA (UiTM) Negeri Sembilan Branch, Seremban Campus, Seremban 70300, Negeri Sembilan, Malaysia.
Cancers (Basel). 2023 May 4;15(9):2606. doi: 10.3390/cancers15092606.
Microcalcifications in mammogram images are primary indicators of early-stage breast cancer. However, dense tissue and image noise make microcalcifications difficult to classify. Current preprocessing procedures, such as noise removal, are applied directly to the images, which can blur them and discard fine detail. Furthermore, most features used in classification models capture only local image information and often carry excessive detail, which increases data complexity. This study proposes a filtering and feature extraction technique based on persistent homology (PH), a mathematical tool for studying the structure of complex datasets and patterns. Filtering is performed not on the image matrix itself but on the diagrams arising from PH, which make it possible to distinguish prominent characteristics of the image from noise. The filtered diagrams are then vectorised using PH features. Supervised machine learning models are trained on the MIAS and DDSM datasets to evaluate how well the extracted features discriminate between benign and malignant classes and to determine the optimal filtering level. The results indicate that appropriate PH filtering levels and features can improve classification accuracy in early cancer detection.
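The idea of filtering on the diagram rather than on the image can be illustrated with a minimal sketch. It assumes a persistence diagram is already available as an array of finite (birth, death) pairs, that "filtering" means discarding pairs whose persistence (death minus birth) falls at or below a threshold, and that the vector consists of a few common summary statistics. The function names `filter_diagram` and `vectorise`, the threshold value, and the choice of statistics are illustrative assumptions; the abstract does not specify the filtering rule or the PH features actually used.

```python
import numpy as np


def filter_diagram(diagram, threshold):
    """Keep only (birth, death) pairs with persistence above `threshold`.

    Assumption: short-lived pairs are treated as noise. All pairs are
    assumed finite; infinite bars would need separate handling.
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    persistence = diagram[:, 1] - diagram[:, 0]
    return diagram[persistence > threshold]


def vectorise(diagram):
    """Summarise a diagram as [count, total, max, entropy] of lifetimes.

    These four statistics are a hypothetical feature set chosen for
    illustration, not the feature set of the study.
    """
    diagram = np.asarray(diagram, dtype=float).reshape(-1, 2)
    lifetimes = diagram[:, 1] - diagram[:, 0]
    lifetimes = lifetimes[lifetimes > 0]  # drop zero-length pairs
    if lifetimes.size == 0:
        return np.zeros(4)
    total = lifetimes.sum()
    p = lifetimes / total
    entropy = -np.sum(p * np.log(p))  # persistent entropy
    return np.array([lifetimes.size, total, lifetimes.max(), entropy])


# Toy diagram: two long-lived pairs and two short-lived (noise-like) pairs.
toy_diagram = [(0.10, 0.90), (0.20, 0.25), (0.40, 0.42), (0.30, 0.80)]
filtered = filter_diagram(toy_diagram, threshold=0.1)
features = vectorise(filtered)
```

In such a setup the threshold plays the role of the filtering level: one feature vector per image would be computed for each candidate threshold and passed to a supervised classifier, and the threshold giving the best validation accuracy would be retained.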