School of Electronic Engineering, Beijing University of Posts and Telecommunications, Beijing 100876, China.
Department of Breast Center, Peking University People's Hospital, Beijing, 100044, China.
Spectrochim Acta A Mol Biomol Spectrosc. 2022 Dec 15;283:121715. doi: 10.1016/j.saa.2022.121715. Epub 2022 Aug 5.
Early detection of breast cancer is of great value in improving prognosis, but each of the current detection methods has its own limitations. In this study, we investigated the feasibility of Fourier transform infrared (FT-IR) spectroscopy combined with different classification algorithms for the early detection of breast cancer in a large sample of 526 participants: 308 patients with invasive breast cancer, 101 patients with ductal carcinoma in situ, and 117 healthy controls. Serum samples were measured with FT-IR spectroscopy. The Kennard-Stone (KS) algorithm was used to divide the data into a training set and a testing set. A support vector machine (SVM) model and a back propagation neural network (BPNN) model were used to distinguish ductal carcinoma in situ and invasive breast cancer from healthy controls. The accuracies of the SVM and BPNN models were 92.9% and 94.2%, respectively. To determine the effect of different material absorption bands on early detection, the spectrum was divided into four regions, 900-1425 cm⁻¹, 1475-1710 cm⁻¹, 2800-3000 cm⁻¹, and 3090-3700 cm⁻¹, which were modeled and tested separately. The regions 900-1425 cm⁻¹ and 1475-1710 cm⁻¹ gave higher classification accuracies than the other two regions. The region 900-1425 cm⁻¹ corresponds to lipids, proteins, sugars, and nucleic acids, and the region 1475-1710 cm⁻¹ corresponds to proteins. The biochemical substances in the other bands also contributed distinct information to the classification, so classification accuracy was highest when the full band was used. The study indicates that serum FT-IR spectroscopy combined with SVM and BPNN models is an effective tool for the early detection of breast cancer.
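The Kennard-Stone split and the per-region modeling described in the abstract can be illustrated with a minimal sketch. It assumes Euclidean distance between spectra and the common max-min form of Kennard-Stone. The abstract does not give the authors' preprocessing, split ratio, or model settings, so the function names (`kennard_stone_split`, `band_columns`) and the toy data in the test are hypothetical, not the authors' implementation.

```python
import math

# The four spectral regions named in the abstract, in cm^-1.
BANDS = [(900, 1425), (1475, 1710), (2800, 3000), (3090, 3700)]


def kennard_stone_split(samples, n_train):
    """Pick n_train sample indices by the Kennard-Stone max-min rule.

    samples: list of equal-length numeric sequences (one spectrum each).
    Returns (train_indices, test_indices).
    Assumption: Euclidean distance; the source does not name the metric.
    """
    n = len(samples)
    if not 2 <= n_train <= n:
        raise ValueError("n_train must be between 2 and len(samples)")

    # Pairwise distance matrix (O(n^2) memory, fine for a sketch).
    dist = [[math.dist(a, b) for b in samples] for a in samples]

    # Seed the training set with the two most distant samples.
    first, second = max(
        ((i, j) for i in range(n) for j in range(i + 1, n)),
        key=lambda pair: dist[pair[0]][pair[1]],
    )
    selected = [first, second]
    remaining = [k for k in range(n) if k not in (first, second)]

    # Distance from each remaining sample to its nearest selected sample.
    nearest = {k: min(dist[k][first], dist[k][second]) for k in remaining}

    while len(selected) < n_train:
        # Add the sample that is farthest from everything chosen so far.
        chosen = max(remaining, key=lambda k: nearest[k])
        selected.append(chosen)
        remaining.remove(chosen)
        del nearest[chosen]
        for k in remaining:
            nearest[k] = min(nearest[k], dist[k][chosen])

    return selected, remaining


def band_columns(wavenumbers, low, high):
    """Return the column indices whose wavenumber lies in [low, high]."""
    return [i for i, w in enumerate(wavenumbers) if low <= w <= high]
```

In a workflow like the one described, the training indices would select the spectra used to fit a classifier, and `band_columns` would restrict each spectrum to one of the `BANDS` regions before fitting, so that per-region accuracy can be compared with full-band accuracy.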