Department of Hematology, Xiangya Hospital, Central South University, Changsha, 410008, China; National Clinical Research Center for Geriatric Diseases (Xiangya Hospital), China; Hunan Hematology Oncology Clinical Medical Research Center, China.
Department of Thoracic Surgery, National Cancer Center/National Clinical Research Center for Cancer/Cancer Hospital, Chinese Academy of Medical Sciences and Peking Union Medical College, Beijing, 100021, China.
Surg Oncol. 2024 Feb;52:102009. doi: 10.1016/j.suronc.2023.102009. Epub 2023 Oct 16.
In the 21st century, medical science has entered the era of big data, and machine learning has become an essential tool for mining medical big data. The Surveillance, Epidemiology, and End Results (SEER) database provides a wealth of epidemiological data for clinical cancer research, and the number of studies combining SEER with machine learning has grown in recent years. This article reviews recent SEER-based machine learning research and finds that such studies currently focus on developing and validating models built with machine learning algorithms, mainly for lymph node metastasis prediction, distant metastasis prediction, and prognosis-related research. Compared with traditional models, machine learning algorithms are more adaptable, but they are also prone to overfitting and poor interpretability, and these trade-offs need to be weighed in practical applications. Machine learning algorithms, as the foundation of artificial intelligence, have only recently begun to emerge in clinical cancer research. Oncology is expected to move toward a more precise era of cancer research, characterized by larger datasets, higher-dimensional data, and more frequent information exchange, in which machine learning is likely to play a prominent role.