Sukhadia Shrey S, Muller Kristen E, Workman Adrienne A, Nagaraj Shivashankar H
Centre for Genomics and Personalised Health, Queensland University of Technology, Brisbane, QLD 4059, Australia.
Department of Pathology and Laboratory Medicine, Dartmouth Hitchcock Medical Center, Lebanon, NH 03766, USA.
Cancers (Basel). 2023 Aug 3;15(15):3960. doi: 10.3390/cancers15153960.
Breast cancer is the most common type of cancer worldwide. Alarmingly, approximately 30% of breast cancer cases result in disease recurrence at distant organs after treatment. Distant recurrence is more common in some subtypes such as invasive breast carcinoma (IBC). While clinicians have utilized several clinicopathological measurements to predict distant recurrences in IBC, no studies have predicted distant recurrences by combining clinicopathological evaluations of IBC tumors pre- and post-therapy with machine learning (ML) models. The goal of our study was to determine whether classification-based ML techniques could predict distant recurrences in IBC patients using key clinicopathological measurements, including pathological staging of the tumor and surrounding lymph nodes assessed both pre- and post-neoadjuvant therapy, response to therapy via standard-of-care (SOC) imaging, and binary status of adjuvant therapy administered to patients. We trained and tested four clinicopathological ML models using a dataset (144 and 17 patients for training and testing, respectively) from Duke University and validated the best-performing model using an external dataset (8 patients) from Dartmouth Hitchcock Medical Center. The random forest model performed better than the C-support vector classifier, multilayer perceptron, and logistic regression models, yielding AUC values of 1.0 in the testing set and 0.75 in the validation set (p < 0.002) across both institutions, thereby demonstrating the cross-institutional portability and validity of ML models in the field of clinical research in cancer. The top-ranking clinicopathological measurements impacting the prediction of distant recurrences in IBC were identified to be tumor response to neoadjuvant therapy as evaluated via SOC imaging and pathology, which included tumor as well as node staging.
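The AUC values reported above summarize how well a model's predicted risk scores rank patients with distant recurrence above those without. As a minimal sketch of how such a value can be computed, the pure-Python function below uses the pairwise-ranking definition of AUC. The function name `roc_auc` and the eight labels and scores are hypothetical illustrations, not the study's code or data; the split between recurrence and non-recurrence cases is likewise an assumption.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs in which the
    positive case gets the higher score; tied pairs count as half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy cohort of 8 patients (NOT the study's data):
# 1 = distant recurrence, 0 = no distant recurrence.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
# Hypothetical predicted recurrence probabilities from some classifier.
scores = [0.9, 0.8, 0.4, 0.3, 0.7, 0.6, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(auc)  # 0.75 for this toy data: 12 of 16 pairs are ranked correctly
```

With a cohort this small, the AUC can only take a limited set of values (multiples of 1 divided by the number of positive-negative pairs), which is worth keeping in mind when reading an AUC computed on eight patients.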