Bae Kideog, Jeon Young Seok, Hwangbo Yul, Yoo Chong Woo, Han Nayoung, Feng Mengling
Healthcare AI Team, Healthcare Platform Center, National Cancer Center, Goyang-si, Gyeonggi-do, Republic of Korea.
Institute of Data Science, National University of Singapore, Singapore, Singapore.
JMIR Cancer. 2023 Sep 5;9:e45547. doi: 10.2196/45547.
Breast cancer subtyping is a crucial step in determining therapeutic options, but molecular examination based on immunohistochemical staining is expensive and time-consuming. Deep learning makes it possible to predict subtypes from the morphological information in hematoxylin and eosin staining, a much cheaper and faster alternative. However, training a predictive model conventionally requires a large number of histology images, which is difficult for a single institution to collect.
We aimed to develop a data-efficient computational pathology platform, 3DHistoNet, which is capable of learning from z-stacked histology images to accurately predict breast cancer subtypes with a small sample size.
We retrospectively examined 401 cases of patients with primary breast carcinoma diagnosed between 2018 and 2020 at the Department of Pathology, National Cancer Center, South Korea. Pathology slides of the patients with breast carcinoma were prepared according to the standard protocols. Age, gender, histologic grade, hormone receptor (estrogen receptor [ER], progesterone receptor [PR], and androgen receptor [AR]) status, erb-B2 receptor tyrosine kinase 2 (HER2) status, and Ki-67 index were evaluated by reviewing medical charts and pathological records.
We evaluated the 3DHistoNet platform with 5-fold cross-validation, analyzing the area under the receiver operating characteristic curve (AUROC) and decision curves for predicting the ER, PR, AR, HER2, and Ki-67 subtype biomarkers. 3DHistoNet predicted all of these clinically important biomarkers with performance exceeding that of conventional multiple instance learning models by a considerable margin (AUROC: 0.75-0.91 vs 0.67-0.80). We further showed that our z-stack histology scanning method can compensate for insufficient training data sets at no additional cost. Finally, 3DHistoNet can generate attention maps that reveal correlations between Ki-67 and histomorphological features, presenting the hematoxylin and eosin image to the pathologist in higher fidelity.
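The evaluation metric described above can be illustrated with a minimal sketch. This is not the authors' code: it only shows how an area under the receiver operating characteristic curve can be computed for one binary biomarker (for example, ER-positive vs ER-negative) within each of 5 cross-validation folds. The function names and the `score_fn` hook, which stands in for training and applying any model, are hypothetical, and the simple shuffled split used here is an assumption rather than the study's splitting procedure.

```python
import random


def auroc(labels, scores):
    """AUROC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def k_fold_indices(n_cases, k=5, seed=0):
    """Shuffle case indices and split them into k disjoint test folds."""
    idx = list(range(n_cases))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cross_validated_auroc(labels, score_fn, k=5, seed=0):
    """Return one AUROC per fold.

    score_fn(train_idx, test_idx) is a hypothetical hook: it should fit a
    model on the training cases and return one score per test case.
    """
    results = []
    for test_idx in k_fold_indices(len(labels), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        scores = score_fn(train_idx, test_idx)
        results.append(auroc([labels[i] for i in test_idx], scores))
    return results
```

With 401 cases and 5 folds, `k_fold_indices(401)` yields one test fold of 81 cases and four of 80. A shuffled split like this does not guarantee that every fold contains both classes, so a stratified split may be preferable for rare biomarker states.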
Our stand-alone, data-efficient pathology platform, which can both generate z-stacked images and predict key biomarkers, is an appealing tool for breast cancer diagnosis. Its development would encourage morphology-based diagnosis, which is faster, cheaper, and less error-prone than protein quantification based on immunohistochemical staining.