Tissue Image Analytics Centre, University of Warwick, Coventry, UK.
Pathology, University of Nottingham, Nottingham, UK.
J Pathol Clin Res. 2022 Mar;8(2):116-128. doi: 10.1002/cjp2.256. Epub 2022 Jan 10.
Recent advances in whole-slide imaging (WSI) technology have led to the development of a myriad of computer vision and artificial intelligence-based diagnostic, prognostic, and predictive algorithms. Computational Pathology (CPath) offers an integrated solution to utilise information embedded in pathology WSIs beyond what can be obtained through visual assessment. For automated analysis of WSIs and validation of machine learning (ML) models, annotations at the slide, tissue, and cellular levels are required. Annotating the key visual constructs in pathology images is therefore an important component of CPath projects. Improper annotations can result in algorithms that are hard to interpret and can potentially produce inaccurate and inconsistent results. Despite the crucial role of annotations in CPath projects, there are no well-defined guidelines or best practices on how annotations should be carried out. In this paper, we address this shortcoming by presenting the experience and best practices acquired during the execution of a large-scale annotation exercise involving a multidisciplinary team of pathologists, ML experts, and researchers as part of the Pathology image data Lake for Analytics, Knowledge and Education (PathLAKE) consortium. We present a real-world case study along with examples of different types of annotations, a diagnostic algorithm, an annotation data dictionary, and annotation constructs. The analyses reported in this work highlight best practice recommendations that can be used as annotation guidelines over the lifecycle of a CPath project.