Department of Gastroenterology and Hepatology, Amsterdam Gastroenterology Endocrinology Metabolism, Amsterdam University Medical Centres, location Academic Medical Center, University of Amsterdam, Amsterdam, the Netherlands.
Department of Gastroenterology and Hepatology, Radboud University Nijmegen Medical Center, Radboud University of Nijmegen, Nijmegen, the Netherlands.
Gastrointest Endosc. 2023 Feb;97(2):184-199.e16. doi: 10.1016/j.gie.2022.08.043. Epub 2022 Sep 7.
Publicly available databases containing colonoscopic imaging data are valuable resources for artificial intelligence (AI) research. Currently, little is known about the number and content of these databases. This review aimed to describe the availability, accessibility, and usability of publicly available colonoscopic imaging databases, focusing on polyp detection, polyp characterization, and quality of colonoscopy.
First, a systematic literature search was performed in MEDLINE and Embase to identify AI studies, published after 2010, that described publicly available colonoscopic imaging databases. Second, a targeted search using Google's Dataset Search, Google Search, GitHub, and Figshare was done to identify databases directly. Databases were included if they contained data about polyp detection, polyp characterization, or quality of colonoscopy. To assess the accessibility of databases, the following categories were defined: open access, open access with barriers, and regulated access. To assess the potential usability of the included databases, essential details of each database were extracted using a checklist derived from the Checklist for Artificial Intelligence in Medical Imaging.
We identified 22 databases with open access, 3 databases with open access with barriers, and 15 databases with regulated access. The 22 open access databases contained 19,463 images and 952 videos. Nineteen of these databases focused on polyp detection, localization, and/or segmentation; 6 on polyp characterization; and 3 on quality of colonoscopy. Only half of these databases have been used by other researchers to develop, train, or benchmark their AI systems. Although technical details were generally well reported, important details such as polyp and patient demographics and the annotation process were under-reported in almost all databases.
This review provides greater insight into the public availability of colonoscopic imaging databases for AI research. Incomplete reporting of important details limits the ability of researchers to assess the usability of current databases.