A global review of publicly available datasets for ophthalmological imaging: barriers to access, usability, and generalisability.

Author affiliations

Academic Unit of Ophthalmology, Institute of Inflammation & Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK.

Academic Unit of Ophthalmology, Institute of Inflammation & Ageing, College of Medical and Dental Sciences, University of Birmingham, Birmingham, UK; Ophthalmology Department, University Hospitals Birmingham NHS Foundation Trust, Birmingham, UK; Moorfields Eye Hospital NHS Foundation Trust, London, UK; Health Data Research UK, London, UK; Centre for Regulatory Science and Innovation, Birmingham Health Partners, Birmingham, UK.

Publication information

Lancet Digit Health. 2021 Jan;3(1):e51-e66. doi: 10.1016/S2589-7500(20)30240-5. Epub 2020 Oct 1.

Abstract

Health data that are publicly available are valuable resources for digital health research. Several public datasets containing ophthalmological imaging have been frequently used in machine learning research; however, the total number of datasets containing ophthalmological health information and their respective content is unclear. This Review aimed to identify all publicly available ophthalmological imaging datasets, detail their accessibility, describe which diseases and populations are represented, and report on the completeness of the associated metadata. With the use of MEDLINE, Google's search engine, and Google Dataset Search, we identified 94 open access datasets containing 507 724 images and 125 videos from 122 364 patients. Most datasets originated from Asia, North America, and Europe. Disease populations were unevenly represented, with glaucoma, diabetic retinopathy, and age-related macular degeneration disproportionately overrepresented in comparison with other eye diseases. The reporting of basic demographic characteristics such as age, sex, and ethnicity was poor, even at the aggregate level. This Review provides greater visibility for ophthalmological datasets that are publicly available as powerful resources for research. Our paper also exposes an increasing divide in the representation of different population and disease groups in health data repositories. The improved reporting of metadata would enable researchers to access the most appropriate datasets for their needs and maximise the potential of such resources.
