Zulueta-Coarasa Teresa, Jug Florian, Mathur Aastha, Moore Josh, Muñoz-Barrutia Arrate, Anita Liviu, Babalola Kolawole, Bankhead Peter, Gilloteaux Perrine, Gogoberidze Nodar, Jones Martin L, Kleywegt Gerard J, Korir Paul, Kreshuk Anna, Küpcü Yoldaş Aybüke, Marconato Luca, Narayan Kedar, Norlin Nils, Oezdemir Bugra, Riesterer Jessica L, Russell Craig, Rzepka Norman, Sarkans Ugis, Serrano-Solano Beatriz, Tischer Christian, Uhlmann Virginie, Ulman Vladimír, Hartley Matthew
European Molecular Biology Laboratory, European Bioinformatics Institute, Hinxton, UK.
Fondazione Human Technopole, V.le Rita Levi-Montalcini, Milan, Italy.
Nat Methods. 2025 Sep 15. doi: 10.1038/s41592-025-02835-8.
Artificial intelligence (AI) methods are powerful tools for biological image analysis and processing. High-quality annotated images are key to training and developing new algorithms, but access to such data is often hindered by the lack of standards for sharing datasets. We discuss the barriers to sharing annotated image datasets and suggest specific guidelines to improve the reuse of bioimages and annotations for AI applications. These include standards on data formats, metadata, data presentation and sharing, and incentives to generate new datasets. We anticipate that the Metadata, Incentives, Formats and Accessibility (MIFA) recommendations will accelerate the development of AI tools for bioimage analysis by facilitating access to high-quality training and benchmarking data.