dtoolAI: Reproducibility for Deep Learning.

Author information

Hartley Matthew, Olsson Tjelvar S G

Affiliation

Computational Systems Biology, John Innes Centre, Norwich, Norfolk NR4 7UH, UK.

Publication information

Patterns (N Y). 2020 Jul 23;1(5):100073. doi: 10.1016/j.patter.2020.100073. eCollection 2020 Aug 14.

Abstract

Deep learning, a set of approaches using artificial neural networks, has generated rapid recent advancements in machine learning. Deep learning does, however, have the potential to reduce the reproducibility of scientific results. Model outputs are critically dependent on the data and processing approach used to initially generate the model, but this provenance information is usually lost during model training. To avoid a future reproducibility crisis, we need to improve our deep-learning model management. The FAIR principles for data stewardship and software/workflow implementation give excellent high-level guidance on ensuring effective reuse of data and software. We suggest some specific guidelines for the generation and use of deep-learning models in science and explain how these relate to the FAIR principles. We then present dtoolAI, a Python package that we have developed to implement these guidelines. The package implements automatic capture of provenance information during model training and simplifies model distribution.

