Robustness and reproducibility for AI learning in biomedical sciences: RENOIR.

Affiliations

Nuffield Department of Surgical Sciences, Medical Sciences Division, University of Oxford, Old Road Campus Research Building, Roosevelt Drive, Oxford, OX3 7DQ, UK.

Computational Biology and Integrative Genomics Lab, Department of Oncology, Medical Sciences Division, University of Oxford, Oxford, OX3 7DQ, UK.

Publication information

Sci Rep. 2024 Jan 22;14(1):1933. doi: 10.1038/s41598-024-51381-4.

Abstract

Artificial intelligence (AI) techniques are increasingly applied across various domains, favoured by the growing acquisition and public availability of large, complex datasets. Despite this trend, AI publications often suffer from a lack of reproducibility and poor generalisation of findings, undermining scientific value and contributing to global research waste. To address these issues, and focusing on the learning aspect of the AI field, we present RENOIR (REpeated random sampliNg fOr machIne leaRning), a modular open-source platform for robust and reproducible machine learning (ML) analysis. RENOIR adopts standardised pipelines for model training and testing and introduces novel elements, such as assessing how the performance of the algorithm depends on the sample size. Additionally, RENOIR offers automated generation of transparent and usable reports, aiming to enhance the quality and reproducibility of AI studies. To demonstrate the versatility of our tool, we applied it to benchmark datasets from health, computer science, and STEM (Science, Technology, Engineering, and Mathematics) domains. Furthermore, we showcase RENOIR's successful application in recently published studies, where it identified classifiers for SETD2 and TP53 mutation status in cancer. Finally, we present a use case where RENOIR was employed to address a significant pharmacological challenge: predicting drug efficacy. RENOIR is freely available at https://github.com/alebarberis/renoir.
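The idea of studying how performance depends on sample size through repeated random sampling can be illustrated with a minimal sketch. This is not RENOIR's API or its actual pipeline: the synthetic data, the nearest-centroid classifier, and every function name below (`make_data`, `stratified_split`, `learning_curve`) are assumptions chosen only to keep the example self-contained. For each training-set size, the data are randomly re-split many times, a model is fitted on the training part and scored on the held-out part, and the mean and spread of the scores are reported.

```python
import numpy as np


def make_data(n, rng):
    """Synthetic two-class data in 5 dimensions (illustration only)."""
    y = np.arange(n) % 2                      # exactly balanced labels
    X = rng.normal(size=(n, 5)) + 0.8 * y[:, None]
    return X, y


def stratified_split(y, train_size, rng):
    """Randomly pick train_size // 2 samples per class; the rest are the test set."""
    train = np.concatenate([
        rng.permutation(np.flatnonzero(y == c))[: train_size // 2]
        for c in (0, 1)
    ])
    test = np.setdiff1d(np.arange(len(y)), train)
    return train, test


def fit_centroids(X, y):
    """Nearest-centroid classifier: one mean vector per class."""
    return np.stack([X[y == c].mean(axis=0) for c in (0, 1)])


def predict(centroids, X):
    """Assign each sample to the class with the closest centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.argmin(axis=1)


def learning_curve(X, y, train_sizes, n_repeats, rng):
    """Mean and standard deviation of test accuracy for each training size."""
    curve = {}
    for size in train_sizes:
        scores = []
        for _ in range(n_repeats):
            train, test = stratified_split(y, size, rng)
            model = fit_centroids(X[train], y[train])
            scores.append(np.mean(predict(model, X[test]) == y[test]))
        curve[size] = (float(np.mean(scores)), float(np.std(scores)))
    return curve


rng = np.random.default_rng(0)                # fixed seed for reproducibility
X, y = make_data(400, rng)
curve = learning_curve(X, y, train_sizes=(20, 50, 100, 200), n_repeats=50, rng=rng)

for size, (mean_acc, sd_acc) in curve.items():
    print(f"train size {size:3d}: accuracy {mean_acc:.3f} +/- {sd_acc:.3f}")
```

Reporting the spread across repeats alongside the mean is what makes the estimate informative: a single train/test split can look good or bad by chance, whereas many random splits show how stable the result is at each sample size.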
