Du Xinsong, Dastmalchi Farhad, Ye Hao, Garrett Timothy J, Diller Matthew A, Liu Mei, Hogan William R, Brochhausen Mathias, Lemas Dominick J
Department of Health Outcomes and Biomedical Informatics, University of Florida College of Medicine, Gainesville, FL, USA.
Health Science Center Libraries, University of Florida, Florida, USA.
Metabolomics. 2023 Feb 6;19(2):11. doi: 10.1007/s11306-023-01974-3.
Liquid chromatography-high resolution mass spectrometry (LC-HRMS) is a popular approach for metabolomics data acquisition and requires many data processing software tools. The FAIR Principles (Findability, Accessibility, Interoperability, and Reusability) were proposed to promote open science and reusable data management, and to maximize the benefit obtained from contemporary and formal scholarly digital publishing. More recently, the FAIR Principles were extended to cover research software (FAIR4RS).
This study facilitates open science in metabolomics by providing an implementation solution for adopting FAIR4RS in LC-HRMS metabolomics data processing software. We believe our evaluation guidelines and results can help improve the FAIRness of research software.
We evaluated 124 LC-HRMS metabolomics data processing software tools obtained from a systematic review and selected 61 of them for detailed evaluation using FAIR4RS-related criteria, which were extracted from the literature and refined through internal discussions. We assigned each criterion one or more FAIR4RS categories through discussion. The minimum, median, and maximum percentages of criteria fulfilled per software tool were 21.6%, 47.7%, and 71.8%, respectively. Statistical analysis revealed no significant improvement in FAIRness over time. We identified four criteria that cover multiple FAIR4RS categories but had low fulfillment rates: (1) no software had semantic annotation of key information; (2) only 6.3% of evaluated software were registered with Zenodo and had received DOIs; (3) only 14.5% of selected software had an official software container or virtual machine; (4) only 16.7% of evaluated software had fully documented functions in code. Based on these results, we discuss improvement strategies and future directions.
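As a rough illustration of how a per-tool percentage of criteria fulfillment and its minimum, median, and maximum might be computed, the following minimal sketch uses hypothetical toy data. The tool names, the criterion values, and the assumption that every criterion is applicable and equally weighted are illustrative only and are not taken from the study.

```python
from statistics import median

# Hypothetical toy data: tool name -> one boolean per criterion (True = fulfilled).
# These names and values are illustrative only and are not taken from the study.
evaluations = {
    "tool_a": [True, False, False, True],
    "tool_b": [True, True, True, False],
    "tool_c": [False, False, True, False],
}


def percent_fulfilled(flags):
    """Return the percentage of criteria marked as fulfilled.

    Assumes every criterion is applicable and equally weighted.
    """
    return 100.0 * sum(flags) / len(flags)


# Percentage of criteria fulfilled for each tool.
scores = {name: percent_fulfilled(flags) for name, flags in evaluations.items()}

# Summary statistics across tools.
summary = {
    "min": min(scores.values()),
    "median": median(scores.values()),
    "max": max(scores.values()),
}
print(summary)
```

A real evaluation would likely need to handle criteria that do not apply to a given tool, which this sketch does not attempt.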