Kern Fabian, Fehlmann Tobias, Keller Andreas
Chair for Clinical Bioinformatics, Saarland University, Saarbrücken 66123, Germany.
Center for Bioinformatics, Saarland Informatics Campus, Saarbrücken 66123, Germany.
Nucleic Acids Res. 2020 Dec 16;48(22):12523-12533. doi: 10.1093/nar/gkaa1125.
Web services are used across all disciplines in the life sciences, and the online landscape is growing by hundreds of novel servers annually. However, availability varies, and maintenance practices are largely inconsistent. We screened the availability of 2396 web tools published during the past 10 years. All servers were accessed over 133 days and 318 668 index files were stored in a local database. The share of accessible tools increases almost linearly with publication year, with the highest availability for tools published in 2019 and 2020 (∼90%) and the lowest for tools published in 2010 (∼50%). Over the 133-day test frame, 31% of tools were always working, 48.4% occasionally and 20.6% never. Consecutive downtimes were typically below 5 days, with a median of 1 day, and were unevenly distributed across the days of the week. A rescue experiment on 47 tools that were published from 2019 onwards but were never accessible showed that 51.1% of the tools could be restored in due time. We found a positive association between the number of citations and the probability of a web server being reachable. We then determined common challenges and formulated categorical recommendations for researchers planning to develop web-based resources. As an implication of our study, we propose to develop a repository for automatic API testing and sustainability indexing.
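The availability categories and downtime statistics reported above can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes one boolean reachability check per tool per day, and the names `classify`, `downtime_runs`, and `summarize` are hypothetical. Restricting the downtime statistic to occasionally available tools is also an assumption made here for illustration.

```python
from itertools import groupby
from statistics import median


def classify(checks):
    """Label a tool from its daily checks (True = reachable that day)."""
    if not checks:
        raise ValueError("at least one check is required")
    if all(checks):
        return "always"
    if not any(checks):
        return "never"
    return "occasionally"


def downtime_runs(checks):
    """Return the lengths of consecutive runs of failed checks, in days."""
    return [sum(1 for _ in run) for up, run in groupby(checks) if not up]


def summarize(log):
    """Summarize a mapping of tool name to its list of daily checks.

    Assumption: only occasionally available tools contribute to the
    downtime statistic, since a never-reachable tool is one single run
    spanning the whole test frame.
    """
    labels = {tool: classify(checks) for tool, checks in log.items()}
    shares = {
        label: 100 * sum(1 for value in labels.values() if value == label) / len(labels)
        for label in ("always", "occasionally", "never")
    }
    runs = [
        length
        for tool, checks in log.items()
        if labels[tool] == "occasionally"
        for length in downtime_runs(checks)
    ]
    return {
        "labels": labels,
        "shares_percent": shares,
        "median_downtime_days": median(runs) if runs else None,
    }
```

Calling `summarize` on a dictionary of daily check results returns the per-tool label, the percentage of tools in each category, and the median length of consecutive downtimes in days.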