Post Andrew R, Ho Nancy, Rasmussen Erik, Post Ivan, Cho Aika, Hofer John, Maness Arthur T, Parnell Timothy, Nix David A
Huntsman Cancer Institute, University of Utah, Salt Lake City, UT 84112, United States.
Department of Biomedical Informatics, University of Utah, Salt Lake City, UT 84112, United States.
JAMIA Open. 2023 Oct 17;6(4):ooad089. doi: 10.1093/jamiaopen/ooad089. eCollection 2023 Dec.
To develop and evaluate, using agile software development practices, an architecture and implementation for reliable, user-friendly self-service management of bioinformatic data stored in the cloud.
Comprehensive Oncology Research Environment (CORE) Browser is a new open-source web application for cancer researchers to manage sequencing data organized in a flexible format in Amazon Simple Storage Service (S3) buckets. It has a microservices- and hypermedia-based architecture, which we integrated with Test-Driven Development (TDD), the iterative writing of computable specifications for how software should work prior to development. Relying on repeating patterns found in hypermedia-based architectures, we hypothesized that hypermedia would permit developing test "templates" that can be parameterized and executed for each microservice, maximizing code coverage while minimizing effort.
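The idea of a parameterized test template can be sketched as follows. This is a minimal illustration and not the CORE Browser's actual test code: the `FakeService` class, the `data`/`links` response layout, the resource names, and the `make_test_case` helper are all assumptions made for the example. It shows how one set of checks, written once against the uniform shape of a hypermedia response, might be reused for several microservices.

```python
import unittest


class FakeService:
    """Hypothetical in-memory stand-in for one microservice (an assumption)."""

    def __init__(self, resource_name, items):
        self.resource_name = resource_name
        self._items = dict(items)

    def get(self, item_id):
        """Return (status, body); the body carries data plus hypermedia links."""
        if item_id not in self._items:
            return 404, None
        body = {
            "data": self._items[item_id],
            "links": [
                {"rel": "self", "href": f"/{self.resource_name}/{item_id}"},
                {"rel": "collection", "href": f"/{self.resource_name}"},
            ],
        }
        return 200, body


class GetTemplate:
    """Test template: concrete cases supply service, known_id, missing_id.

    It deliberately does not inherit from unittest.TestCase, so the test
    loader never collects the unparameterized template itself.
    """

    service = None
    known_id = None
    missing_id = None

    def test_known_item_returns_200(self):
        status, body = self.service.get(self.known_id)
        self.assertEqual(status, 200)
        self.assertIn("data", body)

    def test_known_item_has_self_link(self):
        _, body = self.service.get(self.known_id)
        rels = {link["rel"]: link["href"] for link in body["links"]}
        expected = f"/{self.service.resource_name}/{self.known_id}"
        self.assertEqual(rels.get("self"), expected)

    def test_missing_item_returns_404(self):
        status, body = self.service.get(self.missing_id)
        self.assertEqual(status, 404)
        self.assertIsNone(body)


def make_test_case(name, service, known_id, missing_id):
    """Bind the template to one service, producing a concrete TestCase class."""
    attrs = {"service": service, "known_id": known_id, "missing_id": missing_id}
    return type(name, (GetTemplate, unittest.TestCase), attrs)


# Two hypothetical microservices exercised by the same template.
BucketGetTests = make_test_case(
    "BucketGetTests", FakeService("buckets", {"b1": {"name": "run-01"}}), "b1", "nope"
)
FolderGetTests = make_test_case(
    "FolderGetTests", FakeService("folders", {"f1": {"name": "fastq"}}), "f1", "nope"
)


def run_all():
    """Run every parameterized case and return the unittest result object."""
    loader = unittest.TestLoader()
    suite = unittest.TestSuite()
    for case in (BucketGetTests, FolderGetTests):
        suite.addTests(loader.loadTestsFromTestCase(case))
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Under these assumptions, adding a third microservice costs one more `make_test_case` call rather than three new hand-written tests, which is the kind of effort saving the hypothesis describes.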
After one-and-a-half years of development, the CORE Browser backend had 121 test templates and 875 custom tests that were parameterized and executed 3031 times, providing 78% code coverage.
Architecting to permit test reuse through a hypermedia approach was a key success factor for our testing efforts. CORE Browser's application of hypermedia and TDD illustrates one way to integrate software engineering methods into data-intensive networked applications. Separating bioinformatic data management from analysis distinguishes this platform from others in bioinformatics and may provide stable data management while permitting analysis methods to advance more rapidly.
Software engineering practices are underutilized in informatics. Similar informatics projects are more likely to succeed when they apply sound architecture and automated testing. Our approach is broadly applicable to data management tools that rely on cloud data storage.