Department of Molecular Oncology, British Columbia Cancer Agency, 675 West 10th Ave, V5Z 1L3 Vancouver, BC, Canada.
Department of Pathology and Laboratory Medicine, University of British Columbia, 2211 Wesbrook Mall, V6T 2B5 Vancouver, BC, Canada.
Gigascience. 2017 Jul 1;6(7):1-10. doi: 10.1093/gigascience/gix042.
The field of next-generation sequencing informatics has matured to a point where algorithmic advances in sequence alignment and individual feature detection methods have stabilized. Practical and robust implementation of complex analytical workflows (where such tools are structured into "best practices" for automated analysis of next-generation sequencing datasets) still requires significant programming investment and expertise.
We present Kronos, a software platform for facilitating the development and execution of modular, auditable, and distributable bioinformatics workflows. Kronos obviates the need for explicit coding of workflows by compiling a text configuration file into executable Python applications; developing the individual analysis modules still requires programming. The framework of each workflow includes a run manager that executes the encoded workflow locally, on a cluster, or in the cloud, parallelizes tasks, and logs all runtime events. The resulting workflows are highly modular and configurable by construction, which makes it possible to build flexible and extensible meta-applications that can be modified simply by editing the configuration file. The workflows are fully encoded for ease of distribution and can be instantiated on external systems, a step toward reproducible research and comparative analyses. We also introduce a framework for building Kronos components that function as shareable, modular nodes in Kronos workflows.
The Kronos platform provides a standard framework for developers to implement custom tools, reuse existing tools, and contribute to the community at large. Kronos is shipped with both Docker and Amazon Web Services Machine Images. It is free, open source, and available through the Python Package Index and at https://github.com/jtaghiyar/kronos.
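The general idea of a configuration-driven workflow with a run manager can be illustrated with a minimal sketch. This is not Kronos's configuration format or API: the `CONFIG` layout, the `COMPONENTS` registry, the task names, and `run_workflow` are all hypothetical, chosen only to show how a declarative description of tasks and dependencies could be turned into an ordered, partly parallel, logged execution. It assumes Python 3.9 or later for `graphlib`.

```python
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter

# Hypothetical components: each takes upstream results and parameters.
# These stand in for real analysis modules and only build strings.
def align(inputs, params):
    return f"aligned({params['sample']})"

def call_variants(inputs, params):
    return f"variants({inputs['align']})"

def qc(inputs, params):
    return f"qc({inputs['align']})"

def report(inputs, params):
    return f"report({inputs['call_variants']}, {inputs['qc']})"

COMPONENTS = {"align": align, "call_variants": call_variants,
              "qc": qc, "report": report}

# Hypothetical workflow description, as it might look once a text
# configuration file has been parsed into a dictionary.
CONFIG = {
    "align": {"component": "align", "depends_on": [],
              "params": {"sample": "S1"}},
    "call_variants": {"component": "call_variants", "depends_on": ["align"]},
    "qc": {"component": "qc", "depends_on": ["align"]},
    "report": {"component": "report", "depends_on": ["call_variants", "qc"]},
}

def run_workflow(config, components, max_workers=4):
    """Run tasks in dependency order, running independent tasks in parallel."""
    # Map each task to the set of tasks it depends on.
    graph = {task: set(spec["depends_on"]) for task, spec in config.items()}
    sorter = TopologicalSorter(graph)
    sorter.prepare()  # raises graphlib.CycleError if dependencies are cyclic

    results, log = {}, []
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            # All tasks whose dependencies are finished can run concurrently.
            futures = {}
            for task in sorter.get_ready():
                spec = config[task]
                inputs = {dep: results[dep] for dep in spec["depends_on"]}
                func = components[spec["component"]]
                futures[task] = pool.submit(func, inputs, spec.get("params", {}))
            # Collect results, record completion, and unlock downstream tasks.
            for task, future in futures.items():
                results[task] = future.result()
                log.append(task)
                sorter.done(task)
    return results, log

if __name__ == "__main__":
    results, log = run_workflow(CONFIG, COMPONENTS)
    print(results["report"])
    print(log)
```

In this sketch, changing the workflow (for example, dropping the `qc` step or pointing a task at a different component) only requires editing `CONFIG`, which is the property the abstract attributes to configuration-based workflows.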