Kronos: a workflow assembler for genome analytics and informatics.

Affiliations

Department of Molecular Oncology, British Columbia Cancer Agency, 675 West 10th Ave, V5Z 1L3 Vancouver, BC, Canada.

Department of Pathology and Laboratory Medicine, University of British Columbia, 2211 Wesbrook Mall, V6T 2B5 Vancouver, BC, Canada.

Publication information

Gigascience. 2017 Jul 1;6(7):1-10. doi: 10.1093/gigascience/gix042.

DOI:10.1093/gigascience/gix042
PMID:28655203
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC5569921/
Abstract

BACKGROUND

The field of next-generation sequencing informatics has matured to a point where algorithmic advances in sequence alignment and individual feature detection methods have stabilized. Practical and robust implementation of complex analytical workflows (where such tools are structured into "best practices" for automated analysis of next-generation sequencing datasets) still requires significant programming investment and expertise.

RESULTS

We present Kronos, a software platform for facilitating the development and execution of modular, auditable, and distributable bioinformatics workflows. Kronos obviates the need for explicit coding of workflows by compiling a text configuration file into executable Python applications. Making analysis modules would still require programming. The framework of each workflow includes a run manager to execute the encoded workflows locally (or on a cluster or cloud), parallelize tasks, and log all runtime events. The resulting workflows are highly modular and configurable by construction, facilitating flexible and extensible meta-applications that can be modified easily through configuration file editing. The workflows are fully encoded for ease of distribution and can be instantiated on external systems, a step toward reproducible research and comparative analyses. We introduce a framework for building Kronos components that function as shareable, modular nodes in Kronos workflows.
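The idea of turning a declarative workflow description into something a run manager can execute, parallelize, and log can be illustrated with a toy sketch. This is not Kronos code and does not use the Kronos API or its configuration format: the `WORKFLOW` dictionary, the task names, and the `run_workflow` function are all hypothetical, and the "tasks" are placeholder functions rather than real analysis tools. The sketch only shows the general pattern of resolving task dependencies, running independent tasks concurrently, and logging runtime events.

```python
import logging
from concurrent.futures import ThreadPoolExecutor
from graphlib import TopologicalSorter  # standard library, Python 3.9+

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(message)s")
log = logging.getLogger("toy_run_manager")

# Hypothetical workflow description. A real system would parse this from a
# text configuration file; here it is an in-memory dict for brevity.
# "after" lists the tasks that must finish first; "run" receives a dict of
# the results produced so far and returns this task's result.
WORKFLOW = {
    "align": {"after": [], "run": lambda r: "reads.bam"},
    "call": {"after": ["align"], "run": lambda r: f"variants from {r['align']}"},
    "qc": {"after": ["align"], "run": lambda r: f"qc of {r['align']}"},
    "report": {
        "after": ["call", "qc"],
        "run": lambda r: f"report({r['call']}; {r['qc']})",
    },
}


def run_workflow(workflow, max_workers=2):
    """Run tasks in dependency order; tasks in the same batch run concurrently."""
    sorter = TopologicalSorter({name: spec["after"] for name, spec in workflow.items()})
    sorter.prepare()  # raises graphlib.CycleError if the dependencies are cyclic
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        while sorter.is_active():
            ready = list(sorter.get_ready())
            log.info("starting batch: %s", sorted(ready))
            snapshot = dict(results)  # each task sees only completed upstream results
            futures = {n: pool.submit(workflow[n]["run"], snapshot) for n in ready}
            for name, future in futures.items():
                results[name] = future.result()
                sorter.done(name)
                log.info("finished %s", name)
    return results


if __name__ == "__main__":
    print(run_workflow(WORKFLOW)["report"])
```

In this sketch, `call` and `qc` depend only on `align`, so they land in the same batch and are submitted to the thread pool together, while `report` waits for both. Editing the dictionary, for example adding a task or changing an `after` list, changes the executed workflow without touching `run_workflow`, which mirrors in miniature the configuration-driven design the abstract describes.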

CONCLUSIONS

The Kronos platform provides a standard framework for developers to implement custom tools, reuse existing tools, and contribute to the community at large. Kronos is shipped with both Docker and Amazon Web Services Machine Images. It is free, open source, and available through the Python Package Index and at https://github.com/jtaghiyar/kronos.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b71/5569921/1f70016c824e/gix042fig1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b71/5569921/d8458b41b63f/gix042fig2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b71/5569921/38b410e380fe/gix042fig3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b71/5569921/542b48288bff/gix042fig4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/6b71/5569921/98c9b34e8456/gix042fig5.jpg

