
SnakeLines: integrated set of computational pipelines for sequencing reads.

Affiliations

Geneton Ltd., 841 04 Bratislava, Slovakia.

Slovak Centre of Scientific and Technical Information, 811 04 Bratislava, Slovakia.

Publication information

J Integr Bioinform. 2023 Aug 21;20(3). doi: 10.1515/jib-2022-0059. eCollection 2023 Sep 1.

DOI:10.1515/jib-2022-0059
PMID:37602733
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC10757078/
Abstract

With the rapid growth of massively parallel sequencing technologies, ever more laboratories are using sequenced DNA fragments for genomic analyses. Interpretation of sequencing data, however, depends strongly on bioinformatics processing, which is often too demanding for clinicians and researchers without a computational background. Another problem is the reproducibility of computational analyses across separate computational centres with inconsistent versions of installed libraries and bioinformatics tools. We propose an easily extensible set of computational pipelines, called SnakeLines, for processing sequencing reads, including mapping, assembly, variant calling, viral identification, transcriptomics, and metagenomics analysis. Individual steps of an analysis, along with methods and their parameters, can be readily modified in a single configuration file. The provided pipelines are embedded in virtual environments that ensure isolation of required resources from the host operating system, rapid deployment, and reproducibility of analysis across different Unix-based platforms. SnakeLines is a powerful framework for automating bioinformatics analyses, with emphasis on simple set-up, modification, extensibility, and reproducibility. The framework is already routinely used in various research projects and their applications, notably in the Slovak national surveillance of SARS-CoV-2.
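
The idea of selecting analysis steps, methods, and parameters from a single configuration can be illustrated with a minimal Python sketch. This is not SnakeLines code and does not reflect its actual configuration schema: the step names, the `REGISTRY` mapping, the `CONFIG` layout, and the `run_pipeline` function are all hypothetical, chosen only to show how one configuration object can drive which methods run and with which parameters.

```python
from typing import Any, Callable, Dict, List

# Hypothetical step signature: takes reads and parameters, returns reads.
StepFunc = Callable[[List[str], Dict[str, Any]], List[str]]


def trim_by_length(reads: List[str], params: Dict[str, Any]) -> List[str]:
    """Keep only reads with at least `min_length` bases."""
    return [r for r in reads if len(r) >= params["min_length"]]


def deduplicate_exact(reads: List[str], params: Dict[str, Any]) -> List[str]:
    """Drop exact duplicate reads, preserving first-seen order."""
    return list(dict.fromkeys(reads))


# Hypothetical registry mapping (step, method) pairs to implementations.
REGISTRY: Dict[tuple, StepFunc] = {
    ("trim", "by_length"): trim_by_length,
    ("deduplicate", "exact"): deduplicate_exact,
}

# A single configuration object: editing it changes the steps, the method
# used for each step, and the parameters, without touching pipeline code.
CONFIG: Dict[str, Any] = {
    "steps": [
        {"step": "trim", "method": "by_length", "params": {"min_length": 4}},
        {"step": "deduplicate", "method": "exact", "params": {}},
    ]
}


def run_pipeline(reads: List[str], config: Dict[str, Any]) -> List[str]:
    """Apply the configured steps to the reads, in the order listed."""
    for entry in config["steps"]:
        func = REGISTRY[(entry["step"], entry["method"])]
        reads = func(reads, entry.get("params", {}))
    return reads
```

Under these assumptions, calling `run_pipeline(["ACGT", "AC", "ACGT", "GGCCA"], CONFIG)` first drops the short read and then the duplicate. Swapping a method or changing `min_length` only requires editing `CONFIG`.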

Cited by

1. Evaluation and limitations of different approaches among COVID-19 fatal cases using whole-exome sequencing data.
BMC Genomics. 2023 Jan 10;24(1):12. doi: 10.1186/s12864-022-09084-5.
2. Systematic Genomic Surveillance of SARS-CoV-2 Virus on Illumina Sequencing Platforms in the Slovak Republic-One Year Experience.
Viruses. 2022 Nov 2;14(11):2432. doi: 10.3390/v14112432.
