
jackalope: A swift, versatile phylogenomic and high-throughput sequencing simulator.

Affiliation

Department of Integrative Biology, University of Wisconsin, Madison, WI, USA.

Publication information

Mol Ecol Resour. 2020 Jul;20(4):1132-1140. doi: 10.1111/1755-0998.13173. Epub 2020 May 20.

DOI:10.1111/1755-0998.13173
PMID:32320523
Abstract

High-throughput sequencing (HTS) is central to the study of population genomics and has an increasingly important role in constructing phylogenies. Choices in research design for sequencing projects can include a wide range of factors, such as sequencing platform, depth of coverage and bioinformatic tools. Simulating HTS data better informs these decisions, as users can validate software by comparing output to the known simulation parameters. However, current standalone HTS simulators cannot generate variant haplotypes under even somewhat complex evolutionary scenarios, such as recombination or demographic change. This greatly reduces their usefulness for fields such as population genomics and phylogenomics. Here I present the R package jackalope that simply and efficiently simulates (i) sets of variant haplotypes from a reference genome and (ii) reads from both Illumina and Pacific Biosciences platforms. Haplotypes can be simulated using phylogenies, gene trees, coalescent-simulation output, population-genomic summary statistics, and Variant Call Format (VCF) files. jackalope can simulate single, paired-end or mate-pair Illumina reads, as well as reads from Pacific Biosciences. These simulations include sequencing errors, mapping qualities, multiplexing and optical/PCR duplicates. It can read reference genomes from fasta files and can simulate new ones, and all outputs can be written to standard file formats. jackalope is available for Mac, Windows and Linux systems.
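
The validation idea in the abstract (simulate data with known parameters, then check whether an analysis recovers them) can be illustrated with a toy sketch. This is not jackalope's API: jackalope is an R package, and every function name below (`simulate_reference`, `substitute`, `simulate_reads`, `estimate_error_rate`) is hypothetical. The sketch assumes a uniform-base reference, substitution-only variants and sequencing errors, and single-end reads of fixed length, which is far simpler than the models the abstract describes.

```python
import random

BASES = "ACGT"


def simulate_reference(length, rng):
    """Draw a random reference sequence with equal base frequencies."""
    return "".join(rng.choice(BASES) for _ in range(length))


def substitute(seq, rate, rng):
    """Apply substitutions independently per site; return (new_seq, changed_positions)."""
    out = list(seq)
    changed = []
    for i, base in enumerate(seq):
        if rng.random() < rate:
            # Always pick a different base so every recorded change is a real mismatch.
            out[i] = rng.choice([b for b in BASES if b != base])
            changed.append(i)
    return "".join(out), changed


def simulate_reads(haplotype, n_reads, read_len, error_rate, rng):
    """Sample single-end reads uniformly along the haplotype and add substitution errors."""
    reads = []
    for _ in range(n_reads):
        start = rng.randrange(len(haplotype) - read_len + 1)
        read, _ = substitute(haplotype[start:start + read_len], error_rate, rng)
        reads.append((start, read))
    return reads


def estimate_error_rate(haplotype, reads):
    """Fraction of read bases that differ from the haplotype at the known start position."""
    total = sum(len(read) for _, read in reads)
    if total == 0:
        return 0.0
    mismatches = sum(
        a != b
        for start, read in reads
        for a, b in zip(haplotype[start:start + len(read)], read)
    )
    return mismatches / total


# Toy run with a fixed seed: the estimate can be compared with the known input rate.
rng = random.Random(42)
reference = simulate_reference(1000, rng)
haplotype, variant_sites = substitute(reference, 0.01, rng)
reads = simulate_reads(haplotype, n_reads=200, read_len=100, error_rate=0.02, rng=rng)
print(f"{len(variant_sites)} variant sites; estimated error rate: "
      f"{estimate_error_rate(haplotype, reads):.4f} (simulated with 0.02)")
```

Because the true haplotype, variant sites and error rate are all known, any downstream tool (here just `estimate_error_rate`) can be scored against them, which is the use case the abstract motivates.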

Similar articles

1. jackalope: A swift, versatile phylogenomic and high-throughput sequencing simulator.
Mol Ecol Resour. 2020 Jul;20(4):1132-1140. doi: 10.1111/1755-0998.13173. Epub 2020 May 20.
2. SVEngine: an efficient and versatile simulator of genome structural variations with features of cancer clonal evolution.
Gigascience. 2018 Jul 1;7(7). doi: 10.1093/gigascience/giy081.
3. pIRS: Profile-based Illumina pair-end reads simulator.
Bioinformatics. 2012 Jun 1;28(11):1533-5. doi: 10.1093/bioinformatics/bts187. Epub 2012 Apr 15.
4. SInC: an accurate and fast error-model based simulator for SNPs, Indels and CNVs coupled with a read generator for short-read sequence data.
BMC Bioinformatics. 2014 Feb 5;15:40. doi: 10.1186/1471-2105-15-40.
5. QuorUM: An Error Corrector for Illumina Reads.
PLoS One. 2015 Jun 17;10(6):e0130821. doi: 10.1371/journal.pone.0130821. eCollection 2015.
6. NanoSim: nanopore sequence read simulator based on statistical characterization.
Gigascience. 2017 Apr 1;6(4):1-6. doi: 10.1093/gigascience/gix010.
7. Comparison of mapping algorithms used in high-throughput sequencing: application to Ion Torrent data.
BMC Genomics. 2014 Apr 5;15:264. doi: 10.1186/1471-2164-15-264.
8. Blue: correcting sequencing errors using consensus and context.
Bioinformatics. 2014 Oct;30(19):2723-32. doi: 10.1093/bioinformatics/btu368. Epub 2014 Jun 11.
9. NGSphy: phylogenomic simulation of next-generation sequencing data.
Bioinformatics. 2018 Jul 15;34(14):2506-2507. doi: 10.1093/bioinformatics/bty146.
10. Simulating Next-Generation Sequencing Datasets from Empirical Mutation and Sequencing Models.
PLoS One. 2016 Nov 28;11(11):e0167047. doi: 10.1371/journal.pone.0167047. eCollection 2016.

Cited by

1. J-SPACE: a Julia package for the simulation of spatial models of cancer evolution and of sequencing experiments.
BMC Bioinformatics. 2022 Jul 8;23(1):269. doi: 10.1186/s12859-022-04779-8.
2. Benchmarking the topological accuracy of bacterial phylogenomic workflows using in silico evolution.
Microb Genom. 2022 Mar;8(3). doi: 10.1099/mgen.0.000799.