
SVSR: A Program to Simulate Structural Variations and Generate Sequencing Reads for Multiple Platforms.

Authors

Yuan Xiguo, Gao Meihong, Bai Jun, Duan Junbo

Publication information

IEEE/ACM Trans Comput Biol Bioinform. 2020 May-Jun;17(3):1082-1091. doi: 10.1109/TCBB.2018.2876527. Epub 2018 Oct 17.

DOI: 10.1109/TCBB.2018.2876527
PMID: 30334804

Abstract

Structural variation accounts for a major fraction of mutations in the human genome and confers susceptibility to complex diseases. Next-generation sequencing, along with the rapid development of computational methods, provides a cost-effective procedure to detect such variations. Simulation of structural variations and sequencing reads with realistic characteristics is essential for benchmarking the computational methods. Here, we develop a new program, SVSR, to simulate five types of structural variations (indels, tandem duplications, CNVs, inversions, and translocations) and SNPs for the human genome and to generate sequencing reads with features from popular platforms (Illumina, SOLiD, 454, and Ion Torrent). We adopt a selection model trained from real data to predict copy number states, starting from the first site of a particular genome to the end. Furthermore, we utilize references of microbial genomes to produce insertion fragments and design probabilistic models to imitate inversions and translocations. Moreover, we create platform-specific errors and base quality profiles to generate normal, tumor, or normal-tumor mixture reads. Experimental results show that SVSR can capture more realistic features and generate datasets with satisfactory quality scores. SVSR can be used to evaluate the performance of structural variation detection methods and to guide the development of new computational methods.
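The general idea described in the abstract (alter a reference sequence with structural variations, then sample error-bearing reads from normal and tumor genomes) can be illustrated with a toy sketch. This is not SVSR's implementation: the function names, the uniform substitution-error model, and the `purity` parameter are all assumptions made for illustration, and base quality profiles, copy number state prediction, and platform-specific error models are not modelled here.

```python
import random

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A", "N": "N"}


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))


def apply_deletion(seq, start, length):
    """Remove `length` bases beginning at 0-based position `start`."""
    return seq[:start] + seq[start + length:]


def apply_tandem_duplication(seq, start, length, copies=2):
    """Make seq[start:start+length] appear `copies` times in a row."""
    segment = seq[start:start + length]
    return seq[:start] + segment * copies + seq[start + length:]


def apply_inversion(seq, start, length):
    """Replace a segment with its reverse complement."""
    segment = seq[start:start + length]
    return seq[:start] + reverse_complement(segment) + seq[start + length:]


def sample_reads(genome, n_reads, read_len, error_rate, rng):
    """Draw reads at uniform positions and add substitution errors.

    Hypothetical error model: each base is substituted independently
    with probability `error_rate` (real platforms are more complex).
    """
    reads = []
    for _ in range(n_reads):
        pos = rng.randrange(len(genome) - read_len + 1)
        bases = list(genome[pos:pos + read_len])
        for i, base in enumerate(bases):
            if rng.random() < error_rate:
                bases[i] = rng.choice([b for b in "ACGT" if b != base])
        reads.append("".join(bases))
    return reads


def mixture_reads(normal, tumor, n_reads, read_len, purity, error_rate, seed=0):
    """Mix tumor and normal reads; `purity` is the assumed tumor fraction."""
    rng = random.Random(seed)
    n_tumor = round(n_reads * purity)
    tumor_part = sample_reads(tumor, n_tumor, read_len, error_rate, rng)
    normal_part = sample_reads(normal, n_reads - n_tumor, read_len, error_rate, rng)
    return tumor_part + normal_part
```

As a usage example, a "tumor" genome can be derived from a reference with `apply_inversion(apply_deletion(reference, 100, 20), 300, 50)` and then passed to `mixture_reads(reference, tumor, n_reads=1000, read_len=50, purity=0.3, error_rate=0.01)` to obtain a mixed read set for benchmarking a caller.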

Similar articles

1
SInC: an accurate and fast error-model based simulator for SNPs, Indels and CNVs coupled with a read generator for short-read sequence data.
BMC Bioinformatics. 2014 Feb 5;15:40. doi: 10.1186/1471-2105-15-40.
2
Identification of indels in next-generation sequencing data.
BMC Bioinformatics. 2015 Feb 13;16(1):42. doi: 10.1186/s12859-015-0483-6.
3
IntSIM: An Integrated Simulator of Next-Generation Sequencing Data.
IEEE Trans Biomed Eng. 2017 Feb;64(2):441-451. doi: 10.1109/TBME.2016.2560939. Epub 2016 Apr 29.
4
ERINS: Novel Sequence Insertion Detection by Constructing an Extended Reference.
IEEE/ACM Trans Comput Biol Bioinform. 2021 Sep-Oct;18(5):1893-1901. doi: 10.1109/TCBB.2019.2954315. Epub 2021 Oct 7.
5
Toolkit for automated and rapid discovery of structural variants.
Methods. 2017 Oct 1;129:3-7. doi: 10.1016/j.ymeth.2017.05.030. Epub 2017 Jun 2.
6
iCopyDAV: Integrated platform for copy number variations-Detection, annotation and visualization.
PLoS One. 2018 Apr 5;13(4):e0195334. doi: 10.1371/journal.pone.0195334. eCollection 2018.
7
jackalope: A swift, versatile phylogenomic and high-throughput sequencing simulator.
Mol Ecol Resour. 2020 Jul;20(4):1132-1140. doi: 10.1111/1755-0998.13173. Epub 2020 May 20.
8
SVEngine: an efficient and versatile simulator of genome structural variations with features of cancer clonal evolution.
Gigascience. 2018 Jul 1;7(7). doi: 10.1093/gigascience/giy081.
9
SM-RCNV: a statistical method to detect recurrent copy number variations in sequenced samples.
Genes Genomics. 2019 May;41(5):529-536. doi: 10.1007/s13258-019-00788-9. Epub 2019 Feb 18.

Cited by

1
HBOS-CNV: A New Approach to Detect Copy Number Variations From Next-Generation Sequencing Data.
Front Genet. 2021 Jun 7;12:642473. doi: 10.3389/fgene.2021.642473. eCollection 2021.
2
Integrating Somatic Mutations for Breast Cancer Survival Prediction Using Machine Learning Methods.
Front Genet. 2021 Jan 18;11:632901. doi: 10.3389/fgene.2020.632901. eCollection 2020.
3
A Density Peak-Based Method to Detect Copy Number Variations From Next-Generation Sequencing Data.
Front Genet. 2021 Jan 13;11:632311. doi: 10.3389/fgene.2020.632311. eCollection 2020.
4
RKDOSCNV: A Local Kernel Density-Based Approach to the Detection of Copy Number Variations by Using Next-Generation Sequencing Data.
Front Genet. 2020 Nov 4;11:569227. doi: 10.3389/fgene.2020.569227. eCollection 2020.
5
UMI-Gen: A UMI-based read simulator for variant calling evaluation in paired-end sequencing NGS libraries.
Comput Struct Biotechnol J. 2020 Aug 27;18:2270-2280. doi: 10.1016/j.csbj.2020.08.011. eCollection 2020.
6
MFCNV: A New Method to Detect Copy Number Variations From Next-Generation Sequencing Data.
Front Genet. 2020 May 15;11:434. doi: 10.3389/fgene.2020.00434. eCollection 2020.