Department of Biology, Western University, London, ON N6A 5B7, Canada.
School of Computer Science, University of Waterloo, Waterloo, ON N2L 3G1, Canada.
Bioinformatics. 2022 Apr 28;38(9):2619-2620. doi: 10.1093/bioinformatics/btac128.
SomaticSiMu is an in silico simulator of single- and double-base substitutions and single-base insertions and deletions in an input genomic sequence to mimic mutational signatures. SomaticSiMu outputs simulated DNA sequences and mutational catalogues with imposed mutational signatures. The tool is the first mutational signature simulator featuring a graphical user interface, control of mutation rates and built-in tools for visualizing the simulated mutations. Simulated datasets are useful as a ground truth to test the accuracy and sensitivity of DNA sequence classification tools and mutational signature extraction tools under different experimental scenarios. The reliability of SomaticSiMu was affirmed by (i) supervised machine learning classification of simulated sequences with different mutation types and burdens, and (ii) mutational signature extraction from simulated mutational catalogues.
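The idea of imposing a signature on a sequence can be illustrated with a toy example. The sketch below is not SomaticSiMu's code or algorithm; the function `simulate_sbs`, its parameters and the one-entry signature are hypothetical and chosen only for illustration. It assumes a signature given as relative weights per (trinucleotide context, alternate base) pair, draws a fixed number of single base substitutions in proportion to those weights, and returns the mutated sequence together with a small mutational catalogue. It ignores strand symmetry, double base substitutions and indels.

```python
import random
from collections import Counter


def simulate_sbs(sequence, signature, n_mutations, seed=0):
    """Toy single base substitution simulator (illustrative only).

    signature: dict mapping (trinucleotide context, alt base) to a
    relative weight, e.g. {("ACA", "T"): 1.0}. Hypothetical format.
    Returns (mutated sequence, Counter of mutation types).
    """
    rng = random.Random(seed)
    reference = sequence.upper()

    # List every (position, alt) pair whose context appears in the signature.
    # Contexts are read from the unmutated reference sequence.
    candidates, weights = [], []
    for pos in range(1, len(reference) - 1):
        context = reference[pos - 1:pos + 2]
        for (ctx, alt), weight in signature.items():
            if ctx == context and weight > 0 and alt != context[1]:
                candidates.append((pos, alt))
                weights.append(weight)

    mutated = list(reference)
    catalogue = Counter()
    placed = 0
    while candidates and placed < n_mutations:
        # Weighted draw of one substitution event.
        pos, alt = rng.choices(candidates, weights=weights)[0]
        context = reference[pos - 1:pos + 2]
        mutated[pos] = alt
        catalogue[f"{context[0]}[{context[1]}>{alt}]{context[2]}"] += 1
        placed += 1
        # Each position is mutated at most once: drop its remaining events.
        keep = [k for k, (p, _) in enumerate(candidates) if p != pos]
        candidates = [candidates[k] for k in keep]
        weights = [weights[k] for k in keep]

    return "".join(mutated), catalogue


if __name__ == "__main__":
    toy_signature = {("ACA", "T"): 1.0}  # made-up signature, not a real one
    new_seq, counts = simulate_sbs("ACAGACAGACAG", toy_signature, n_mutations=2)
    print(new_seq, dict(counts))
```

Because the mutations are placed by the simulator, the returned catalogue serves as a known ground truth against which a classifier or signature extraction tool can be scored.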
SomaticSiMu is written in Python 3.8.3. The open-source code, documentation and tutorials are available at https://github.com/HillLab/SomaticSiMu under the terms of the Creative Commons Attribution 4.0 International License.
Supplementary data are available at Bioinformatics online.