bsgenova: an accurate, robust, and fast genotype caller for bisulfite-sequencing data.

Author affiliations

Agricultural Genomics Institute at Shenzhen, Chinese Academy of Agricultural Sciences, Shenzhen, China.

HIM-BGI Omics Center, Zhejiang Cancer Hospital, Hangzhou Institute of Medicine (HIM), Chinese Academy of Sciences (CAS), Hangzhou, China.

Publication information

BMC Bioinformatics. 2024 Jun 5;25(1):206. doi: 10.1186/s12859-024-05821-7.

Abstract

BACKGROUND

Bisulfite sequencing (BS-Seq) is a fundamental technique for characterizing DNA methylation profiles. Genotype calling from bisulfite-converted BS-Seq data enables allele-specific methylation analysis and the concurrent exploration of genetic and epigenetic profiles. Although various methods have been proposed, calling single nucleotide polymorphisms (SNPs) from BS-Seq data remains challenging, particularly for SNPs on chromosome X and in the presence of contaminated data.

RESULTS

We introduce bsgenova, a novel SNP caller tailored for bisulfite-sequencing data that employs a Bayesian multinomial model. We assess its performance by comparing SNPs called from real-world BS-Seq data with those called from the corresponding whole-genome sequencing (WGS) data across three human cell lines. Compared with three existing methods, bsgenova is both sensitive and precise, especially on chromosome X. It also notably outperforms the other methods in the presence of low-quality reads. The implementation leverages matrix imputation and multi-process parallelization, and bsgenova is faster and more efficient in memory and disk usage than existing methods. Furthermore, bsgenova integrates bsextractor, a methylation extractor, which enhances its flexibility and expands its utility.
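The evaluation described above, in which BS-Seq calls are scored against WGS calls from the same cell line, can be illustrated with a minimal sketch. It is not taken from the bsgenova source code. It assumes that a call counts as correct when chromosome, position, and alternate allele all match a WGS call, a matching rule the abstract does not specify. The function name `evaluate_calls` and the toy SNP sets are hypothetical.

```python
from typing import NamedTuple, Set, Tuple

# A SNP is keyed by (chromosome, 1-based position, alternate allele).
# This matching rule is an assumption made for illustration only.
Snp = Tuple[str, int, str]


class CallMetrics(NamedTuple):
    true_positives: int
    false_positives: int
    false_negatives: int
    precision: float
    sensitivity: float


def evaluate_calls(bs_calls: Set[Snp], wgs_calls: Set[Snp]) -> CallMetrics:
    """Score BS-Seq SNP calls against WGS calls treated as the truth set."""
    tp = len(bs_calls & wgs_calls)   # called in both data types
    fp = len(bs_calls - wgs_calls)   # called from BS-Seq only
    fn = len(wgs_calls - bs_calls)   # present in WGS but missed in BS-Seq
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    return CallMetrics(tp, fp, fn, precision, sensitivity)


# Made-up toy data, not results from the paper.
wgs = {("chr1", 100, "A"), ("chr1", 250, "T"), ("chrX", 500, "G"), ("chrX", 900, "C")}
bs = {("chr1", 100, "A"), ("chrX", 500, "G"), ("chrX", 700, "T")}

metrics = evaluate_calls(bs, wgs)
print(metrics)
```

Restricting both sets to a single chromosome (for example, keeping only entries whose first field is "chrX") before calling the function would give per-chromosome figures of the kind the abstract highlights.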

CONCLUSIONS

We introduce bsgenova for SNP calling from bisulfite-sequencing data. The source code is available at https://github.com/hippo-yf/bsgenova under the GPL-3.0 license.

Figure 1: https://cdn.ncbi.nlm.nih.gov/pmc/blobs/8ead/11151569/ec5e1e97d6b4/12859_2024_5821_Fig1_HTML.jpg