Comprehensive benchmarking of software for mapping whole genome bisulfite data: from read alignment to DNA methylation analysis.

Affiliations

ecSeq Bioinformatics GmbH, Sternwartenstraße 29, 04103, Saxony, Germany.

Institut für Informatik, Universität Leipzig, Härtelstraße 16-18, 04107, Saxony, Germany.

Publication information

Brief Bioinform. 2021 Sep 2;22(5). doi: 10.1093/bib/bbab021.

Abstract

Whole genome bisulfite sequencing is currently at the forefront of epigenetic analysis, facilitating the nucleotide-level resolution of 5-methylcytosine (5mC) on a genome-wide scale. Specialized software has been developed to accommodate the unique difficulties in aligning such sequencing reads to a given reference, building on the knowledge acquired from model organisms such as human or Arabidopsis thaliana. As the field of epigenetics expands its purview to non-model plant species, new challenges arise which call into question the suitability of previously established tools. Herein, nine short-read aligners are evaluated: Bismark, BS-Seeker2, BSMAP, BWA-meth, ERNE-BS5, GEM3, GSNAP, Last and segemehl. Precision-recall of simulated alignments, in comparison to real sequencing data obtained from three natural accessions, reveals that, on balance, BWA-meth and BSMAP are able to make the best use of the data during mapping. The influence of difficult-to-map regions, characterized by deviations in sequencing depth over repeat annotations, is evaluated in terms of the mean absolute deviation of the resulting methylation calls in comparison to a realistic methylome. Downstream methylation analysis is responsive to the handling of multi-mapping reads relative to mapping quality (MAPQ), and potentially susceptible to bias arising from the increased sequence complexity of densely methylated reads.
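The two evaluation measures named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' benchmarking pipeline. It assumes that a simulated read counts as correctly aligned when its reported chromosome and start position match the simulated origin exactly, and that the mean absolute deviation is taken over sites present in both the expected and the called methylome. The function names and the toy data are hypothetical.

```python
from typing import Dict, Tuple

Position = Tuple[str, int]  # (chromosome, start coordinate); an assumed representation


def precision_recall(truth: Dict[str, Position],
                     mapped: Dict[str, Position]) -> Tuple[float, float]:
    """Precision and recall of alignments against simulated read origins.

    truth  : read name -> simulated origin (every simulated read)
    mapped : read name -> reported position (only reads the aligner placed)
    """
    # Assumption: a hit requires an exact match of chromosome and start.
    correct = sum(1 for read, pos in mapped.items() if truth.get(read) == pos)
    precision = correct / len(mapped) if mapped else 0.0
    recall = correct / len(truth) if truth else 0.0
    return precision, recall


def mean_absolute_deviation(expected: Dict[Position, float],
                            called: Dict[Position, float]) -> float:
    """Mean of |expected - called| methylation level over shared sites."""
    shared = expected.keys() & called.keys()
    if not shared:
        raise ValueError("no cytosine sites shared between the two methylomes")
    return sum(abs(expected[s] - called[s]) for s in shared) / len(shared)


# Toy data (hypothetical): four simulated reads, three of them mapped.
truth = {"r1": ("chr1", 100), "r2": ("chr1", 250),
         "r3": ("chr2", 40), "r4": ("chr2", 900)}
mapped = {"r1": ("chr1", 100), "r2": ("chr1", 250), "r3": ("chr2", 75)}
precision, recall = precision_recall(truth, mapped)

expected_levels = {("chr1", 10): 1.0, ("chr1", 20): 0.0, ("chr1", 30): 0.5}
called_levels = {("chr1", 10): 0.75, ("chr1", 20): 0.25}
mad = mean_absolute_deviation(expected_levels, called_levels)
```

In the toy data two of the three mapped reads are placed correctly, so precision is 2/3 and recall is 2/4; the two shared sites each deviate by 0.25, giving a mean absolute deviation of 0.25. A real evaluation would also need to decide how multi-mapping reads and MAPQ thresholds enter these counts, which this sketch leaves out.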

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/cffc/8425420/9f516157f409/bbab021f1.jpg
