Genomics Core Facility, European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, Heidelberg, Germany.
Genome Biology Unit, European Molecular Biology Laboratory (EMBL), Meyerhofstrasse 1, Heidelberg, Germany.
Bioinformatics. 2019 Jul 15;35(14):2489-2491. doi: 10.1093/bioinformatics/bty1007.
Harmonizing quality control (QC) of large-scale second- and third-generation sequencing datasets is key for enabling downstream computational and biological analyses. We present Alfred, an efficient and versatile command-line application that computes multi-sample QC metrics in a read-group-aware manner across a wide variety of sequencing assays and technologies. In addition to standard QC metrics such as GC bias, base composition, insert size and sequencing coverage distributions, it supports haplotype-aware and allele-specific feature counting and feature annotation. The versatility of Alfred allows for easy pipeline integration in high-throughput settings, including DNA sequencing facilities and large-scale research initiatives, enabling continuous monitoring of sequence data quality and characteristics across samples. Alfred supports haplo-tagging of BAM/CRAM files to conduct haplotype-resolved analyses in conjunction with a variety of assays based on next-generation sequencing. Alfred's companion web application enables interactive exploration of results and comparison to public datasets.
Alfred is open-source and freely available at https://tobiasrausch.com/alfred/.
Supplementary data are available at Bioinformatics online.
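To make the notion of read-group-aware QC concrete, the following is a minimal Python sketch that groups alignment records by their RG tag and reports a median insert size and GC fraction per read group. It is an illustration only and not Alfred's implementation or interface: the function names (`read_group`, `per_read_group_qc`), the example records in `SAMPLE_SAM`, and the choice to take the insert size from the positive TLEN field are all assumptions made for this example. It reads plain SAM text lines rather than BAM/CRAM.

```python
from collections import defaultdict
from statistics import median

# Hypothetical SAM records (tab-separated), two read groups: lib1 and lib2.
SAMPLE_SAM = [
    "@HD\tVN:1.6\tSO:coordinate",
    "r1\t99\tchr1\t100\t60\t4M\t=\t300\t204\tACGT\tIIII\tRG:Z:lib1",
    "r1\t147\tchr1\t300\t60\t4M\t=\t100\t-204\tGGCC\tIIII\tRG:Z:lib1",
    "r2\t99\tchr1\t500\t60\t4M\t=\t800\t304\tAATT\tIIII\tRG:Z:lib2",
]


def read_group(fields):
    """Return the RG tag value of a SAM record, or 'NA' if no RG tag is set."""
    for tag in fields[11:]:
        if tag.startswith("RG:Z:"):
            return tag[5:]
    return "NA"


def per_read_group_qc(sam_lines):
    """Compute median insert size and GC fraction separately per read group."""
    insert_sizes = defaultdict(list)
    gc_bases = defaultdict(int)
    total_bases = defaultdict(int)

    for line in sam_lines:
        if line.startswith("@"):  # skip header lines
            continue
        fields = line.rstrip("\n").split("\t")
        rg = read_group(fields)

        seq = fields[9]
        if seq != "*":  # '*' means the sequence is not stored
            gc_bases[rg] += sum(seq.upper().count(b) for b in "GC")
            total_bases[rg] += len(seq)

        tlen = int(fields[8])
        if tlen > 0:  # positive TLEN: count each pair once, via the leftmost mate
            insert_sizes[rg].append(tlen)

    groups = set(total_bases) | set(insert_sizes)
    return {
        rg: {
            "median_insert_size": median(insert_sizes[rg]) if insert_sizes[rg] else None,
            "gc_fraction": gc_bases[rg] / total_bases[rg] if total_bases[rg] else None,
        }
        for rg in groups
    }


result = per_read_group_qc(SAMPLE_SAM)
for rg in sorted(result):
    print(rg, result[rg])
```

Keeping the metrics keyed by read group, rather than pooled per file, is what lets differences between libraries or lanes within one sample show up; a production tool would additionally filter on flags such as duplicate, secondary and supplementary alignments, which this sketch omits.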