Department of Genetics, Faculty of Science, University of Granada, 18071 Granada, Spain.
Bioinformatics Laboratory, Biotechnology Institute, Centro de Investigación Biomédica, PTS, Avda. del Conocimiento s/n, 18100 Granada, Spain.
Nucleic Acids Res. 2020 Jul 2;48(W1):W262-W267. doi: 10.1093/nar/gkaa452.
Although miRNA-seq is used extensively in many different fields, its quality control is frequently restricted to a Phred score-based filter. Other important quality-related aspects, including microRNA yield, the fraction of putative degradation products (such as rRNA fragments) and the percentage of adapter-dimers, are hard to assess using absolute thresholds. Here we present mirnaQC, a webserver that relies on 34 quality parameters to assist in miRNA-seq quality control. To improve their interpretability, quality attributes are ranked against a reference distribution obtained from over 36 000 publicly available miRNA-seq datasets. Accepted input formats include FASTQ files and SRA accessions. The results page contains several sections that deal with putative technical artefacts related to library preparation, sequencing, contamination or yield. Different visualisations, including PCA and heatmaps, are available to help users identify underlying issues. Finally, we show the usefulness of this approach by analysing two publicly available datasets and discussing the different quality issues that can be detected using mirnaQC.
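The idea of ranking a quality attribute against a reference distribution, rather than applying an absolute threshold, can be illustrated with a percentile rank. The sketch below is only an assumption about how such a ranking might be computed; the abstract does not describe mirnaQC's actual procedure. The function name `percentile_rank`, the half-weight treatment of ties, and the reference values are all hypothetical.

```python
from bisect import bisect_left, bisect_right


def percentile_rank(value, reference):
    """Return the percentage of reference values below `value`.

    Ties are counted as half, a common convention that is assumed here
    and not taken from the mirnaQC paper.
    """
    ref = sorted(reference)
    if not ref:
        raise ValueError("reference distribution is empty")
    below = bisect_left(ref, value)          # values strictly smaller
    ties = bisect_right(ref, value) - below  # values equal to `value`
    return 100.0 * (below + 0.5 * ties) / len(ref)


# Hypothetical reference distribution: percentage of reads assigned to
# miRNAs in ten made-up samples (not data from the 36 000 datasets).
reference_mirna_pct = [5.0, 12.0, 20.0, 35.0, 50.0, 62.0, 70.0, 81.0, 90.0, 95.0]

# Rank a new sample with 40% miRNA reads against that reference.
sample_rank = percentile_rank(40.0, reference_mirna_pct)
print(f"miRNA yield percentile: {sample_rank:.1f}")
```

A rank near 0 or 100 flags a sample as unusual relative to the reference corpus, which is easier to interpret than a raw value when no absolute threshold is agreed upon.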