Statistical guidelines for quality control of next-generation sequencing techniques.

Affiliations

Faculty of Biology, Johannes Gutenberg-Universität Mainz, Biozentrum I, Mainz, Germany.

Publication information

Life Sci Alliance. 2021 Aug 30;4(11). doi: 10.26508/lsa.202101113. Print 2021 Nov.

Abstract

More and more next-generation sequencing (NGS) data are made available every day. However, the quality of these data is not always guaranteed. Available quality control tools require profound knowledge to correctly interpret the multiplicity of quality features. Moreover, it is usually difficult to know if quality features are relevant in all experimental conditions. Therefore, the NGS community would highly benefit from condition-specific data-driven guidelines derived from many publicly available experiments, which reflect routinely generated NGS data. In this work, we have characterized well-known quality guidelines and related features in big datasets and concluded that they are too limited for assessing the quality of a given NGS file accurately. Therefore, we present new data-driven guidelines derived from the statistical analysis of many public datasets using quality features calculated by common bioinformatics tools. Thanks to this approach, we confirm the high relevance of genome mapping statistics to assess the quality of the data, and we demonstrate the limited scope of some quality features that are not relevant in all conditions. Our guidelines are available at https://cbdm.uni-mainz.de/ngs-guidelines.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/35ed/8408346/66017ff160c6/LSA-2021-01113_Fig1.jpg
