Determining optical mapping errors by simulations.

Author information

Vašinek Michal, Běhálek Marek, Gajdoš Petr, Fillerová Regina, Kriegová Eva

Affiliations

Department of Computer Science, Faculty of Electrical Engineering and Computer Science, VSB-Technical University of Ostrava, Ostrava 708 00, Czech Republic.

Department of Immunology, Faculty of Medicine and Dentistry, Palacky University and University Hospital, Olomouc 779 00, Czech Republic.

Publication information

Bioinformatics. 2021 Oct 25;37(20):3391-3397. doi: 10.1093/bioinformatics/btab259.

Abstract

MOTIVATION

Optical mapping is a complementary technology to traditional DNA sequencing technologies, such as next-generation sequencing (NGS). It provides genome-wide, high-resolution restriction maps from single, stained molecules of DNA. It can be used to detect large and small structural variants, copy number variations and complex rearrangements. Optical mapping is subject to different kinds of errors than traditional DNA sequencing technologies. It is important to understand the sources of these errors and how they affect the obtained data. This article proposes a novel approach to modeling errors in the data obtained from the Bionano Genomics Inc. Saphyr system with Direct Label and Stain (DLS) chemistry. Some studies have already addressed this issue for older instruments with nicking enzymes, but we are unaware of a study that addresses this new system.

RESULTS

The main result is a framework for studying errors in the data obtained from the Saphyr instrument with DLS chemistry. The framework's main component is a simulation that computes how the major sources of error for this instrument (false sites, missing sites and resolution errors) affect the distribution of fragment lengths in optical maps. The simulation is parametrized by variables describing these errors, and we use a differential evolution algorithm to find the parameter values that best fit the data from the instrument. The experiments show that this approach can be used to study errors in optical mapping data analysis.
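To make the idea concrete, the following is a minimal sketch of such a simulation and fit. It is not the authors' implementation (their code is in the repository listed under Availability and Implementation). The error model is an assumption made for illustration: each true site is missed independently with probability `p_miss`, false sites arrive as a Poisson process with `false_rate` sites per bp, and labels closer than `resolution` bp are merged into one label at their mean position. All function and parameter names are hypothetical, and the optimizer is a plain DE/rand/1/bin variant of differential evolution.

```python
import random

def apply_errors(sites, length, p_miss, false_rate, resolution, rng):
    """Return observed label positions (bp) for one molecule of `length` bp."""
    # Missing site: each true label is dropped independently with probability p_miss.
    observed = [s for s in sites if rng.random() >= p_miss]
    # False site: spurious labels from a Poisson process (false_rate labels per bp).
    if false_rate > 0:
        pos = rng.expovariate(false_rate)
        while pos < length:
            observed.append(pos)
            pos += rng.expovariate(false_rate)
    observed.sort()
    # Resolution error: a label closer than `resolution` bp to the previous one
    # joins its cluster; each cluster is reported as a single label at its mean.
    merged, cluster = [], []
    for p in observed:
        if cluster and p - cluster[-1] >= resolution:
            merged.append(sum(cluster) / len(cluster))
            cluster = []
        cluster.append(p)
    if cluster:
        merged.append(sum(cluster) / len(cluster))
    return merged

def fragment_lengths(positions):
    """Distances between consecutive labels."""
    return [b - a for a, b in zip(positions, positions[1:])]

def histogram(lengths, bin_width, n_bins):
    """Normalized fragment-length histogram; the last bin collects the overflow."""
    counts = [0] * n_bins
    for x in lengths:
        counts[min(int(x // bin_width), n_bins - 1)] += 1
    total = sum(counts) or 1
    return [c / total for c in counts]

def simulated_histogram(params, molecules, seed, bin_width=1000, n_bins=40):
    """Fragment-length histogram of `molecules` after applying the error model."""
    p_miss, false_rate, resolution = params
    rng = random.Random(seed)  # fixed seed keeps the cost function deterministic
    lengths = []
    for sites, length in molecules:
        lengths += fragment_lengths(
            apply_errors(sites, length, p_miss, false_rate, resolution, rng))
    return histogram(lengths, bin_width, n_bins)

def differential_evolution(cost, bounds, rng, pop_size=12, generations=30,
                           f=0.7, cr=0.9):
    """Minimal DE/rand/1/bin minimizer; returns (best_vector, best_cost)."""
    dim = len(bounds)
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    scores = [cost(x) for x in pop]
    for _ in range(generations):
        for i in range(pop_size):
            # Three distinct population members other than the current one.
            a, b, c = rng.sample([p for j, p in enumerate(pop) if j != i], 3)
            j_rand = rng.randrange(dim)  # at least one coordinate is mutated
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if j == j_rand or rng.random() < cr:
                    v = a[j] + f * (b[j] - c[j])
                else:
                    v = pop[i][j]
                trial.append(min(max(v, lo), hi))  # clamp to the search bounds
            s = cost(trial)
            if s <= scores[i]:  # greedy selection
                pop[i], scores[i] = trial, s
    best = min(range(pop_size), key=scores.__getitem__)
    return pop[best], scores[best]

def fit(target_hist, molecules, bounds, seed=0):
    """Search for (p_miss, false_rate, resolution) matching target_hist."""
    def cost(params):
        sim = simulated_histogram(params, molecules, seed)
        return sum((a - b) ** 2 for a, b in zip(sim, target_hist))
    return differential_evolution(cost, bounds, random.Random(seed))
```

Here `molecules` is a list of `(true_site_positions, molecule_length)` pairs, for example derived from an in silico digest of a reference, and `target_hist` is the normalized fragment-length histogram of the measured maps built with the same bin settings. Reusing one seed inside the cost function is a design choice that makes repeated evaluations of the same parameters comparable; the squared-difference cost is likewise only one possible measure of fit.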

AVAILABILITY AND IMPLEMENTATION

Source code supporting the presented results is available at: https://github.com/mvasinek/olgen-om-error-prediction. The data underlying this article are available on the Bionano Genomics Inc. website, at: https://bionanogenomics.com/library/datasets/.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
