Russell Berrie Nanotechnology Institute, Technion, Haifa 320003, Israel.
Department of Computer Science, Technion, Haifa 320003, Israel.
Bioinformatics. 2023 Oct 3;39(10). doi: 10.1093/bioinformatics/btad601.
Optical genome mapping (OGM) is a technique that extracts partial genomic information from optically imaged and linearized DNA fragments containing fluorescently labeled short sequence patterns. This information can be used for various genomic analyses and applications, such as the detection of structural variations and copy-number variations, epigenomic profiling, and microbial species identification. Currently, the choice of labeled patterns is based on the available biochemical methods and is not necessarily optimized for the application.
In this work, we develop a model of OGM based on information theory, which enables the design of optimal labeling patterns for specific applications and target organism genomes. We validate the model through experimental OGM on human DNA and simulations on bacterial DNA. Our model predicts up to a 10-fold improvement in accuracy from an optimal choice of labeling patterns, which may guide future development of OGM biochemical labeling methods and significantly improve accuracy and yield in applications such as epigenomic profiling and cultivation-free pathogen identification in clinical samples.
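To make the idea of scoring a labeling pattern by its information content concrete, here is a minimal Python sketch. It is not the model described above: it is a toy proxy that assumes the genome is cut into fixed-size bins (standing in for optical resolution) and measures the binary entropy of the event "this bin contains at least one label". The function names `label_positions`, `binary_entropy`, and `bits_per_bin`, and the `bin_size` parameter, are hypothetical and chosen only for illustration.

```python
import math


def label_positions(seq, pattern):
    """Return start indices of every (possibly overlapping) occurrence of pattern in seq."""
    positions = []
    start = seq.find(pattern)
    while start != -1:
        positions.append(start)
        start = seq.find(pattern, start + 1)
    return positions


def binary_entropy(p):
    """Entropy in bits of a Bernoulli(p) variable; 0 at the extremes p = 0 and p = 1."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -p * math.log2(p) - (1.0 - p) * math.log2(1.0 - p)


def bits_per_bin(seq, pattern, bin_size):
    """Toy proxy (an assumption, not the published model): entropy of the
    'bin contains a label' indicator over non-overlapping bins of bin_size bases."""
    n_bins = len(seq) // bin_size
    if n_bins == 0:
        return 0.0
    # Collect the bins that hold at least one pattern start; ignore a trailing partial bin.
    labeled_bins = {
        pos // bin_size
        for pos in label_positions(seq, pattern)
        if pos // bin_size < n_bins
    }
    return binary_entropy(len(labeled_bins) / n_bins)


if __name__ == "__main__":
    toy_genome = "ACGTAAAAACGTAAAA"  # made-up sequence for illustration only
    for candidate in ("CG", "A", "TTT"):
        print(candidate, bits_per_bin(toy_genome, candidate, bin_size=4))
```

Under this proxy, a pattern that appears in nearly every bin or in almost none scores close to zero bits, while a pattern that labels about half of the bins scores close to one bit per bin. This is only an intuition for why the choice of pattern matters for a given genome; a realistic model would also have to account for imaging noise, fragment length, and labeling efficiency.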