Jonker Martijs J, de Leeuw Wim C, Marinković Marino, Wittink Floyd R A, Rauwerda Han, Bruning Oskar, Ensink Wim A, Fluit Ad C, Boel C H, de Jong Mark, Breit Timo M
MicroArray Department & Integrative Bioinformatics Unit (MAD-IBU), Swammerdam Institute for Life Sciences (SILS), Faculty of Science (FNWI), University of Amsterdam (UvA), 1098 XH, Amsterdam, the Netherlands; Netherlands Bioinformatics Centre (NBIC), 6525 GA, Nijmegen, the Netherlands.
MicroArray Department & Integrative Bioinformatics Unit (MAD-IBU), Swammerdam Institute for Life Sciences (SILS), Faculty of Science (FNWI), University of Amsterdam (UvA), 1098 XH, Amsterdam, the Netherlands; Department of Aquatic Ecology and Ecotoxicology, Institute for Biodiversity and Ecosystem Dynamics (IBED), University of Amsterdam, Amsterdam, the Netherlands.
Nucleic Acids Res. 2014 Jun;42(11):e94. doi: 10.1093/nar/gku343. Epub 2014 Apr 25.
Structural variations in genomes are commonly studied by (micro)array-based comparative genomic hybridization. For model organisms such as human and mouse, the data analysis methods to infer copy number variation are well established. These procedures rely on signal ratios between test and reference samples and on the order of the probe targets in the genome. They are therefore less applicable to experiments with non-model organisms, which frequently have non-sequenced genomes in which the order of probe targets is unknown. We present an additional analysis approach that does not depend on the structural information of a reference genome and that quantifies the presence or absence of a probe target in an unknown genome. The principle is that the intensities of target probes are compared with the intensities of negative-control and positive-control probes from a control hybridization to determine whether a probe target is absent or present. In a test on the genome content of a known bacterial strain, Staphylococcus aureus MRSA252, the approach proved successful, with receiver operating characteristic area under the curve values larger than 0.9995. We show its usability in various applications, such as comparing genome content and validating next-generation sequencing reads from eukaryotic non-model organisms.
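The principle stated in the abstract, comparing target-probe intensities with negative-control and positive-control probe intensities and then evaluating the calls with a receiver operating characteristic (ROC) curve, can be illustrated with a minimal sketch. This is not the authors' published procedure: the median-based scaling, the 0.5 threshold, the function names and the example intensities are all assumptions chosen for illustration only.

```python
from statistics import median


def presence_scores(targets, neg_ctrl, pos_ctrl):
    """Scale log2 intensities so that 0 is a typical negative control
    and 1 is a typical positive control (illustrative assumption)."""
    neg_med = median(neg_ctrl)
    pos_med = median(pos_ctrl)
    if pos_med <= neg_med:
        raise ValueError("positive controls must be brighter than negative controls")
    span = pos_med - neg_med
    # Clip to [0, 1] so that extreme intensities do not dominate.
    return [min(1.0, max(0.0, (x - neg_med) / span)) for x in targets]


def call_present(scores, threshold=0.5):
    """Call a probe target present when its score exceeds the threshold
    (the threshold value is a hypothetical choice)."""
    return [s > threshold for s in scores]


def roc_auc(scores, truth):
    """ROC area under the curve, computed as the probability that a truly
    present target scores higher than a truly absent one (ties count 0.5)."""
    pos = [s for s, t in zip(scores, truth) if t]
    neg = [s for s, t in zip(scores, truth) if not t]
    if not pos or not neg:
        raise ValueError("need at least one present and one absent target")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up log2 intensities, for illustration only.
neg_ctrl = [6.0, 6.5, 7.0]
pos_ctrl = [12.0, 13.0, 14.0]
targets = [6.2, 13.1, 9.0, 11.0]
truth = [False, True, False, True]  # known genome content of a test strain

scores = presence_scores(targets, neg_ctrl, pos_ctrl)
print(call_present(scores))
print(roc_auc(scores, truth))
```

The pairwise AUC formulation is used here because it needs no plotting or external library; with a known genome, such as the MRSA252 test described above, the `truth` labels would come from the reference sequence.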