Sarkar Deepayan, Goldstein Steve, Schwartz David C, Newton Michael A
Theoretical Statistics and Mathematics Unit, Indian Statistical Institute, New Delhi, India.
J Comput Biol. 2012 May;19(5):478-92. doi: 10.1089/cmb.2011.0221. Epub 2012 Apr 16.
The Optical Mapping System constructs ordered restriction maps spanning entire genomes through the assembly and analysis of large datasets comprising individually analyzed genomic DNA molecules. Such restriction maps uniquely reveal mammalian genome structure and variation, but also raise computational and statistical questions beyond those that have been solved in the analysis of smaller, microbial genomes. We address the problem of how to filter maps that align poorly to a reference genome. We obtain map-specific thresholds that control errors and improve iterative assembly. We also show how an optimal self-alignment score provides an accurate approximation to the probability of alignment, which is useful in applications seeking to identify structural genomic abnormalities.