Center for Bioinformatics and Computational Biology, Institute for Advanced Computer Studies, University of Maryland, College Park, MD 20742, USA.
Genome Biol. 2010;11(3):R28. doi: 10.1186/gb-2010-11-3-r28. Epub 2010 Mar 10.
Diploid genomes with divergent chromosomes present special problems for assembly software: the two copies of an especially polymorphic region may be mistakenly assembled as separate sequences, creating the appearance of a recent segmental duplication. We developed a method for identifying such false duplications and applied it to four vertebrate genomes. For each genome, we corrected mis-assemblies, improved estimates of the amount of duplicated sequence, and recovered polymorphisms between the sequenced chromosomes.