Zimin Aleksey V, Smith Douglas R, Sutton Granger, Yorke James A
IPST, University of Maryland, College Park; Agencourt Bioscience Inc., Beverly, MA.
Bioinformatics. 2008 Jan 1;24(1):42-5. doi: 10.1093/bioinformatics/btm542. Epub 2007 Dec 5.
Many genomes are sequenced by a collaboration of several centers, and each center then produces an assembly using its own assembly software. The collaborators then pick the draft assembly that they judge to be the best, and the information contained in the other assemblies is usually not used.
We have developed a technique that we call assembly reconciliation that can merge draft genome assemblies. It takes one draft assembly, detects apparent errors, and, when possible, patches the problem areas using pieces from alternative draft assemblies. It also closes gaps in places where one of the alternative assemblies has spanned the gap correctly.
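The patching and gap-closing idea described above can be illustrated with a toy sketch. It is not the authors' reconciliation software: it assumes assemblies are plain strings, that gaps are runs of `N`, that problem regions have already been detected by some other means, and that exact flanking matches are enough to locate the corresponding region in the alternative assembly. The function names `patch_region` and `close_gaps` and the `anchor_len` parameter are hypothetical.

```python
import re


def patch_region(primary, alternative, start, end, anchor_len=8):
    """Replace primary[start:end] with the matching stretch of `alternative`.

    Toy assumption: the region is located in the alternative assembly by
    exact matches of the `anchor_len` bases flanking it on each side.
    Returns `primary` unchanged when the anchors cannot be placed.
    """
    left = primary[max(0, start - anchor_len):start]
    right = primary[end:end + anchor_len]
    # Both flanks must be full length, otherwise the placement is unreliable.
    if len(left) < anchor_len or len(right) < anchor_len:
        return primary
    # Require each anchor to occur exactly once (non-overlapping count).
    if alternative.count(left) != 1 or alternative.count(right) != 1:
        return primary
    left_end = alternative.find(left) + anchor_len
    right_start = alternative.find(right)
    # The anchors must appear in the same order as in the primary assembly.
    if right_start < left_end:
        return primary
    return primary[:start] + alternative[left_end:right_start] + primary[end:]


def close_gaps(primary, alternative, anchor_len=8):
    """Try to fill every run of N in `primary` from `alternative`."""
    gaps = [m.span() for m in re.finditer(r"N+", primary)]
    # Work right to left so earlier coordinates stay valid after each patch.
    for start, end in reversed(gaps):
        primary = patch_region(primary, alternative, start, end, anchor_len)
    return primary
```

A gap is closed only when the alternative assembly spans it with both flanks intact; otherwise the primary assembly is left as it was, mirroring the "when possible" wording above. A real implementation would work from alignments and read-placement evidence rather than exact string matches.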
Using the assembly reconciliation technique, we produced reconciled assemblies of six Drosophila species in collaboration with Agencourt Bioscience and The J. Craig Venter Institute. These assemblies are now the official (CAF1) assemblies used for analysis. We also produced a reconciled assembly of the rhesus macaque genome, which is available from our website, http://www.genome.umd.edu.
The reconciliation software is available for download from http://www.genome.umd.edu/software.htm