Rouchka Eric C, Gish Warren, States David J
Department of Computer Science, Washington University, Washington University School of Medicine, St Louis, MO 63110, USA.
Nucleic Acids Res. 2002 Nov 15;30(22):5004-14. doi: 10.1093/nar/gkf633.
A fundamental problem in the Human Genome Project is determining the correct assembly of the human genome. Many studies, including transcriptional analysis, SNP detection and characterization, gene finding and EST clustering, use genome assemblies as templates, so it is important to determine how consistent the various whole-genome assemblies are with one another. The order and orientation of the GenBank entries used to construct the NCBI and UCSC Goldenpath assemblies were compared. In addition, a sequence-level comparison was performed using MULTI, an efficient database search tool developed to make whole-genome comparisons possible. The comparisons show significant discrepancies in the sequence as well as in the order and orientation of the GenBank entries used to construct the NCBI and UCSC assemblies.
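The order-and-orientation comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' method, and it does not model MULTI or the sequence-level comparison. It assumes each assembly is reduced to an ordered list of (accession, strand) pairs for one chromosome, with each accession appearing at most once; the function name `compare_layouts` and the accession strings are hypothetical placeholders, not data from the paper.

```python
from typing import Dict, List, Tuple

# (accession, strand); strand is "+" or "-". Assumed representation, not from the paper.
Entry = Tuple[str, str]


def compare_layouts(a: List[Entry], b: List[Entry]) -> Dict[str, list]:
    """Compare two ordered entry layouts (hypothetical helper).

    Assumes each accession occurs at most once per layout.
    """
    strand_a = dict(a)
    strand_b = dict(b)
    # Position of every accession in layout b.
    pos_b = {acc: i for i, (acc, _) in enumerate(b)}

    # Entries present in both layouts, listed in the order they appear in a.
    shared = [acc for acc, _ in a if acc in pos_b]

    # Shared entries placed on opposite strands in the two layouts.
    orientation_conflicts = [
        acc for acc in shared if strand_a[acc] != strand_b[acc]
    ]

    # Neighbouring shared entries in a whose relative order is reversed in b.
    order_conflicts = [
        (x, y) for x, y in zip(shared, shared[1:]) if pos_b[x] > pos_b[y]
    ]

    return {
        "shared": shared,
        "only_in_a": [acc for acc, _ in a if acc not in pos_b],
        "only_in_b": [acc for acc, _ in b if acc not in strand_a],
        "orientation_conflicts": orientation_conflicts,
        "order_conflicts": order_conflicts,
    }


# Placeholder accessions for illustration only.
layout_a = [("ACC_1", "+"), ("ACC_2", "+"), ("ACC_3", "-"), ("ACC_4", "+")]
layout_b = [("ACC_1", "+"), ("ACC_3", "+"), ("ACC_2", "+"), ("ACC_5", "-")]

report = compare_layouts(layout_a, layout_b)
for key, value in report.items():
    print(f"{key}: {value}")
```

Counting reversed neighbouring pairs is only one simple way to flag order disagreements; a fuller comparison would also need to handle entries that are split, duplicated, or placed on different chromosomes.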