Institute of Biological, Environmental and Rural Sciences, Aberystwyth University, Aberystwyth, SY23 3DA, UK.
School of Biomedical and Healthcare Sciences, Plymouth University Peninsula, Schools of Medicine and Dentistry, Plymouth PL6 8BU, UK.
Gigascience. 2018 May 1;7(5). doi: 10.1093/gigascience/giy017.
Cross-species whole-genome sequence alignment is a critical first step for comparative genome analyses, ranging from the detection of sequence variants to studies of chromosome evolution. Animal genomes are large and complex, and whole-genome alignment is computationally intensive: because it must explore extensive local alignments, it requires expensive high-performance computing systems. With hundreds of sequenced animal genomes available from multiple projects, demand for comparative genome analyses is increasing.
Here, we introduce G-Anchor, a new, fast, and efficient pipeline that uses a strictly limited but highly effective set of local sequence alignments to anchor (or map) an animal genome to another species' reference genome. G-Anchor makes novel use of a databank of highly conserved DNA sequence elements. We demonstrate how these elements may be aligned to a pair of genomes, creating anchors. These anchors enable the rapid mapping of scaffolds from a de novo assembled genome to chromosome assemblies of a reference species. Our results demonstrate that G-Anchor can successfully anchor a vertebrate genome onto a phylogenetically related reference species genome using a desktop or laptop computer within a few hours and with comparable accuracy to that achieved by a highly accurate whole-genome alignment tool such as LASTZ. G-Anchor thus makes whole-genome comparisons accessible to researchers with limited computational resources.
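The anchoring idea described above can be illustrated with a minimal sketch. This is not G-Anchor's implementation: the function name `anchor_scaffolds`, the input dictionaries, the `min_anchors` threshold, and the majority-vote placement rule are all assumptions made for illustration. The sketch assumes each conserved element has already been aligned to both genomes, and that each element has at most one hit per genome.

```python
from collections import Counter, defaultdict


def anchor_scaffolds(ref_hits, target_hits, min_anchors=2):
    """Place target scaffolds on reference chromosomes via shared elements.

    ref_hits:    element_id -> (chromosome, position) in the reference genome
    target_hits: element_id -> (scaffold, position) in the de novo assembly
    Both inputs and the voting rule are hypothetical, for illustration only.
    """
    # An element found in both genomes forms an anchor; group anchors by scaffold.
    per_scaffold = defaultdict(list)
    for element_id, (scaffold, _pos) in target_hits.items():
        if element_id in ref_hits:
            per_scaffold[scaffold].append(ref_hits[element_id])

    placements = {}
    for scaffold, anchors in per_scaffold.items():
        # Assign the scaffold to the chromosome supported by the most anchors.
        chrom, votes = Counter(c for c, _ in anchors).most_common(1)[0]
        if votes < min_anchors:
            continue  # too little evidence to place this scaffold
        positions = [p for c, p in anchors if c == chrom]
        placements[scaffold] = (chrom, min(positions), max(positions), votes)
    return placements


# Toy data: four conserved elements located in both genomes.
ref_hits = {
    "e1": ("chr1", 1000),
    "e2": ("chr1", 5000),
    "e3": ("chr2", 200),
    "e4": ("chr1", 9000),
}
target_hits = {
    "e1": ("scafA", 10),
    "e2": ("scafA", 4100),
    "e3": ("scafA", 7000),
    "e4": ("scafB", 50),
}
placements = anchor_scaffolds(ref_hits, target_hits)
```

In this toy case `scafA` is placed on `chr1` because two of its three anchors agree, while `scafB` is left unplaced because it carries only one anchor. Because only the conserved elements are aligned, rather than every pair of genomic regions, the number of local alignments stays small, which is the property the abstract credits for the low computational cost.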
G-Anchor is a ready-to-use tool for anchoring a pair of vertebrate genomes. It may be used with large genomes that contain a significant fraction of evolutionarily conserved DNA sequences and that are not highly repetitive, polyploid, or excessively fragmented. G-Anchor is not a substitute for whole-genome alignment software but can be used for fast and accurate initial genome comparisons. G-Anchor is freely available for the pairwise comparison of two genomes.