Department of Ecosystem and Public Health and Department of Comparative Biology and Experimental Medicine, Faculty of Veterinary Medicine, University of Calgary, Calgary, Alberta, T2N 4Z6, Canada.
Bioinformatics. 2014 Nov 15;30(22):3266-7. doi: 10.1093/bioinformatics/btu544. Epub 2014 Aug 12.
Gene models from draft genome assemblies of metazoan species are often incorrect, missing exons or entire genes, particularly for large gene families. Consequently, labour-intensive manual curation is often necessary. We present Figmop (Finding Genes using Motif Patterns) to help with the manual curation of gene families in draft genome assemblies. The program uses a pattern of short sequence motifs to identify putative genes directly from the genome sequence. Using a large gene family as a test case, Figmop was found to be more sensitive and specific than a BLAST-based approach. The visualization used allows the validation of potential genes to be carried out quickly and easily, saving hours if not days from an analysis.
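The general idea of locating candidate genes by an ordered pattern of short motifs can be illustrated with a toy sketch. This is not Figmop's implementation: the function name `find_motif_pattern`, the `max_gap` parameter, and the use of regular expressions as stand-in motifs are all assumptions made for illustration only.

```python
import re


def find_motif_pattern(sequence, motifs, max_gap=50):
    """Return (start, end) spans where all motifs occur in the given order.

    Hypothetical helper, not part of Figmop. Each motif is a regular
    expression; consecutive motifs must start within max_gap bases of
    the end of the previous motif.
    """
    compiled = [re.compile(m) for m in motifs]
    spans = []
    # Every occurrence of the first motif seeds a candidate region.
    for first in compiled[0].finditer(sequence):
        start, pos = first.start(), first.end()
        for motif in compiled[1:]:
            # search() returns the leftmost hit at or after pos, so if that
            # hit is too far away, no closer hit exists.
            hit = motif.search(sequence, pos)
            if hit is None or hit.start() - pos > max_gap:
                break
            pos = hit.end()
        else:
            # Loop finished without a break: the whole pattern was matched.
            spans.append((start, pos))
    return spans


if __name__ == "__main__":
    # Made-up sequence and motifs, for demonstration only.
    genome = "TTT" + "ATG" + "AAA" + "CCGG" + "TTTTT" + "TAG" + "CC"
    print(find_motif_pattern(genome, ["ATG", "CCGG", "TAG"], max_gap=10))
```

Each reported span is only a candidate region. In the workflow described above, such candidates would still be validated by manual inspection.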
Figmop is implemented in C and Python and is supported on Linux, Unix and MacOSX. The source code is freely available for download at https://github.com/dave-the-scientist.
Supplementary data are available at Bioinformatics online.