Department of Biosciences, Metapopulation Research Group, University of Helsinki, P.O. Box 65, FI-00014, Finland and Institute of Biotechnology, University of Helsinki, P.O. Box 56, FI-00014, Finland.
Bioinformatics. 2013 Dec 15;29(24):3128-34. doi: 10.1093/bioinformatics/btt563. Epub 2013 Sep 26.
Current high-throughput sequencing technologies allow cost-efficient genotyping of millions of single nucleotide polymorphisms (SNPs) for hundreds of samples. However, the tools that are currently available for constructing linkage maps are not well suited for large datasets. Linkage maps built from large datasets would aid de novo genome assembly, where they allow comprehensive genome validation and refinement, including the detection of chimeric scaffolds. They would also be useful in family-based linkage and association studies, quantitative trait locus mapping, analysis of genome synteny and other complex genomic data analyses.
We describe a novel tool, called Lepidoptera-MAP (Lep-MAP), for constructing accurate linkage maps with ultradense genome-wide SNP data. Lep-MAP is fast, memory efficient and largely automated, requiring minimal user interaction. It uses data from multiple outbred families simultaneously and can increase linkage map accuracy by taking into account achiasmatic meiosis, the absence of recombination in one sex, which is a special feature of Lepidoptera (where females do not recombine) and some other taxa. We demonstrate that Lep-MAP outperforms other methods on real and simulated data. We construct a genome-wide linkage map of the Glanville fritillary butterfly (Melitaea cinxia) with over 40 000 SNPs. The data were generated with a novel in-house SOLiD restriction site-associated DNA tag sequencing protocol, which is described in the online supplementary material.
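To make the role of recombination concrete, the following is a minimal sketch of a standard two-point LOD score, which compares the maximum-likelihood recombination fraction between two markers against free recombination (0.5). It is a textbook calculation and is not taken from the Lep-MAP source; the class name `TwoPointLod`, the method names and the boolean inheritance vectors are hypothetical and chosen only for illustration. Under achiasmatic meiosis, two markers on the same chromosome should show no recombinants in the non-recombining sex, so any mismatch between their inheritance vectors would point to a genotyping error or to different chromosomes.

```java
/** Illustrative two-point LOD score. A sketch only, not Lep-MAP's implementation. */
final class TwoPointLod {

    /** Returns count * log10(p), treating the 0 * log10(0) case as 0. */
    static double countTimesLog10(int count, double p) {
        return count == 0 ? 0.0 : count * Math.log10(p);
    }

    /**
     * LOD score for n informative meioses of which k differ between two markers.
     * Phase is assumed unknown, so the smaller of k and n - k is taken as the
     * number of recombinants.
     */
    static double lod(int n, int k) {
        if (n <= 0 || k < 0 || k > n) {
            throw new IllegalArgumentException("require n > 0 and 0 <= k <= n");
        }
        int r = Math.min(k, n - k);          // recombinants under the better phase
        double theta = (double) r / n;       // maximum-likelihood recombination fraction
        // log10 of L(theta) / L(0.5), where L(0.5) = 0.5^n
        return n * Math.log10(2.0)
                + countTimesLog10(r, theta)
                + countTimesLog10(n - r, 1.0 - theta);
    }

    /** Counts offspring whose inherited parental haplotype differs between two markers. */
    static int mismatches(boolean[] a, boolean[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have equal length");
        }
        int count = 0;
        for (int i = 0; i < a.length; i++) {
            if (a[i] != b[i]) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        // Hypothetical inheritance vectors for two markers in one parent:
        // true means the offspring received that parent's first haplotype.
        boolean[] markerA = {true, false, true, true, false, false, true, false, true, true};
        boolean[] markerB = {true, false, true, true, false, false, true, false, true, false};

        int k = mismatches(markerA, markerB);
        System.out.println("mismatches = " + k);
        System.out.println("LOD = " + lod(markerA.length, k));
    }
}
```

In this sketch, a pair of markers with identical inheritance vectors in the non-recombining sex gives the maximum score of n·log10(2), whereas a pair that differs in half of the offspring gives a score of 0.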
The Java source code, released under the GNU General Public License, together with the compiled classes and the datasets, is available from http://sourceforge.net/users/lep-map.