megasat: automated inference of microsatellite genotypes from sequence data.

Author information

Zhan Luyao, Paterson Ian G, Fraser Bonnie A, Watson Beth, Bradbury Ian R, Nadukkalam Ravindran Praveen, Reznick David, Beiko Robert G, Bentzen Paul

Affiliations

Faculty of Computer Science, Dalhousie University, 6050 University Avenue, Halifax, Nova Scotia, B3H 4R2, Canada.

Marine Gene Probe Laboratory, Department of Biology, Dalhousie University, 1355 Oxford Street, Halifax, Nova Scotia, B3H 4R2, Canada.

Publication information

Mol Ecol Resour. 2017 Mar;17(2):247-256. doi: 10.1111/1755-0998.12561. Epub 2016 Jul 19.

Abstract

megasat is software that enables genotyping of microsatellite loci using next-generation sequencing data. Microsatellites are amplified in large multiplexes, and then sequenced in pooled amplicons. megasat reads sequence files and automatically scores microsatellite genotypes. It uses fuzzy matches to allow for sequencing errors and applies decision rules to account for amplification artefacts, including nontarget amplification products, replication slippage during PCR (amplification stutter) and differential amplification of alleles. An important feature of megasat is the generation of histograms of the length-frequency distributions of amplification products for each locus and each individual. These histograms, analogous to electropherograms traditionally used to score microsatellite genotypes, enable rapid evaluation and editing of automatically scored genotypes. megasat is written in Perl, runs on Windows, Mac OS X and Linux systems, and includes a simple graphical user interface. We demonstrate megasat using data from guppy, Poecilia reticulata. We genotype 1024 guppies at 43 microsatellites per run on an Illumina MiSeq sequencer. We evaluated the accuracy of automatically called genotypes using two methods, based on pedigree and repeat genotyping data, and obtained estimates of mean genotyping error rates of 0.021 and 0.012. In both estimates, three loci accounted for a disproportionate fraction of genotyping errors; conversely, 26 loci were scored with 0-1 detected error (error rate ≤0.007). Our results show that with appropriate selection of loci, automated genotyping of microsatellite loci can be achieved with very high throughput, low genotyping error and very low genotyping costs.
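The scoring steps described in the abstract (fuzzy matching of flanking sequence, a per-locus length-frequency histogram, and decision rules for stutter and unequal allele amplification) can be illustrated with a minimal Python sketch. This is not megasat's code (megasat is written in Perl) and does not reproduce its actual decision rules. The function names, the Hamming-distance matching, and every threshold (`max_mismatches`, `min_depth`, `min_ratio`, `stutter_ratio`) are assumptions chosen only for illustration.

```python
from collections import Counter


def hamming_within(a, b, max_mismatches):
    """True if equal-length strings differ at no more than max_mismatches positions."""
    if len(a) != len(b):
        return False
    return sum(x != y for x, y in zip(a, b)) <= max_mismatches


def amplicon_length(read, left_flank, right_flank, max_mismatches=1):
    """Length of the region between the two flanks, or None if either flank is missing.

    Flanks are located by fuzzy (Hamming) matching so that a single
    sequencing error in a flank does not discard the read.
    """
    lf, rf = len(left_flank), len(right_flank)
    start = None
    # Scan left to right for the first acceptable left-flank match.
    for i in range(len(read) - lf + 1):
        if hamming_within(read[i:i + lf], left_flank, max_mismatches):
            start = i + lf
            break
    if start is None:
        return None
    # Scan right to left for the last acceptable right-flank match.
    for j in range(len(read) - rf, start - 1, -1):
        if hamming_within(read[j:j + rf], right_flank, max_mismatches):
            return j - start
    return None


def length_histogram(reads, left_flank, right_flank, max_mismatches=1):
    """Count reads per amplicon length for one locus in one individual."""
    hist = Counter()
    for read in reads:
        length = amplicon_length(read, left_flank, right_flank, max_mismatches)
        if length is not None:
            hist[length] += 1
    return hist


def call_genotype(hist, repeat_unit=2, min_depth=20, min_ratio=0.2, stutter_ratio=0.5):
    """Call a diploid genotype from a length histogram using illustrative rules.

    - Too few reads in total: no call (None).
    - The most frequent length is the first allele.
    - A second length is accepted as an allele if its count reaches
      min_ratio of the top count; a length exactly one repeat unit shorter
      than the top allele must reach the stricter stutter_ratio, because
      PCR stutter typically produces products one repeat shorter.
    """
    if sum(hist.values()) < min_depth:
        return None
    # Sort by descending count, breaking ties by length for determinism.
    ranked = sorted(hist.items(), key=lambda kv: (-kv[1], kv[0]))
    top_len, top_count = ranked[0]
    for length, count in ranked[1:]:
        in_stutter_position = length == top_len - repeat_unit
        threshold = stutter_ratio if in_stutter_position else min_ratio
        if count >= threshold * top_count:
            return tuple(sorted((top_len, length)))
    return (top_len, top_len)
```

In this sketch the `Counter` returned by `length_histogram` plays the role of the length-frequency histogram that the abstract compares to an electropherogram, so a borderline call can be checked by inspecting the counts directly.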
