megasat: automated inference of microsatellite genotypes from sequence data.

Authors

Zhan Luyao, Paterson Ian G, Fraser Bonnie A, Watson Beth, Bradbury Ian R, Nadukkalam Ravindran Praveen, Reznick David, Beiko Robert G, Bentzen Paul

Affiliations

Faculty of Computer Science, Dalhousie University, 6050 University Avenue, Halifax, Nova Scotia, B3H 4R2, Canada.

Marine Gene Probe Laboratory, Department of Biology, Dalhousie University, 1355 Oxford Street, Halifax, Nova Scotia, B3H 4R2, Canada.

Publication information

Mol Ecol Resour. 2017 Mar;17(2):247-256. doi: 10.1111/1755-0998.12561. Epub 2016 Jul 19.

DOI:10.1111/1755-0998.12561

PMID:27333119

Abstract

megasat is software that enables genotyping of microsatellite loci using next-generation sequencing data. Microsatellites are amplified in large multiplexes, and then sequenced in pooled amplicons. megasat reads sequence files and automatically scores microsatellite genotypes. It uses fuzzy matches to allow for sequencing errors and applies decision rules to account for amplification artefacts, including nontarget amplification products, replication slippage during PCR (amplification stutter) and differential amplification of alleles. An important feature of megasat is the generation of histograms of the length-frequency distributions of amplification products for each locus and each individual. These histograms, analogous to electropherograms traditionally used to score microsatellite genotypes, enable rapid evaluation and editing of automatically scored genotypes. megasat is written in Perl, runs on Windows, Mac OS X and Linux systems, and includes a simple graphical user interface. We demonstrate megasat using data from guppy, Poecilia reticulata. We genotype 1024 guppies at 43 microsatellites per run on an Illumina MiSeq sequencer. We evaluated the accuracy of automatically called genotypes using two methods, based on pedigree and repeat genotyping data, and obtained estimates of mean genotyping error rates of 0.021 and 0.012. In both estimates, three loci accounted for a disproportionate fraction of genotyping errors; conversely, 26 loci were scored with 0-1 detected error (error rate ≤0.007). Our results show that with appropriate selection of loci, automated genotyping of microsatellite loci can be achieved with very high throughput, low genotyping error and very low genotyping costs.
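The scoring idea described in the abstract (fuzzy matching to tolerate sequencing errors, a length-frequency histogram per locus and individual, and decision rules for stutter and uneven allele amplification) can be illustrated with a minimal Python sketch. This is not megasat's actual code or its actual rule set: the function names, the Hamming-distance primer match, and every threshold (`max_mismatches`, `min_depth`, `min_ratio`, `stutter_ratio`) are assumptions chosen only for illustration.

```python
from collections import Counter


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def length_histogram(reads, primer, max_mismatches=1):
    """Count read lengths for reads whose 5' end fuzzy-matches the primer.

    Hypothetical helper: a Hamming-distance match stands in for the
    fuzzy matching mentioned in the abstract.
    """
    hist = Counter()
    for read in reads:
        head = read[:len(primer)]
        if len(head) == len(primer) and hamming(head, primer) <= max_mismatches:
            hist[len(read)] += 1
    return hist


def call_genotype(hist, repeat_len=2, min_depth=10,
                  min_ratio=0.3, stutter_ratio=0.6):
    """Toy diploid call from a length histogram (illustrative thresholds)."""
    if sum(hist.values()) < min_depth:
        return None  # too few reads to score this locus
    # Rank peaks by read count (ties broken by shorter length first).
    ranked = sorted(hist.items(), key=lambda kv: (-kv[1], kv[0]))
    top_len, top_n = ranked[0]
    for length, n in ranked[1:]:
        ratio = n / top_n
        if ratio < min_ratio:
            break  # remaining peaks are too small to be a second allele
        # A peak one repeat unit shorter than the main peak is treated as
        # stutter unless it is nearly as tall as the main peak.
        if length == top_len - repeat_len and ratio < stutter_ratio:
            continue
        return tuple(sorted((top_len, length)))
    return (top_len, top_len)  # homozygote


if __name__ == "__main__":
    # Lengths 120 and 110 are both well supported; 118 looks like stutter.
    print(call_genotype(Counter({120: 50, 118: 20, 110: 40})))
```

In this sketch the histogram plays the role the abstract assigns to an electropherogram: a person reviewing a call would inspect the same length counts that the rule uses.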
