TypeTE: a tool to genotype mobile element insertions from whole genome resequencing data.

Affiliations

Department of Molecular Biology and Genetics, 215 Tower Rd, Cornell University, Ithaca, NY 14853, USA.

Department of Human Genetics, University of Utah School of Medicine, Salt Lake City, UT 84112, USA.

Publication information

Nucleic Acids Res. 2020 Apr 6;48(6):e36. doi: 10.1093/nar/gkaa074.

Abstract

Alu retrotransposons account for more than 10% of the human genome, and insertions of these elements create structural variants segregating in human populations. Such polymorphic Alus are powerful markers for understanding population structure, and they represent variants that can greatly impact genome function, including gene expression. Accurate genotyping of Alus and other mobile elements has been challenging. Indeed, we found that Alu genotypes previously called for the 1000 Genomes Project are sometimes erroneous, which poses significant problems for phasing these insertions with the other variants that make up the haplotype. To ameliorate this issue, we introduce a new pipeline, TypeTE, which genotypes Alu insertions from whole-genome sequencing data. Starting from a list of polymorphic Alus, TypeTE identifies the hallmarks (poly-A tail and target site duplication) and orientation of Alu insertions, using local re-assembly to reconstruct presence and absence alleles. Genotype likelihoods are then computed after re-mapping sequencing reads to the reconstructed alleles. Using a high-quality set of PCR-based genotypes for >200 loci, we show that TypeTE improves genotype accuracy from 83% to 92% in the 1000 Genomes dataset. TypeTE can be readily adapted to other retrotransposon families and is a valuable addition to the population genomics toolbox.
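
The abstract states that genotype likelihoods are computed after reads are re-mapped to the reconstructed presence and absence alleles, but it does not give the formula. As an illustration only, the following is a minimal sketch of a generic biallelic likelihood model based on per-allele read counts. The function names, the binomial model, and the fixed `error_rate` are assumptions made for this example and are not taken from TypeTE itself.

```python
import math


def genotype_log_likelihoods(n_absence, n_presence, error_rate=0.01):
    """Log10 likelihoods for 0, 1 and 2 copies of the insertion allele.

    Hypothetical model (not TypeTE's own): each read independently
    supports the presence allele with a probability set by the genotype
    and a fixed mis-assignment rate. The binomial coefficient is omitted
    because it is identical for all three genotypes.
    """
    log_liks = []
    for copies in (0, 1, 2):
        frac = copies / 2  # expected fraction of reads from the insertion allele
        p_presence = frac * (1 - error_rate) + (1 - frac) * error_rate
        log_liks.append(
            n_presence * math.log10(p_presence)
            + n_absence * math.log10(1 - p_presence)
        )
    return log_liks


def call_genotype(n_absence, n_presence, error_rate=0.01):
    """Return the most likely insertion copy number and all log10 likelihoods."""
    log_liks = genotype_log_likelihoods(n_absence, n_presence, error_rate)
    best = max(range(3), key=lambda g: log_liks[g])
    return best, log_liks


if __name__ == "__main__":
    # Example: 6 reads map best to the absence allele, 5 to the presence allele.
    genotype, liks = call_genotype(n_absence=6, n_presence=5)
    print(genotype, [round(x, 2) for x in liks])
```

Under this sketch, a balanced mix of reads favours the heterozygous call, while reads supporting only one allele favour the corresponding homozygote. A real pipeline would typically also weight reads by mapping quality, which this example leaves out.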

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/b648/7102983/42094d0efd08/gkaa074fig1.jpg
