AluMine: alignment-free method for the discovery of polymorphic Alu element insertions.

Authors

Puurand Tarmo, Kukuškina Viktoria, Pajuste Fanny-Dhelia, Remm Maido

Affiliation

Institute of Molecular and Cell Biology, University of Tartu, Tartu, Estonia.

Publication details

Mob DNA. 2019 Jul 18;10:31. doi: 10.1186/s13100-019-0174-3. eCollection 2019.

Abstract

BACKGROUND

Recently, alignment-free sequence analysis methods have gained popularity in the field of personal genomics. These methods are based on counting the frequencies of short k-mer sequences, which allows faster and more robust analysis than traditional alignment-based methods.
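To make the k-mer counting idea concrete, the following is a minimal Python sketch. It is not the AluMine implementation: the function name `count_kmers`, the handling of ambiguous bases, and the choice to ignore reverse complements are all assumptions made for illustration.

```python
from collections import Counter


def count_kmers(reads, k=32):
    """Count every k-length substring (k-mer) across a collection of reads.

    Hypothetical helper for illustration only. Production k-mer counters
    usually also merge each k-mer with its reverse complement, which is
    omitted here to keep the sketch short.
    """
    counts = Counter()
    for read in reads:
        seq = read.upper()
        # Slide a window of width k over the read, one base at a time.
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if "N" not in kmer:  # skip windows containing ambiguous bases
                counts[kmer] += 1
    return counts
```

Because the counts come straight from the raw reads, no read has to be placed on a reference genome first, which is where the speed advantage over alignment comes from.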

RESULTS

We have created a fast alignment-free method, AluMine, to analyze polymorphic insertions of Alu elements in the human genome. We tested the method on 2,241 individuals from the Estonian Genome Project and identified 28,962 potential polymorphic Alu element insertions. Each tested individual had on average 1,574 Alu element insertions that were different from those in the reference genome. In addition, we propose an alignment-free genotyping method that uses the frequency of insertion/deletion-specific 32-mer pairs to call the genotype directly from raw sequencing reads. Using this method, the concordance between the predicted and experimentally observed genotypes was 98.7%. The running time of the discovery pipeline is approximately 2 h per individual. The genotyping of potential polymorphic insertions takes between 0.4 and 4 h per individual, depending on the hardware configuration.
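The abstract does not describe how a genotype is derived from the 32-mer pair frequencies, so the sketch below is only one plausible reading: compare the read count of the insertion-specific 32-mer with that of the reference-specific (no insertion) 32-mer and threshold the ratio. The function name `call_genotype`, the minimum-coverage cutoff, and the heterozygote band are hypothetical values, not parameters taken from AluMine.

```python
def call_genotype(ins_count, ref_count, min_total=4, het_band=(0.2, 0.8)):
    """Call a diploid genotype from the counts of one 32-mer pair.

    ins_count: occurrences of the insertion-specific 32-mer in the reads.
    ref_count: occurrences of the reference-specific 32-mer in the reads.
    min_total and het_band are illustrative assumptions, not AluMine values.
    """
    total = ins_count + ref_count
    if total < min_total:
        return "NC"  # no call: too few reads cover this site
    frac = ins_count / total  # fraction of reads supporting the insertion
    low, high = het_band
    if frac < low:
        return "0/0"  # homozygous for the allele without the insertion
    if frac > high:
        return "1/1"  # homozygous for the insertion
    return "0/1"      # heterozygous


# Example: 10 insertion-supporting and 12 reference-supporting reads.
print(call_genotype(10, 12))
```

A fixed ratio threshold is the simplest possible rule. A probabilistic model of the expected k-mer depth would be the natural refinement, but nothing in the abstract indicates which approach the authors used.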

CONCLUSIONS

AluMine provides tools that allow the discovery of novel Alu element insertions and/or the genotyping of known Alu element insertions from personal genomes within a few hours.

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/d7b3/6639938/21c9150411f9/13100_2019_174_Fig1_HTML.jpg
