[A Pipeline for the Error-free Identification of Somatic Alu Insertions in High-throughput Sequencing Data].

Author information

Nugmanov G A, Komkov A Y, Saliutina M V, Minervina A A, Lebedev Y B, Mamedov I Z

Affiliation

Shemyakin-Ovchinnikov Institute of Bioorganic Chemistry, Russian Academy of Sciences, Moscow, 117997 Russia.

Publication information

Mol Biol (Mosk). 2019 Jan-Feb;53(1):154-165. doi: 10.1134/S0026898419010117.

Abstract

Retroelements are considered one of the important sources of genomic variability in modern humans. Transposition activity of retroelements in germline cells is known to generate new insertions at various genomic loci and sometimes results in genetic diseases. Retroelement activity in somatic cells is restricted by different cellular mechanisms; however, there is evidence for it in some tissue types. Somatic insertions can trigger tumorigenesis or participate in normal functioning, such as the generation of neuronal plasticity. In spite of the rapid development of high-throughput sequencing methods, confident detection of somatic insertions is still quite a challenging task. This is due, in part, to the absence of adequate bioinformatic tools for the analysis of sequencing data. Here, we propose an advanced computational pipeline for the identification of somatic insertions in datasets generated by selective amplification and high-throughput sequencing of genomic regions flanking insertions of AluYa5. Particular attention is paid to the identification of various artifacts arising in the course of library preparation and to the parameters for their filtration. Pipeline sensitivity is confirmed by in silico experiments with artificial datasets. Using the proposed pipeline, we remove at least 80% of artifacts and preserve 75% of potentially somatic insertions. The approaches used in this work can be applied to the study of insertion variability of other mobile elements.
