Population-based structural variation discovery with Hydra-Multi.

Authors

Lindberg Michael R, Hall Ira M, Quinlan Aaron R

Affiliations

Department of Biochemistry and Molecular Genetics, Center for Public Health Genomics, University of Virginia, Charlottesville, VA, USA; Department of Medicine, The Genome Institute, Washington University School of Medicine, St. Louis, MO, USA; and Department of Public Health Sciences, University of Virginia, Charlottesville, VA, USA.

Publication information

Bioinformatics. 2015 Apr 15;31(8):1286-9. doi: 10.1093/bioinformatics/btu771. Epub 2014 Dec 2.

Abstract

Current strategies for SNP and INDEL discovery incorporate sequence alignments from multiple individuals to maximize sensitivity and specificity. It is widely accepted that this approach also improves structural variant (SV) detection. However, multisample SV analysis has been stymied by the fundamental difficulties of SV calling, e.g. library insert size variability, SV alignment signal integration and detecting long-range genomic rearrangements involving disjoint loci. Extant tools suffer from poor scalability, which limits the number of genomes that can be co-analyzed and complicates analysis workflows. We have developed an approach that enables multisample SV analysis in hundreds to thousands of human genomes using commodity hardware. Here, we describe Hydra-Multi and measure its accuracy, speed and scalability using publicly available datasets provided by The 1000 Genomes Project and by The Cancer Genome Atlas (TCGA).

AVAILABILITY AND IMPLEMENTATION

Hydra-Multi is written in C++ and is freely available at https://github.com/arq5x/Hydra.

CONTACT

aaronquinlan@gmail.com or ihall@genome.wustl.edu

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.
