CHOP: haplotype-aware path indexing in population graphs.

Author affiliations

Delft Bioinformatics Lab, Delft University of Technology, Van Mourik Broekmanweg 6, Delft, 2628 XE, The Netherlands.

Department of Clinical Genetics, VU University Medical Center, Van der Boechorststraat 7, Amsterdam, 1081 BT, The Netherlands.

Publication information

Genome Biol. 2020 Mar 11;21(1):65. doi: 10.1186/s13059-020-01963-y.

Abstract

The practical use of graph-based reference genomes depends on the ability to align reads to them. Performing substring queries on paths through these graphs lies at the core of this task. The combination of increasing pattern length and encoded variation inevitably leads to a combinatorial explosion of the search space. Instead of relying on heuristic filtering or pruning steps to reduce the complexity, we propose CHOP, a method that constrains the search space by exploiting haplotype information: the search space is bounded by the number of haplotypes, which prevents a combinatorial explosion. We show that CHOP can be applied to large and complex datasets by applying it to a graph-based representation of the human genome encoding all 80 million variants reported by the 1000 Genomes Project.
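The bound described in the abstract can be illustrated with a toy example. The sketch below is not CHOP's implementation; it only contrasts enumerating substrings over every path through a small variation graph with enumerating them over observed haplotypes only. The site list, the spacer character, the haplotype tuples, and all function names are hypothetical and chosen purely for illustration.

```python
from itertools import product

# Hypothetical toy graph: a chain of biallelic variant sites,
# separated by a shared one-character spacer.
SITES = [("A", "C"), ("G", "T"), ("A", "T"), ("C", "G")]
SPACER = "N"

# Hypothetical haplotypes: one allele index (0 or 1) per site.
HAPLOTYPES = [
    (0, 0, 0, 0),
    (1, 1, 0, 0),
    (1, 1, 1, 1),
]


def spell(choices):
    """Spell the sequence of a path that picks one allele per site."""
    return SPACER.join(site[c] for site, c in zip(SITES, choices))


def kmers(seq, k):
    """Return the set of all length-k substrings of seq."""
    return {seq[i:i + k] for i in range(len(seq) - k + 1)}


def all_path_kmers(k):
    """k-mers over every path through the graph (2**len(SITES) paths)."""
    out = set()
    for choices in product((0, 1), repeat=len(SITES)):
        out |= kmers(spell(choices), k)
    return out


def haplotype_kmers(k):
    """k-mers over observed haplotype paths only (len(HAPLOTYPES) paths)."""
    out = set()
    for hap in HAPLOTYPES:
        out |= kmers(spell(hap), k)
    return out


if __name__ == "__main__":
    k = 7  # spans all four sites in this toy graph
    print(len(all_path_kmers(k)), len(haplotype_kmers(k)))
```

In this toy setting, a window that spans all four sites has one candidate string per path when every allele combination is allowed, but at most one per haplotype when only observed haplotypes are followed. Adding sites doubles the former each time, while the latter stays capped by the number of haplotypes.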

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/fb18/7066762/8dc46f637e77/13059_2020_1963_Fig1_HTML.jpg
