Department of Human Genetics, Leiden University Medical Center, 2333 ZC Leiden, The Netherlands.
Leiden Institute of Advanced Computer Science, Leiden University, 2333 CA Leiden, The Netherlands.
Bioinformatics. 2023 Jan 1;39(1). doi: 10.1093/bioinformatics/btad001.
Beyond identifying genetic variants, we introduce a set of Boolean relations that allows for a comprehensive classification of the relation between every pair of variants by taking all minimal alignments into account. We present an efficient algorithm to compute these relations, including a novel way of computing all minimal alignments within the best theoretical complexity bounds.
We show that these relations are common, and that many are non-trivial, for variants of the CFTR gene in dbSNP. Finally, we present an approach for storing and indexing variants in a database that enables efficient querying for all these relations.
A Python implementation is available at https://github.com/mutalyzer/algebra/tree/v0.2.0 as well as an interface at https://mutalyzer.nl/algebra.