Wong Yan, Ignatieva Anastasia, Koskela Jere, Gorjanc Gregor, Wohns Anthony W, Kelleher Jerome
Big Data Institute, Li Ka Shing Centre for Health Information and Discovery, University of Oxford, UK.
School of Mathematics and Statistics, University of Glasgow, UK.
bioRxiv. 2024 Apr 23:2023.11.03.565466. doi: 10.1101/2023.11.03.565466.
As a result of recombination, adjacent nucleotides can have different paths of genetic inheritance and therefore the genealogical trees for a sample of DNA sequences vary along the genome. The structure capturing the details of these intricately interwoven paths of inheritance is referred to as an ancestral recombination graph (ARG). Classical formalisms have focused on mapping coalescence and recombination events to the nodes in an ARG. This approach is out of step with modern developments, which do not represent genetic inheritance in terms of these events or explicitly infer them. We present a simple formalism that defines an ARG in terms of specific genomes and their intervals of genetic inheritance, and show how it generalises these classical treatments and encompasses the outputs of recent methods. We discuss nuances arising from this more general structure, and argue that it forms an appropriate basis for a software standard in this rapidly growing field.
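The idea of defining an ARG through genomes and their intervals of inheritance can be illustrated with a minimal sketch. This is not the authors' implementation. The `Edge` record, the `local_parents` helper, and the toy node numbering are hypothetical names chosen for illustration. Nodes stand for genomes, and each edge records the half-open interval `[left, right)` over which a child genome inherits from a parent genome.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Edge:
    """Hypothetical record: child inherits from parent over [left, right)."""
    left: float
    right: float
    parent: int
    child: int


def local_parents(edges, position):
    """Return a child -> parent map for the local tree at one genomic position."""
    return {e.child: e.parent for e in edges if e.left <= position < e.right}


# Toy example (assumed data): genomes 0, 1, 2 are samples, 3 and 4 are ancestors,
# and the sequence spans [0, 10) with the genealogy changing at position 5.
edges = [
    Edge(0, 5, 3, 0),    # sample 0 inherits from 3 on the left half only
    Edge(0, 10, 3, 1),   # sample 1 inherits from 3 along the whole sequence
    Edge(5, 10, 3, 2),   # sample 2 inherits from 3 on the right half only
    Edge(0, 5, 4, 2),    # sample 2 inherits from 4 on the left half
    Edge(5, 10, 4, 0),   # sample 0 inherits from 4 on the right half
    Edge(0, 10, 4, 3),   # ancestor 3 inherits from 4 along the whole sequence
]

left_tree = local_parents(edges, 2.0)
right_tree = local_parents(edges, 7.0)
```

In this sketch no node is labelled as a coalescence or recombination event. The change in genealogy at position 5 is visible only because samples 0 and 2 have different parents on either side of it.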