Johnston Iain G, Diaz-Uriarte Ramon
Department of Mathematics, University of Bergen, Realfagbygget, Bergen 5007, Norway.
Computational Biology Unit, University of Bergen, Thormøhlensgate 55, Bergen 5008, Norway.
Bioinformatics. 2024 Dec 26;41(1). doi: 10.1093/bioinformatics/btae737.
Accumulation models, where a system progressively acquires binary features over time, are common in the study of cancer progression, evolutionary biology, and other fields. Many approaches have been developed to infer the accumulation pathways by which features (e.g. mutations) are acquired over time. However, most of these approaches do not support reversibility: the loss of a feature once it has been acquired (e.g. the clearing of a mutation from a tumor or population).
Here, we demonstrate how the well-established Mk model from evolutionary biology, embedded on a hypercubic transition graph, can be used to infer the dynamics of accumulation processes, including the possibility of reversible transitions, from data which may be uncertain and cross-sectional, longitudinal, or phylogenetically/phylogenomically embedded. Positive and negative interactions between arbitrary sets of features (not limited to pairwise interactions) are supported. We demonstrate this approach with synthetic datasets and real data on bacterial drug resistance and cancer progression. While this implementation is limited in the number of features that can be considered, we discuss how this limitation may be relaxed to deal with larger systems.
The code implementing this setup in R is freely available at https://github.com/StochasticBiology/hypermk.
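The central idea above, a continuous-time Markov (Mk) model whose states are the vertices of a hypercube, can be illustrated with a minimal sketch. This is not the hypermk R code or its API: the function names (`hypercube_edges`, `rate_matrix`, `state_probabilities`), the integer bit encoding of states, and the uniform rates in the usage lines are all assumptions made for illustration. States are integers whose bits mark which features are present, edges connect states that differ in exactly one feature, and the probability of each state at time t is obtained from the rate matrix by uniformization (a standard way to evaluate the matrix exponential, which the original implementation may not use).

```python
import math

def hypercube_edges(n_features, reversible=False):
    """Return (source, target) pairs between states differing in one feature.
    A state is an integer; bit b set means feature b is present (assumed encoding)."""
    edges = []
    for state in range(2 ** n_features):
        for bit in range(n_features):
            if not state & (1 << bit):
                target = state | (1 << bit)
                edges.append((state, target))      # gain of a feature
                if reversible:
                    edges.append((target, state))  # loss of that feature
    return edges

def rate_matrix(n_features, rates):
    """Build the generator matrix Q from a dict {(source, target): rate}."""
    n = 2 ** n_features
    Q = [[0.0] * n for _ in range(n)]
    for (i, j), r in rates.items():
        Q[i][j] = r
        Q[i][i] -= r  # each row of a generator sums to zero
    return Q

def state_probabilities(Q, start, t, tol=1e-12, max_terms=10_000):
    """Distribution over states at time t, starting in `start`, by uniformization."""
    n = len(Q)
    lam = max(-Q[i][i] for i in range(n))  # largest exit rate
    p = [0.0] * n
    p[start] = 1.0
    if lam == 0.0 or t == 0.0:
        return p
    # Embedded discrete-time chain P = I + Q / lam
    P = [[(1.0 if i == j else 0.0) + Q[i][j] / lam for j in range(n)]
         for i in range(n)]
    weight = math.exp(-lam * t)  # Poisson(lam * t) probability of 0 jumps
    result = [weight * x for x in p]
    total = weight
    for k in range(1, max_terms):
        if 1.0 - total <= tol:
            break
        p = [sum(p[i] * P[i][j] for i in range(n)) for j in range(n)]
        weight *= lam * t / k
        result = [r + weight * x for r, x in zip(result, p)]
        total += weight
    return result

# Usage: three features, reversible transitions, every rate set to 1.0 (illustrative only)
edges = hypercube_edges(3, reversible=True)
Q = rate_matrix(3, {edge: 1.0 for edge in edges})
probs = state_probabilities(Q, start=0, t=0.5)
```

In an inference setting the per-edge rates would be free parameters fitted to data rather than fixed. Leaving each edge with its own rate is what lets a gain or loss depend on the full set of features already present, not only on pairs. The number of states grows as 2^L for L features, which matches the limitation on feature number noted in the abstract. For large values of rate × time the initial Poisson weight underflows, so a production implementation would need a more robust matrix exponential.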