Gaussian processes for time series with lead-lag effects with applications to biology data.

Authors

Mu Wancen, Chen Jiawen, Davis Eric S, Reed Kathleen, Phanstiel Douglas, Love Michael I, Li Didong

Affiliations

Department of Biostatistics, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States.

Curriculum in Bioinformatics and Computational Biology, University of North Carolina at Chapel Hill, Chapel Hill, NC 27599, United States.

Publication information

Biometrics. 2025 Jan 7;81(1). doi: 10.1093/biomtc/ujae156.

Abstract

Investigating the relationship, particularly the lead-lag effect, between time series is a common question across various disciplines, especially when uncovering biological processes. However, analyzing time series presents several challenges. Firstly, due to technical reasons, the time points at which observations are made are not at uniform intervals. Secondly, some lead-lag effects are transient, necessitating time-lag estimation based on a limited number of time points. Thirdly, external factors also impact these time series, requiring a similarity metric to assess the lead-lag relationship. To counter these issues, we introduce a model grounded in the Gaussian process, affording the flexibility to estimate lead-lag effects for irregular time series. In addition, our method outputs dissimilarity scores, thereby broadening its applications to include tasks such as ranking or clustering multiple pairwise time series when considering their strength of lead-lag effects with external factors. Crucially, we offer a series of theoretical proofs to substantiate the validity of our proposed kernels and the identifiability of kernel parameters. Our model demonstrates advances in various simulations and real-world applications, particularly in the study of dynamic chromatin interactions, compared to other leading methods.
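To make the idea of lag estimation on irregularly sampled series concrete, here is a minimal sketch. It is not the kernel proposed in the paper: it assumes, for illustration only, that the second series is a time-shifted copy of the first plus noise, places one zero-mean Gaussian process with an RBF kernel over both series, and picks the lag that maximizes the joint log marginal likelihood over a grid. The function names (`rbf_kernel`, `log_marginal_likelihood`, `estimate_lag`) and the fixed hyperparameters are hypothetical choices, not taken from the source.

```python
import numpy as np


def rbf_kernel(x, y, length_scale=1.0, variance=1.0):
    """Squared-exponential kernel between two 1-D arrays of time points."""
    d = x[:, None] - y[None, :]
    return variance * np.exp(-0.5 * (d / length_scale) ** 2)


def log_marginal_likelihood(t, y, length_scale, variance, noise):
    """Log marginal likelihood of a zero-mean GP with Gaussian noise variance `noise`."""
    K = rbf_kernel(t, t, length_scale, variance) + noise * np.eye(len(t))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (
        -0.5 * y @ alpha
        - np.log(np.diag(L)).sum()
        - 0.5 * len(t) * np.log(2.0 * np.pi)
    )


def estimate_lag(t1, y1, t2, y2, lags, length_scale=1.0, variance=1.0, noise=0.01):
    """Grid search for the lag s under the assumed model y2(t) = f(t - s), y1(t) = f(t).

    The time points t1 and t2 may be irregular and need not coincide.
    Returns the best lag on the grid and the score for every candidate lag.
    """
    lags = np.asarray(lags, dtype=float)
    y = np.concatenate([y1, y2])
    scores = np.empty(len(lags))
    for i, s in enumerate(lags):
        # Shift the second series back by s so both series share one latent function.
        t = np.concatenate([t1, t2 - s])
        scores[i] = log_marginal_likelihood(t, y, length_scale, variance, noise)
    return lags[np.argmax(scores)], scores
```

Because the shift only moves the input locations, the observation times need not lie on a regular grid. In practice the kernel hyperparameters would be estimated rather than fixed, and this sketch produces no dissimilarity score of the kind the abstract describes.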

Similar articles

Variable selection for clustering with Gaussian mixture models.
Biometrics. 2009 Sep;65(3):701-9. doi: 10.1111/j.1541-0420.2008.01160.x. Epub 2009 Feb 4.
