Change point detection for high dimensional data via kernel measure with application to human aging brain data.

Authors

Wang Jinjuan, Li Na, Meng Zhen, Li Qizhai

Affiliations

School of Mathematics and Statistics, Beijing Institute of Technology, Beijing, China.

School of Applied Science, Beijing Information Science and Technology University, Beijing, China.

Publication information

Stat Med. 2023 Nov 10;42(25):4644-4663. doi: 10.1002/sim.9881. Epub 2023 Aug 30.

Abstract

Identifying the existence and locations of change points is a common task in many areas of applied statistics. Existing change point detection methods may perform poorly on high-dimensional data because they rely on distributional assumptions that are hard to verify in practice. Moreover, some methods require certain parameters (such as the number of change points) to be estimated beforehand, which makes their power sensitive to these values. Here, we propose a kernel-based U-statistic to identify change points (KUCP) for high-dimensional data, which is free of distributional assumptions and hyperparameter estimation. Specifically, we employ a kernel function to describe similarities among the subjects and construct a U-statistic to test for the existence of a change point at a given location. The asymptotic properties of the U-statistic are derived. We also develop a procedure to locate the change points sequentially via a dichotomy algorithm. Extensive simulations demonstrate that KUCP has higher sensitivity in identifying the existence of change points and higher accuracy in locating them than its counterparts. We further illustrate its practical utility by analyzing a gene expression dataset of the human brain to detect the time point at which gene expression profiles begin to change, which has been reported to be closely related to brain aging.
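
The general idea in the abstract (a kernel measuring similarity between subjects, a U-statistic tested at each candidate location, and a dichotomy search applied to the resulting segments) can be illustrated with a rough sketch. This is not the KUCP statistic from the paper, whose exact form and asymptotic null distribution are not given here. As stand-ins, the sketch assumes a Gaussian kernel with a median-heuristic scale, the standard unbiased MMD² U-statistic weighted by t(n - t)/n, and a permutation test in place of the asymptotic theory. All function names and parameters (`min_size`, `n_perm`, `alpha`) are hypothetical.

```python
import numpy as np

def gaussian_kernel_matrix(X):
    """Gaussian kernel matrix; scale set by the median heuristic (an assumption)."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    scale = np.median(d2[np.triu_indices_from(d2, k=1)])
    if scale <= 0:
        scale = 1.0
    return np.exp(-d2 / scale)

def split_statistic(K, t):
    """Weighted unbiased MMD^2 U-statistic between rows [0, t) and [t, n)."""
    n = K.shape[0]
    m = n - t
    Kxx, Kyy, Kxy = K[:t, :t], K[t:, t:], K[:t, t:]
    within_x = (Kxx.sum() - np.trace(Kxx)) / (t * (t - 1))
    within_y = (Kyy.sum() - np.trace(Kyy)) / (m * (m - 1))
    return (t * m / n) * (within_x + within_y - 2.0 * Kxy.mean())

def scan(K, min_size):
    """Return the candidate location with the largest statistic, and that value."""
    n = K.shape[0]
    best_t, best_stat = None, -np.inf
    for t in range(min_size, n - min_size + 1):
        s = split_statistic(K, t)
        if s > best_stat:
            best_t, best_stat = t, s
    return best_t, best_stat

def detect_change_points(X, min_size=5, n_perm=199, alpha=0.05, seed=0):
    """Dichotomy (binary segmentation) search with a permutation test per segment."""
    if min_size < 2:
        raise ValueError("min_size must be at least 2")
    rng = np.random.default_rng(seed)
    found = []

    def recurse(lo, hi):
        n = hi - lo
        if n < 2 * min_size:
            return
        K = gaussian_kernel_matrix(X[lo:hi])
        t, observed = scan(K, min_size)
        # Null distribution of the scan maximum under random reordering.
        exceed = 0
        for _ in range(n_perm):
            p = rng.permutation(n)
            if scan(K[np.ix_(p, p)], min_size)[1] >= observed:
                exceed += 1
        p_value = (1 + exceed) / (n_perm + 1)
        if p_value <= alpha:
            found.append(lo + t)
            recurse(lo, lo + t)   # search left segment
            recurse(lo + t, hi)   # search right segment

    recurse(0, len(X))
    return sorted(found)
```

For ordered observations stored as rows of an array `X` (for example, subjects sorted by age), `detect_change_points(X)` returns the estimated indices at which a new segment starts. Because each segment is tested separately at level `alpha`, spurious detections in sub-segments are possible; the sketch makes no multiplicity correction.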
