Post-selection inference for changepoint detection algorithms with application to copy number variation data.

Affiliations

Department of Data Sciences and Operations, University of Southern California, Los Angeles, California, USA.

Department of Statistics, University of Pennsylvania, Philadelphia, Pennsylvania, USA.

Publication information

Biometrics. 2021 Sep;77(3):1037-1049. doi: 10.1111/biom.13422. Epub 2021 Jan 27.

DOI:10.1111/biom.13422
PMID:33434289
Abstract

Changepoint detection methods are used in many areas of science and engineering, for example, in the analysis of copy number variation data to detect abnormalities in copy numbers along the genome. Despite the broad array of available tools, methodology for quantifying our uncertainty in the strength (or the presence) of given changepoints post-selection is lacking. Post-selection inference offers a framework to fill this gap, but the most straightforward application of these methods results in low-powered hypothesis tests and leaves open several important questions about practical usability. In this work, we carefully tailor post-selection inference methods toward changepoint detection, focusing on copy number variation data. To accomplish this, we study commonly used changepoint algorithms: binary segmentation, as well as two of its most popular variants, wild and circular, and the fused lasso. We implement some of the latest developments in post-selection inference theory, mainly auxiliary randomization. This improves the power, but it requires implementations of Markov chain Monte Carlo algorithms (importance sampling and hit-and-run sampling) to carry out our tests. We also provide recommendations for improving practical usability, detailed simulations, and example analyses on array comparative genomic hybridization as well as sequencing data.
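
For readers unfamiliar with binary segmentation, the first algorithm named in the abstract, the following is a minimal Python sketch of the standard greedy procedure for mean changes, using a CUSUM statistic. It is an illustration only and is not taken from the paper: the function names `cusum_split` and `binary_segmentation` are hypothetical, the number of changepoints is assumed to be fixed in advance, and the sketch covers only the selection step, not the post-selection inference, randomization, or sampling the abstract describes.

```python
import numpy as np


def cusum_split(y, s, e):
    """Best single split of y[s:e] (e exclusive) by the CUSUM statistic.

    Returns (split, stat), where split is the first index of the right-hand
    piece, or (None, 0.0) if the segment is too short to split.
    """
    seg = y[s:e]
    n = len(seg)
    if n < 2:
        return None, 0.0
    csum = np.cumsum(seg)
    k = np.arange(1, n)                       # candidate left-piece sizes
    left_mean = csum[:-1] / k
    right_mean = (csum[-1] - csum[:-1]) / (n - k)
    # Scaled absolute difference in means on either side of each candidate.
    stat = np.sqrt(k * (n - k) / n) * np.abs(right_mean - left_mean)
    j = int(np.argmax(stat))
    return s + j + 1, float(stat[j])


def binary_segmentation(y, num_changepoints):
    """Greedy binary segmentation: repeatedly split the segment whose best
    split has the largest CUSUM statistic (illustrative sketch only)."""
    y = np.asarray(y, dtype=float)
    segments = [(0, len(y))]
    found = []
    for _ in range(num_changepoints):
        best = None
        for s, e in segments:
            split, stat = cusum_split(y, s, e)
            if split is not None and (best is None or stat > best[1]):
                best = (split, stat, (s, e))
        if best is None:                      # nothing left to split
            break
        split, _, (s, e) = best
        segments.remove((s, e))
        segments += [(s, split), (split, e)]
        found.append(split)
    return sorted(found)


# Noise-free toy signal with a raised block, loosely mimicking a copy number gain.
y = np.concatenate([np.zeros(50), 3.0 * np.ones(50), np.zeros(50)])
changepoints = binary_segmentation(y, 2)
```

On real data the detected locations are themselves random, which is why naive tests at those locations are invalid and post-selection methods such as those in this paper are needed.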

Similar articles

1
Post-selection inference for changepoint detection algorithms with application to copy number variation data.
Biometrics. 2021 Sep;77(3):1037-1049. doi: 10.1111/biom.13422. Epub 2021 Jan 27.
2
Testing for a Change in Mean After Changepoint Detection.
J R Stat Soc Series B Stat Methodol. 2022 Sep;84(4):1082-1104. doi: 10.1111/rssb.12501. Epub 2022 Apr 12.
3
A survey of analysis software for array-comparative genomic hybridisation studies to detect copy number variation.
Hum Genomics. 2010 Aug;4(6):421-7. doi: 10.1186/1479-7364-4-6-421.
4
Detection of copy number variation from array intensity and sequencing read depth using a stepwise Bayesian model.
BMC Bioinformatics. 2010 Oct 31;11:539. doi: 10.1186/1471-2105-11-539.
5
Detecting copy number variations from array CGH data based on a conditional random field model.
J Bioinform Comput Biol. 2010 Apr;8(2):295-314. doi: 10.1142/s021972001000480x.
6
Fast MCMC sampling for hidden Markov Models to determine copy number variations.
BMC Bioinformatics. 2011 Nov 2;12:428. doi: 10.1186/1471-2105-12-428.
7
Sequential model selection-based segmentation to detect DNA copy number variation.
Biometrics. 2016 Sep;72(3):815-26. doi: 10.1111/biom.12478. Epub 2016 Mar 8.
8
HaplotypeCN: copy number haplotype inference with Hidden Markov Model and localized haplotype clustering.
PLoS One. 2014 May 21;9(5):e96841. doi: 10.1371/journal.pone.0096841. eCollection 2014.
9
The use of ultra-dense array CGH analysis for the discovery of micro-copy number alterations and gene fusions in the cancer genome.
BMC Med Genomics. 2011 Jan 27;4:16. doi: 10.1186/1755-8794-4-16.
10
Optimization of Signal Decomposition Matched Filtering (SDMF) for Improved Detection of Copy-Number Variations.
IEEE/ACM Trans Comput Biol Bioinform. 2016 May-Jun;13(3):584-91. doi: 10.1109/TCBB.2015.2448077.

Cited by

1
Inferring independent sets of Gaussian variables after thresholding correlations.
J Am Stat Assoc. 2025;120(549):370-381. doi: 10.1080/01621459.2024.2337158. Epub 2024 May 20.
2
Generalized data thinning using sufficient statistics.
J Am Stat Assoc. 2025;120(549):511-523. doi: 10.1080/01621459.2024.2353948. Epub 2024 Jun 13.
3
Testing for a difference in means of a single feature after clustering.
Biostatistics. 2024 Dec 31;26(1). doi: 10.1093/biostatistics/kxae046.
4
Tree-Values: Selective Inference for Regression Trees.
J Mach Learn Res. 2022;23.
5
Selective inference for k-means clustering.
J Mach Learn Res. 2023 May;24.
6
More Powerful Selective Inference for the Graph Fused Lasso.
J Comput Graph Stat. 2023;32(2):577-587. doi: 10.1080/10618600.2022.2097246. Epub 2022 Sep 6.
7
Testing for a difference in means of a single feature after clustering.
ArXiv. 2023 Nov 27:arXiv:2311.16375v1.
8
Testing for a Change in Mean After Changepoint Detection.
J R Stat Soc Series B Stat Methodol. 2022 Sep;84(4):1082-1104. doi: 10.1111/rssb.12501. Epub 2022 Apr 12.
9
Quantifying uncertainty in spikes estimated from calcium imaging data.
Biostatistics. 2023 Apr 14;24(2):481-501. doi: 10.1093/biostatistics/kxab034.