Efficient change-points detection for genomic sequences via cumulative segmented regression.

Affiliations

School of Statistics and Mathematics; Interdisciplinary Research Institute of Data Science, Shanghai Lixin University of Accounting and Finance, Shanghai 201209, China.

Statistics and Mathematics School, Yunnan University of Finance and Economics, Kunming 650221, China.

Publication information

Bioinformatics. 2022 Jan 3;38(2):311-317. doi: 10.1093/bioinformatics/btab685.

DOI: 10.1093/bioinformatics/btab685
PMID: 34601562
Abstract

MOTIVATION

Knowing the number and the exact locations of multiple change points in genomic sequences serves several biological needs. The cumulative-segmented algorithm (cumSeg) has recently been proposed as a computationally efficient approach to multiple change-point detection. It is based on a simple transformation of the data and gives results that are quite robust to model mis-specification. However, the transformation also accumulates the errors, which introduces heteroscedasticity and serial correlation into the transformed model. As a result, the variability of the estimated change points differs markedly from one change point to another, even though the locations of all change points should be equally important in the original genomic sequence.
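
The effect described above can be illustrated with a small simulation. The sketch below is not the cumSeg implementation or the authors' code; it only assumes that the "simple transformation" is a running (cumulative) sum, and all function names and parameter values are hypothetical choices for illustration. It shows that a step in the mean becomes a change of slope after the transformation, and that independent errors with constant variance become errors whose variance grows with position and which are correlated across positions.

```python
import random
import statistics


def cumulative_sum(values):
    """Return the running totals z_i = y_1 + ... + y_i."""
    totals, running = [], 0.0
    for v in values:
        running += v
        totals.append(running)
    return totals


def pearson(xs, ys):
    """Plain Pearson correlation between two equally long samples."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs)
    sy = sum((y - my) ** 2 for y in ys)
    return cov / (sx * sy) ** 0.5


# A piecewise-constant mean (jump after position 5) becomes piecewise-linear:
# the slope of the cumulative sum changes from 1 to 3 at the change point.
signal = [1.0] * 5 + [3.0] * 5
cum_signal = cumulative_sum(signal)  # [1, 2, 3, 4, 5, 8, 11, 14, 17, 20]

# Independent N(0, 1) errors, accumulated along each simulated sequence.
rng = random.Random(0)  # fixed seed so the sketch is reproducible
paths = [
    cumulative_sum([rng.gauss(0.0, 1.0) for _ in range(100)])
    for _ in range(2000)
]

# Heteroscedasticity: the variance of the accumulated error grows with position.
var_early = statistics.pvariance([p[9] for p in paths])
var_late = statistics.pvariance([p[99] for p in paths])

# Serial correlation: accumulated errors at different positions share terms.
corr_mid_late = pearson([p[49] for p in paths], [p[99] for p in paths])

print(cum_signal)
print(var_early, var_late, corr_mid_late)
```

Under these assumptions the accumulated error at position i has variance proportional to i, so change points late in the sequence are estimated against noisier data than early ones. This is the imbalance the abstract refers to.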

RESULTS

In this study, we develop two new change-point detection procedures within the framework of cumulative segmented regression. Simulations show that the proposed methods not only improve the efficiency of each change-point estimator substantially but also yield estimators with similar variability across all change points. We illustrate their performance in detecting copy number variations by applying them to Coriell and SNP genotyping data.

AVAILABILITY AND IMPLEMENTATION

The proposed algorithms are implemented in R, and the code is provided in the online supplementary material.

SUPPLEMENTARY INFORMATION

Supplementary data are available at Bioinformatics online.

Similar articles

1
Efficient change-points detection for genomic sequences via cumulative segmented regression.
Bioinformatics. 2022 Jan 3;38(2):311-317. doi: 10.1093/bioinformatics/btab685.
2
Modified screening and ranking algorithm for copy number variation detection.
Bioinformatics. 2015 May 1;31(9):1341-8. doi: 10.1093/bioinformatics/btu850. Epub 2014 Dec 25.
3
Efficient change point detection for genomic sequences of continuous measurements.
Bioinformatics. 2011 Jan 15;27(2):161-6. doi: 10.1093/bioinformatics/btq647. Epub 2010 Nov 18.
4
Hybrid algorithms for multiple change-point detection in biological sequences.
Adv Exp Med Biol. 2015;823:41-61. doi: 10.1007/978-3-319-10984-8_3.
5
Integrating genomic correlation structure improves copy number variations detection.
Bioinformatics. 2021 Apr 20;37(3):312-317. doi: 10.1093/bioinformatics/btaa737.
6
Multiple Break-Points Detection in Array CGH Data via the Cross-Entropy Method.
IEEE/ACM Trans Comput Biol Bioinform. 2015 Mar-Apr;12(2):487-98. doi: 10.1109/TCBB.2014.2361639.
7
VEGA: variational segmentation for copy number detection.
Bioinformatics. 2010 Dec 15;26(24):3020-7. doi: 10.1093/bioinformatics/btq586. Epub 2010 Oct 19.
8
A new statistic for efficient detection of repetitive sequences.
Bioinformatics. 2019 Nov 1;35(22):4596-4606. doi: 10.1093/bioinformatics/btz262.
9
Integrative DNA copy number detection and genotyping from sequencing and array-based platforms.
Bioinformatics. 2018 Jul 15;34(14):2349-2355. doi: 10.1093/bioinformatics/bty104.
10
Joint detection of copy number variations in parent-offspring trios.
Bioinformatics. 2016 Apr 15;32(8):1130-7. doi: 10.1093/bioinformatics/btv707. Epub 2015 Dec 7.

Cited by

1
Quantifying the intra- and inter-species community interactions in microbiomes by dynamic covariance mapping.
Nat Commun. 2025 Jul 9;16(1):6314. doi: 10.1038/s41467-025-61368-y.