Learning smoothing models of copy number profiles using breakpoint annotations.

Affiliation

INRIA Sierra Project-Team, Paris F-75013, France.

Publication information

BMC Bioinformatics. 2013 May 22;14:164. doi: 10.1186/1471-2105-14-164.

DOI:10.1186/1471-2105-14-164
PMID:23697330
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3712326/
Abstract

BACKGROUND

Many models have been proposed to detect copy number alterations in chromosomal copy number profiles, but it is usually not obvious to decide which is most effective for a given data set. Furthermore, most methods have a smoothing parameter that determines the number of breakpoints and must be chosen using various heuristics.

RESULTS

We present three contributions for copy number profile smoothing model selection. First, we propose to select the model and degree of smoothness that maximizes agreement with visual breakpoint region annotations. Second, we develop cross-validation procedures to estimate the error of the trained models. Third, we apply these methods to compare 17 smoothing models on a new database of 575 annotated neuroblastoma copy number profiles, which we make available as a public benchmark for testing new algorithms.

CONCLUSIONS

Whereas previous studies have been qualitative or limited to simulated data, our annotation-guided approach is quantitative and suggests which algorithms are fastest and most accurate in practice on real data. In the neuroblastoma data, the equivalent pelt.n and cghseg.k methods were the best breakpoint detectors, and exhibited reasonable computation times.
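
The first contribution described above, choosing the degree of smoothness that best agrees with annotated breakpoint regions, can be illustrated with a minimal sketch. It is not the authors' implementation. The region labels ("breakpoint" meaning at least one change is expected, "normal" meaning none), the function names `annotation_error` and `select_smoothing`, the toy data, and the tie-breaking rule are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical annotation format: (start, end, label),
# where label is "breakpoint" (>= 1 change expected) or "normal" (no change expected).
Region = Tuple[float, float, str]


def annotation_error(breakpoints: List[float], regions: List[Region]) -> int:
    """Count annotated regions that disagree with the predicted breakpoints."""
    errors = 0
    for start, end, label in regions:
        n_inside = sum(start < b < end for b in breakpoints)
        if label == "breakpoint" and n_inside == 0:
            errors += 1  # false negative: the annotated change was missed
        elif label == "normal" and n_inside > 0:
            errors += 1  # false positive: a change was predicted in a flat region
    return errors


def select_smoothing(candidates: Dict[float, List[float]], regions: List[Region]) -> float:
    """Return the smoothing parameter whose breakpoints give the fewest errors.

    Ties go to the largest parameter (assumed here to mean the smoothest model).
    """
    return min(
        candidates,
        key=lambda lam: (annotation_error(candidates[lam], regions), -lam),
    )


# Toy example: predicted breakpoint positions for three smoothing levels.
regions = [(10.0, 20.0, "breakpoint"), (30.0, 60.0, "normal")]
candidates = {
    0.1: [15.0, 35.0, 50.0],  # under-smoothed: spurious changes in the normal region
    1.0: [15.0],              # agrees with both annotations
    10.0: [],                 # over-smoothed: misses the annotated change
}
best = select_smoothing(candidates, regions)
print(best)
```

In this toy setting the middle value is selected because it is the only candidate consistent with both regions. Estimating how well such a choice generalizes would additionally require holding out annotations, as in the cross-validation procedures the abstract mentions.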

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7baa/3712326/7252b81f2b64/1471-2105-14-164-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7baa/3712326/ecc0ae4485ca/1471-2105-14-164-2.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7baa/3712326/7c55589b0bd7/1471-2105-14-164-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7baa/3712326/e2a49c85a984/1471-2105-14-164-4.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7baa/3712326/bc4bc090b9f9/1471-2105-14-164-5.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7baa/3712326/e65f5bd5bbb3/1471-2105-14-164-6.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7baa/3712326/f9bfd1f66d89/1471-2105-14-164-7.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7baa/3712326/5d8157bfbb0b/1471-2105-14-164-8.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7baa/3712326/8f71c3d0b51e/1471-2105-14-164-9.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/7baa/3712326/225a56f475f9/1471-2105-14-164-10.jpg
