Reconstructing DNA copy number by joint segmentation of multiple sequences.

Author affiliation

Department of Statistics, University of California, Los Angeles, CA, USA.

Publication information

BMC Bioinformatics. 2012 Aug 16;13:205. doi: 10.1186/1471-2105-13-205.

DOI:10.1186/1471-2105-13-205
PMID:22897923
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3534631/
Abstract

BACKGROUND

Variations in DNA copy number carry information on the modalities of genome evolution and mis-regulation of DNA replication in cancer cells. Their study can help localize tumor suppressor genes, distinguish different populations of cancerous cells, and identify genomic variations responsible for disease phenotypes. A number of different high throughput technologies can be used to identify copy number variable sites, and the literature documents multiple effective algorithms. We focus here on the specific problem of detecting regions where variation in copy number is relatively common in the sample at hand. This problem encompasses the cases of copy number polymorphisms, related samples, technical replicates, and cancerous sub-populations from the same individual.

RESULTS

We present a segmentation method named generalized fused lasso (GFL) to reconstruct copy number variant regions. GFL is based on penalized estimation and is capable of processing multiple signals jointly. Our approach is computationally very attractive and leads to sensitivity and specificity levels comparable to those of state-of-the-art specialized methodologies. We illustrate its applicability with simulated and real data sets.

CONCLUSIONS

The flexibility of our framework makes it applicable to data obtained with a wide range of technology. Its versatility and speed make GFL particularly useful in the initial screening stages of large data sets.
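
The abstract describes GFL only as a penalized-estimation method that segments several signals jointly; it does not state the objective function. As a rough illustration of that idea, the sketch below evaluates one generic objective of this kind: a least-squares fit plus a sparsity penalty, a per-sequence jump penalty, and a group penalty on jumps taken across all sequences at the same position. The function name `joint_fused_objective`, the three weights, and the toy data are assumptions made for this example and are not taken from the paper.

```python
import math

def joint_fused_objective(y, beta, lam_sparse, lam_fuse, lam_group):
    """Evaluate a generic joint fused-lasso-style objective (illustrative only).

    y, beta: lists of equal-length lists, one inner list per sequence.
    lam_sparse, lam_fuse, lam_group: hypothetical penalty weights.
    """
    n_seq, n_pos = len(y), len(y[0])
    # Least-squares fit between the observed signals and the fitted levels.
    fit = 0.5 * sum((y[i][j] - beta[i][j]) ** 2
                    for i in range(n_seq) for j in range(n_pos))
    # Sparsity penalty: pulls fitted levels toward zero (the baseline).
    sparse = sum(abs(b) for row in beta for b in row)
    # Per-sequence jump penalty: encourages piecewise-constant fits.
    fuse = sum(abs(beta[i][j + 1] - beta[i][j])
               for i in range(n_seq) for j in range(n_pos - 1))
    # Group penalty: Euclidean norm, across sequences, of the jump at each
    # position. Jumps that coincide across sequences cost less than the
    # same jumps spread over different positions.
    group = sum(math.sqrt(sum((beta[i][j + 1] - beta[i][j]) ** 2
                              for i in range(n_seq)))
                for j in range(n_pos - 1))
    return fit + lam_sparse * sparse + lam_fuse * fuse + lam_group * group

# Toy data: two sequences with one unit jump each.
shared = [[0.0, 0.0, 1.0, 1.0], [0.0, 0.0, 1.0, 1.0]]      # same breakpoint
staggered = [[0.0, 0.0, 1.0, 1.0], [0.0, 1.0, 1.0, 1.0]]   # different breakpoints

# Isolate the group penalty by setting the other weights to zero.
cost_shared = joint_fused_objective(shared, shared, 0.0, 0.0, 1.0)
cost_staggered = joint_fused_objective(staggered, staggered, 0.0, 0.0, 1.0)
print(cost_shared, cost_staggered)
```

With only the group term active, the shared-breakpoint fit costs sqrt(2) while the staggered fit costs 2, which shows how a penalty of this form rewards breakpoints common to several sequences.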

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a783/3534631/5abc044cf4da/1471-2105-13-205-3.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a783/3534631/f637678e2fb8/1471-2105-13-205-1.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/a783/3534631/06d78910718b/1471-2105-13-205-2.jpg
