A multi-sample based method for identifying common CNVs in normal human genomic structure using high-resolution aCGH data.

Affiliation

Department of Computer Science, Yonsei University, Seoul, South Korea.

Publication information

PLoS One. 2011;6(10):e26975. doi: 10.1371/journal.pone.0026975. Epub 2011 Oct 31.

DOI:10.1371/journal.pone.0026975
PMID:22073121
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3205051/
Abstract

BACKGROUND

Identifying copy number variations (CNVs) in normal human genomic data is difficult because of noise and because the relationship between genomic regions and signal intensity is non-linear. A high-resolution array comparative genomic hybridization (aCGH) platform containing 42 million probes, far more than previous arrays, was recently published. Most existing CNV detection algorithms perform poorly on such data, both because of the noise that accompanies so large an input and because most current methods were not designed to analyze normal human samples. Analysis of normal human genomes often requires a joint approach across multiple samples, yet the majority of existing methods can identify CNVs only from a single sample.

METHODOLOGY AND PRINCIPAL FINDINGS

We developed a multi-sample-based genomic variations detector (MGVD) that uses segmentation to identify common breakpoints across multiple samples and a k-means-based clustering strategy. Unlike previous methods, MGVD simultaneously considers multiple samples with different genomic intensities and identifies CNVs and CNV zones (CNVZs); CNVZ is a more precise measure of the location of a genomic variant than the CNV region (CNVR).
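The clustering step can be illustrated with a minimal sketch. The abstract states only that MGVD uses a k-means-based strategy; the code below is not MGVD's implementation. It assumes a hypothetical setting in which a segment shared by all samples has already been found, each sample contributes one mean log2 ratio for that segment, and three states (loss, normal, gain) are wanted. The function name `kmeans_1d`, the initial centroids, and the data values are all illustrative assumptions.

```python
from statistics import mean


def kmeans_1d(values, centroids, max_iter=100):
    """Plain Lloyd's k-means in one dimension.

    Returns (labels, centroids), where labels[i] is the index of the
    centroid nearest to values[i] after convergence.
    """
    centroids = list(centroids)
    labels = [0] * len(values)
    for _ in range(max_iter):
        # Assignment step: each value joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: abs(v - centroids[j]))
            for v in values
        ]
        # Update step: move each centroid to the mean of its members;
        # an empty cluster keeps its previous centroid.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [v for v, lab in zip(values, labels) if lab == j]
            new_centroids.append(mean(members) if members else old)
        if new_centroids == centroids:
            break  # converged: centroids stopped moving
        centroids = new_centroids
    return labels, centroids


# Hypothetical mean log2 ratios of one shared segment in eight samples.
segment_means = [-0.62, -0.55, 0.03, -0.02, 0.05, 0.48, 0.01, 0.52]

# Assumed state names and starting centroids (not taken from the paper).
STATES = ["loss", "normal", "gain"]
labels, centroids = kmeans_1d(segment_means, centroids=[-0.5, 0.0, 0.5])
calls = [STATES[lab] for lab in labels]
print(calls)
```

Clustering all samples together is what lets a weak but consistent shift in several samples stand out from noise, which a single-sample threshold would be likely to miss.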

CONCLUSIONS AND SIGNIFICANCE

We designed a specialized algorithm to detect common CNVs from extremely high-resolution multi-sample aCGH data. MGVD showed high sensitivity and a low false discovery rate on a simulated data set, and it outperformed most current methods on real, high-resolution HapMap data sets. It also had the shortest runtime of the algorithms evaluated on real high-resolution aCGH data. The CNVZs identified by MGVD can be used in association studies to reveal relationships between phenotypes and genomic aberrations. The algorithm is written in standard C++ using the STL and is available for Linux and MS Windows. It is freely available at: http://embio.yonsei.ac.kr/~Park/mgvd.php.
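For readers unfamiliar with the two metrics reported for the simulated data, the sketch below shows how sensitivity and false discovery rate are commonly computed. It assumes a probe-level comparison between simulated (true) CNV probes and called CNV probes; the abstract does not say whether the authors scored by probe or by region, and the function name and probe indices here are hypothetical.

```python
def sensitivity_and_fdr(true_cnv, called_cnv):
    """Return (sensitivity, false discovery rate) for two sets of probe indices.

    sensitivity = TP / (TP + FN)
    FDR         = FP / (TP + FP)
    """
    tp = len(true_cnv & called_cnv)   # CNV probes that were called
    fn = len(true_cnv - called_cnv)   # CNV probes that were missed
    fp = len(called_cnv - true_cnv)   # called probes that are not CNV
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    fdr = fp / (tp + fp) if (tp + fp) else 0.0
    return sensitivity, fdr


# Hypothetical example: a simulated CNV spans probes 100..199,
# and a detector calls probes 110..204.
true_cnv = set(range(100, 200))
called_cnv = set(range(110, 205))

sens, fdr = sensitivity_and_fdr(true_cnv, called_cnv)
print(f"sensitivity={sens:.3f}, FDR={fdr:.3f}")
```

In this made-up case 90 of the 100 true probes are recovered and 5 of the 95 called probes are false, so a detector can score well on one metric while doing poorly on the other; reporting both guards against that.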

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c87b/3205051/5eb5d1615517/pone.0026975.g001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c87b/3205051/d97d4ce3192e/pone.0026975.g002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c87b/3205051/41f573074644/pone.0026975.g003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c87b/3205051/df2354db18b1/pone.0026975.g004.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c87b/3205051/15f4d482a545/pone.0026975.g005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c87b/3205051/2987a3983efe/pone.0026975.g006.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c87b/3205051/6940d022d021/pone.0026975.g007.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c87b/3205051/ba9284185e70/pone.0026975.g008.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c87b/3205051/fc5941ec3263/pone.0026975.g009.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c87b/3205051/f8e1b217d603/pone.0026975.g010.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c87b/3205051/4ffc94a49866/pone.0026975.g011.jpg
