Using family data as a verification standard to evaluate copy number variation calling strategies for genetic association studies.

Author information

Department of Biostatistics, University of Pittsburgh, Pittsburgh, PA 15261, USA.

Publication information

Genet Epidemiol. 2012 Apr;36(3):253-62. doi: 10.1002/gepi.21618.

DOI:10.1002/gepi.21618
PMID:22714937
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC3696390/
Abstract

A major concern for all copy number variation (CNV) detection algorithms is their reliability and repeatability. However, it is difficult to evaluate the reliability of CNV-calling strategies due to the lack of gold-standard data that would tell us which CNVs are real. We propose that if CNVs are called in duplicate samples, or inherited from parent to child, then these can be considered validated CNVs. We used two large family-based genome-wide association study (GWAS) datasets from the GENEVA consortium to look at concordance rates of CNV calls between duplicate samples, parent-child pairs, and unrelated pairs. Our goal was to make recommendations for ways to filter and use CNV calls in GWAS datasets that do not include family data. We used PennCNV as our primary CNV-calling algorithm, and tested CNV calls using different datasets and marker sets, and with various filters on CNVs and samples. Using the Illumina core HumanHap550 single nucleotide polymorphism (SNP) set, we saw duplicate concordance rates of approximately 55% and parent-child transmission rates of approximately 28% in our datasets. GC model adjustment and sample quality filtering had little effect on these reliability measures. Stratification on CNV size and DNA sample type did have some effect. Overall, our results show that it is probably not possible to find a CNV-calling strategy (including filtering and algorithm) that will give us a set of "reliable" CNV calls using current chip technologies. But if we understand the error process, we can still use CNV calls appropriately in genetic association studies.
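
The concordance measure described in the abstract can be illustrated with a minimal sketch. The abstract does not say how two CNV calls are judged to be "the same", so the matching rule here (same chromosome, same direction of change, and at least 50% reciprocal overlap) is an assumption, as are the names `CNV`, `is_match`, and `concordance_rate`. The same function could be applied to a duplicate pair or to a parent-child pair.

```python
from typing import NamedTuple, Sequence


class CNV(NamedTuple):
    """One CNV call (hypothetical record layout, autosomes assumed)."""
    chrom: str
    start: int        # first base of the call, inclusive
    end: int          # last base of the call, inclusive
    copy_number: int  # <2 means deletion, >2 means duplication


def overlap_fraction(a: CNV, b: CNV) -> float:
    """Fraction of a's length that is covered by b."""
    if a.chrom != b.chrom:
        return 0.0
    shared = min(a.end, b.end) - max(a.start, b.start) + 1
    return max(shared, 0) / (a.end - a.start + 1)


def is_match(a: CNV, b: CNV, min_overlap: float = 0.5) -> bool:
    """Assumed rule: same direction of change and reciprocal overlap."""
    same_type = (a.copy_number < 2) == (b.copy_number < 2)
    return (
        same_type
        and overlap_fraction(a, b) >= min_overlap
        and overlap_fraction(b, a) >= min_overlap
    )


def concordance_rate(calls_a: Sequence[CNV], calls_b: Sequence[CNV],
                     min_overlap: float = 0.5) -> float:
    """Share of calls in sample A that have a matching call in sample B."""
    if not calls_a:
        return float("nan")
    matched = sum(
        any(is_match(a, b, min_overlap) for b in calls_b) for a in calls_a
    )
    return matched / len(calls_a)


# Toy data, not taken from the study.
sample_1 = [CNV("1", 100, 199, 1), CNV("2", 500, 999, 3), CNV("3", 10, 19, 1)]
sample_2 = [CNV("1", 120, 219, 1), CNV("2", 500, 999, 1)]
print(concordance_rate(sample_1, sample_2))
```

The measure is directional: in the toy data, one of the three calls in `sample_1` has a match in `sample_2`, while one of the two calls in `sample_2` has a match in `sample_1`. Any real analysis would need to state which sample serves as the denominator and which overlap threshold is used.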
