Genome-wide association studies: quality control and population-based measures.

Affiliation

Institut für Medizinische Biometrie und Statistik, Universität zu Lübeck, Germany.

Publication information

Genet Epidemiol. 2009;33 Suppl 1(Suppl 1):S45-50. doi: 10.1002/gepi.20472.

Abstract

Genome-wide association studies, using hundreds of thousands of single-nucleotide polymorphism (SNP) markers, have become a standard approach for identifying disease susceptibility genes. The change in technology poses substantial computational and statistical challenges, which were addressed by the quality control, imputation, and population-based measures groups of Genetic Analysis Workshop 16. The computational challenges pertain to efficient memory management and the computational speed of the statistical procedures, and we discuss an approach for efficient SNP storage. Accuracy and computational speed are relevant for genotype calling, and the results from a comparison of three calling algorithms are discussed. The first statistical challenge is related to statistical quality control, and we discuss two novel quality control procedures. These low-level analyses affect subsequent preparatory steps for high-level analyses, e.g., the quality of genotype imputation approaches. After a genome-wide association study has been conducted with successful replication and/or validation, measures of diagnostic accuracy, including the area under the curve, are investigated. The area under the curve can be constructed from summary data in some situations. Finally, we discuss how the population-attributable risk of a genetic variant that is only measured in a reference data set can be determined.
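
The abstract mentions efficient SNP storage without describing the scheme. As an illustration only, the sketch below shows one common way to save memory: packing each genotype into 2 bits, so that four genotypes fit in one byte. This is an assumption for illustration and not necessarily the approach discussed at the workshop; the genotype coding (0, 1, 2 copies of an allele, 3 for a missing call) and the function names `pack_genotypes` and `unpack_genotypes` are hypothetical.

```python
from typing import List

# Hypothetical 2-bit coding: 0, 1, 2 = allele copies, 3 = missing call.
MISSING = 3


def pack_genotypes(genotypes: List[int]) -> bytes:
    """Pack genotype codes (0-3) four per byte, lowest bits first."""
    packed = bytearray((len(genotypes) + 3) // 4)
    for i, g in enumerate(genotypes):
        if not 0 <= g <= 3:
            raise ValueError(f"genotype code out of range: {g}")
        # Each genotype occupies 2 bits at offset 0, 2, 4, or 6 in its byte.
        packed[i // 4] |= g << (2 * (i % 4))
    return bytes(packed)


def unpack_genotypes(packed: bytes, n: int) -> List[int]:
    """Recover the first n genotype codes from a packed byte string."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(n)]


calls = [0, 1, 2, MISSING, 2, 0]
blob = pack_genotypes(calls)
restored = unpack_genotypes(blob, len(calls))
```

With this coding, six genotypes occupy two bytes instead of six (or more with wider integer types), which is the kind of saving that matters when hundreds of thousands of SNPs are held in memory for thousands of individuals.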

References cited in this article

2
ACPA: automated cluster plot analysis of genotype data.
BMC Proc. 2009 Dec 15;3 Suppl 7(Suppl 7):S58. doi: 10.1186/1753-6561-3-s7-s58.
6
Memory management in genome-wide association studies.
BMC Proc. 2009 Dec 15;3 Suppl 7(Suppl 7):S54. doi: 10.1186/1753-6561-3-s7-s54.
9
Genome-wide association studies for discrete traits.
Genet Epidemiol. 2009;33 Suppl 1(Suppl 1):S8-12. doi: 10.1002/gepi.20465.
