Auer Paul L, Reiner Alex P, Wang Gao, Kang Hyun Min, Abecasis Goncalo R, Altshuler David, Bamshad Michael J, Nickerson Deborah A, Tracy Russell P, Rich Stephen S, Leal Suzanne M
Zilber School of Public Health, University of Wisconsin-Milwaukee, Milwaukee, WI 53205, USA; Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA.
Division of Public Health Sciences, Fred Hutchinson Cancer Research Center, Seattle, WA 98109, USA; Department of Epidemiology, School of Public Health, University of Washington, Seattle, WA 98195, USA.
Am J Hum Genet. 2016 Oct 6;99(4):791-801. doi: 10.1016/j.ajhg.2016.08.012. Epub 2016 Sep 22.
Massively parallel whole-genome sequencing (WGS) data have ushered in a new era in human genetics. These data are now being used to understand the role of rare variants in complex traits and to advance the goals of precision medicine. The technological and computing advances that have enabled us to generate WGS data on thousands of individuals have also outpaced our ability to perform analyses in scientifically and statistically rigorous and thoughtful ways. The past several years have witnessed the application of whole-exome sequencing (WES) to complex traits and diseases. From our analysis of NHLBI Exome Sequencing Project (ESP) data, not only have a number of important disease and complex trait association findings emerged, but our collective experience offers some valuable lessons for WGS initiatives. These include caveats associated with generating automated pipelines for quality control and analysis of rare variants; the importance of studying minority populations; sample size requirements and efficient study designs for identifying rare-variant associations; and the significance of incidental findings in population-based genetic research. With the ESP as an example, we offer guidance and a framework on how to conduct a large-scale association study in the era of WGS.