Saw Swee Hock School of Public Health, National University of Singapore, Singapore 117597, Singapore.
Am J Hum Genet. 2013 Jan 10;92(1):52-66. doi: 10.1016/j.ajhg.2012.12.005. Epub 2013 Jan 3.
Whole-genome sequencing across multiple samples in a population provides an unprecedented opportunity for comprehensively characterizing the polymorphic variants in that population. Although the 1000 Genomes Project (1KGP) has offered brief insights into the value of population-level sequencing, its low coverage has compromised the ability to confidently detect rare and low-frequency variants. In addition, the composition of populations in the 1KGP is not complete, even though the study design has been extended to more than 2,500 samples from more than 20 population groups. The Malays are one of the Austronesian groups predominantly present in Southeast Asia and Oceania, and the Singapore Sequencing Malay Project (SSMP) aims to perform deep whole-genome sequencing of 100 healthy Malays. By sequencing at a minimum of 30× coverage, we have illustrated the higher sensitivity for detecting low-frequency and rare variants and the ability to investigate the presence of hotspots of functional mutations. Compared to the low-pass sequencing in the 1KGP, the deeper coverage allows more functional variants to be identified for each person. A comparison of the fidelity of genotype imputation in Malays indicated that, for common SNPs, a population-specific reference panel such as the SSMP outperforms a cosmopolitan panel with a larger number of individuals. For lower-frequency (<5%) markers, a larger number of individuals might have to be whole-genome sequenced so that the accuracy currently afforded by the 1KGP can be achieved. The SSMP data are expected to be the benchmark for evaluating the value of deep population-level sequencing versus low-pass sequencing, especially in populations that are poorly represented in population-genetics studies.
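As a rough illustration of why sequencing depth matters for detecting variants carried by a single individual, the sketch below computes the chance that a heterozygous site shows at least a few reads with the alternate allele. It is not the authors' variant-calling method. The function name `prob_detect_het`, the threshold of three supporting reads, the binomial read-sampling model with a fixed depth and no sequencing error, and the low-pass depth of 4 are all assumptions made for illustration; only the 30× figure comes from the abstract.

```python
from math import comb


def prob_detect_het(depth: int, min_alt_reads: int = 3, alt_fraction: float = 0.5) -> float:
    """Probability that at least `min_alt_reads` of `depth` reads carry the
    alternate allele at a heterozygous site.

    Assumed model (hypothetical, for illustration only): each read samples the
    alternate allele independently with probability `alt_fraction`, so the
    alternate-read count is Binomial(depth, alt_fraction).
    """
    # Probability of seeing fewer than `min_alt_reads` alternate reads.
    p_too_few = sum(
        comb(depth, k) * alt_fraction**k * (1 - alt_fraction) ** (depth - k)
        for k in range(min_alt_reads)
    )
    return 1.0 - p_too_few


if __name__ == "__main__":
    # Depth 4 is an assumed low-pass value; depth 30 is the SSMP minimum.
    for depth in (4, 30):
        print(f"depth {depth:>2}: P(detect) = {prob_detect_het(depth):.4f}")
```

Under these simplified assumptions, a heterozygous site is supported by three or more alternate reads about 31% of the time at depth 4, and almost always at depth 30. Real callers pool evidence across samples and model base errors, so the absolute numbers here should be read only as a qualitative guide.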