Department of Dermatology, University of Utah School of Medicine, Salt Lake City, Utah, United States of America.
PLoS One. 2011;6(8):e23221. doi: 10.1371/journal.pone.0023221. Epub 2011 Aug 5.
Massively parallel sequencing (MPS) now allows entire exomes and genomes to be sequenced at reasonable cost, and its utility for identifying genes responsible for rare Mendelian disorders has been demonstrated. For a complex disease, however, study designs need to accommodate substantial locus, allelic, and phenotypic heterogeneity, as well as complex relationships between genotype and phenotype. Design considerations include careful selection of samples for sequencing and a well-developed strategy for identifying the few "true" disease susceptibility genes among the many irrelevant genes that will be found to harbor rare variants. To examine these issues, we performed simulation-based analyses comparing several strategies for MPS in complex disease. Factors examined include genetic architecture, sample size, the number and relationship of individuals selected for sequencing, and a variety of filters based on variant type, multiple observations of genes, and concordance of genetic variants within pedigrees. A two-stage design was assumed in which genes from the MPS analysis of high-risk families are evaluated in a secondary screening phase of a larger set of probands with more modest family histories. Designs were evaluated using a cost function that assumes the cost of sequencing a whole exome is 400 times that of sequencing a single candidate gene. Results indicate that while requiring variants to be identified in multiple pedigrees and/or in multiple individuals in the same pedigree are effective strategies for reducing false positives, there is a danger of over-filtering so that most true susceptibility genes are missed. In most cases, sequencing more than two individuals per pedigree reduces power without any benefit in terms of reduced overall cost. Further, our results suggest that although no single strategy is optimal, simulations can provide important guidelines for study design.
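The two-stage cost comparison described in the abstract can be illustrated with a minimal sketch. Only the 400:1 ratio between whole-exome and single-gene sequencing cost comes from the abstract. The additive form of the cost, the function name `two_stage_cost`, its parameters, and the example numbers are assumptions made for illustration and may differ from the cost function the authors actually used.

```python
# Cost of one whole exome, in units of "one candidate gene sequenced in one person".
# The 400:1 ratio is stated in the abstract; everything else here is assumed.
EXOME_TO_GENE_COST_RATIO = 400


def two_stage_cost(n_pedigrees, n_sequenced_per_pedigree,
                   n_candidate_genes, n_probands,
                   exome_cost=EXOME_TO_GENE_COST_RATIO):
    """Total cost of a hypothetical two-stage design, in single-gene units.

    Stage 1: whole-exome sequencing of selected members of high-risk pedigrees.
    Stage 2: sequencing every gene that passed the filters in each proband
             of the secondary screening set.
    """
    stage1 = n_pedigrees * n_sequenced_per_pedigree * exome_cost
    stage2 = n_candidate_genes * n_probands
    return stage1 + stage2


# Hypothetical design: 10 pedigrees, 2 exomes each, 50 genes passing the
# filters, 200 probands in the screening phase.
print(two_stage_cost(10, 2, 50, 200))  # 10*2*400 + 50*200 = 18000
```

Under this assumed form, stricter filters lower the stage 2 term by passing fewer candidate genes, while sequencing more individuals per pedigree raises the stage 1 term. That is the trade-off the simulations weigh against the power to retain true susceptibility genes.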