Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX 77030, USA.
Eur J Hum Genet. 2012 Apr;20(4):449-56. doi: 10.1038/ejhg.2011.211. Epub 2011 Dec 14.
For most complex trait association studies using next-generation sequencing, many clinically important secondary traits are available in addition to the primary phenotype of interest, and these can be analyzed to map susceptibility genes. Owing to high sequencing costs, most studies use selected samples, and the sampling mechanisms of these studies can be complicated. When the primary and secondary traits are correlated, analyses of secondary phenotypes can cause spurious associations in selected samples, and existing methods are inadequate to adjust for them. To address this problem, a likelihood-based method, MULTI-TRAIT-ASSOCIATION (MTA), was developed. MTA is flexible and can be applied to any study with a known sampling mechanism. It also allows efficient inference of genetic parameters. To investigate the power of MTA and of different study designs, extensive simulations were performed under rigorous population genetic and phenotypic models. It is demonstrated that there are great benefits to analyzing secondary phenotypes in selected samples. In particular, using case-control samples and samples with extreme primary phenotypes can be more powerful than analyzing random samples of equivalent size. One major challenge for sequence-based association studies is that most data sets are not large enough to be adequately powered. By applying MTA, data sets ascertained under distinct mechanisms or targeted at different primary traits can be jointly analyzed to map common phenotypes and greatly increase power. The combined analysis can be performed using freely available data sets from public repositories, for example, dbGaP. In conclusion, MTA will have an important role in dissecting the etiology of complex traits.
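The spurious association described above can be illustrated with a small simulation. The following is a minimal sketch, not the MTA method itself: it only shows how a naive regression of a secondary trait on genotype can mislead in a case-control sample. The liability-threshold model, all parameter values (allele frequency, effect sizes, threshold, sample sizes), and the function names are illustrative assumptions and are not taken from the paper.

```python
import random


def simulate_population(n, maf=0.3, beta_g=0.5, beta_y2=1.0,
                        threshold=2.2, seed=1):
    """Return a list of (genotype, secondary_trait, is_case) tuples.

    Assumed model: the genotype has NO effect on the secondary trait,
    but both the genotype and the secondary trait raise the liability
    to the primary (disease) phenotype.
    """
    rng = random.Random(seed)
    people = []
    for _ in range(n):
        # Allele count 0, 1 or 2 (booleans add up to an integer).
        g = (rng.random() < maf) + (rng.random() < maf)
        y2 = rng.gauss(0.0, 1.0)
        liability = beta_g * g + beta_y2 * y2 + rng.gauss(0.0, 1.0)
        people.append((g, y2, liability > threshold))
    return people


def slope(pairs):
    """Ordinary least-squares slope of y on x for a list of (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    return sxy / sxx


population = simulate_population(400_000)
cases = [p for p in population if p[2]]
controls = [p for p in population if not p[2]]

n_per_group = 10_000
rng = random.Random(2)

# Design 1: a random sample from the population.
random_sample = rng.sample(population, 2 * n_per_group)

# Design 2: equal numbers of cases and controls (a selected sample).
case_control = (rng.sample(cases, n_per_group)
                + rng.sample(controls, n_per_group))

# Naive analysis: regress the secondary trait on genotype,
# ignoring how the sample was ascertained.
slope_random = slope([(g, y2) for g, y2, _ in random_sample])
slope_case_control = slope([(g, y2) for g, y2, _ in case_control])

print("naive slope, random sample:      ", round(slope_random, 3))
print("naive slope, case-control sample:", round(slope_case_control, 3))
```

Under these assumptions the naive slope in the random sample should stay close to zero, while the pooled case-control sample tends to show a clearly nonzero slope even though the genotype has no true effect on the secondary trait. A likelihood that models the known sampling mechanism, as the abstract describes for MTA, is one way to avoid this artifact.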