Division of Research Strategy, University College London, London, United Kingdom.
PLoS One. 2010 Jul 29;5(7):e11876. doi: 10.1371/journal.pone.0011876.
Genome-wide association studies (GWAS) remain the subject of considerable debate. A key issue is their ability to identify low-penetrance variants across the human genome. Among the phenomena that reduce the power of these analyses, the phenocopy level (PE) seriously hampers the investigation of complex diseases, as is well known for neurological disorders and cancer, and it is likely of primary importance in human ageing. PE seems to be the norm rather than the exception, especially when the roles of epigenetic and environmental factors in shaping the phenotype are considered. Despite some attempts, no recognized solution has been proposed, particularly for estimating the effects of phenocopies on study planning or analysis design. We present a simulation in which we attempt to define more precisely how phenocopy affects different analytical methods under different scenarios. Our approach highlights the critical role of phenocopy: the higher the PE level, the more the initial difficulty in detecting gene-gene interactions is amplified. In particular, our results show that, if the study is sufficiently powered, strong main effects remain detectable as the amount of phenocopy in the study sample increases, although the significance of the association is progressively reduced. In contrast, when purely epistatic effects are simulated, the ability to identify the association depends on several parameters, such as the strength of the interaction between the polymorphic variants, the penetrance of the polymorphism, which alleles (minor or major) produce the combined effect, and their frequency in the population. We conclude that neglecting the possible presence of phenocopies in complex traits heavily affects the analysis of their genetic data.
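The attenuation of a main-effect signal by phenocopies can be illustrated with a small calculation. The sketch below is not the simulation used in the study; it is a minimal illustration under stated assumptions: a single biallelic locus in Hardy-Weinberg equilibrium, a multiplicative per-allele relative risk, a disease rare enough that controls carry the population allele frequency, and phenocopies modelled as cases whose risk-allele frequency equals that of the population. All function and parameter names (`case_allele_freq`, `rr`, `pe`, and so on) are hypothetical.

```python
def case_allele_freq(q, rr):
    """Risk-allele frequency among genetically caused cases.

    Assumes a multiplicative model with per-allele relative risk `rr`
    and population risk-allele frequency `q`.
    """
    return rr * q / (rr * q + (1.0 - q))


def observed_case_freq(q, rr, pe):
    """Risk-allele frequency in a case sample containing phenocopies.

    `pe` is the fraction of cases that are phenocopies; they are assumed
    to carry the population allele frequency `q`.
    """
    return (1.0 - pe) * case_allele_freq(q, rr) + pe * q


def expected_allelic_chi2(q, rr, pe, n_cases, n_controls):
    """Chi-square of the 2x2 allelic table built from expected counts."""
    p_case = observed_case_freq(q, rr, pe)
    a = 2 * n_cases * p_case             # risk alleles in cases
    b = 2 * n_cases * (1.0 - p_case)     # other alleles in cases
    c = 2 * n_controls * q               # risk alleles in controls
    d = 2 * n_controls * (1.0 - q)       # other alleles in controls
    total = a + b + c + d
    return total * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))


if __name__ == "__main__":
    # Illustrative values only; they are not taken from the study.
    for pe in (0.0, 0.25, 0.5, 0.75, 1.0):
        chi2 = expected_allelic_chi2(q=0.3, rr=1.5, pe=pe,
                                     n_cases=1000, n_controls=1000)
        print(f"phenocopy fraction {pe:.2f}: expected chi-square {chi2:.2f}")
```

Under these assumptions, the allele-frequency difference between cases and controls shrinks by the factor (1 - pe), so the expected statistic falls steadily as the phenocopy fraction grows and vanishes when every case is a phenocopy. A large sample can keep a strong main effect above the significance threshold, which is consistent with the behaviour described above. Purely epistatic effects would require a two-locus model and are not covered by this sketch.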