Schork Nicholas J, Greenwood Tiffany A, Braff David L
Department of Psychiatry, University of California, San Diego, 9500 Gilman Drive, La Jolla, CA 92093-0603, USA.
Schizophr Bull. 2007 Jan;33(1):95-104. doi: 10.1093/schbul/sbl045. Epub 2006 Oct 11.
Statistical genetics is a research field that focuses on mathematical models and statistical inference methods that relate genetic variations (ie, naturally occurring human DNA sequence variations, or "polymorphisms") to particular traits or diseases (phenotypes), usually using data collected on large samples of families or individuals. The ultimate goal of such analyses is to identify genes and genetic variations that influence disease susceptibility. Although this goal is of great interest and importance, the fact that many genes and environmental factors contribute to neuropsychiatric diseases of public health importance (eg, schizophrenia, bipolar disorder, and depression) complicates relevant studies and suggests that sophisticated mathematical and statistical modeling may be required. In addition, large-scale human DNA sequencing and related projects, such as the Human Genome Project and the International HapMap Project, together with the development of high-throughput DNA sequencing and genotyping technologies, have provided statistical geneticists with a wealth of relevant information and resources. Unfortunately, using and interpreting these resources is not straightforward for complex, multifactorial diseases such as schizophrenia. In this brief and largely nonmathematical review of statistical genetics, we describe many of the main concepts, definitions, and issues that motivate contemporary research. We also discuss the most pressing problems that demand further research if progress is to be made in identifying genes and genetic variations that predispose to complex neuropsychiatric diseases.