School of Mathematics and Statistics, College of Science and Engineering, University of Glasgow, Glasgow G12 8QQ, UK.
J R Soc Interface. 2013 Sep 18;10(88):20130605. doi: 10.1098/rsif.2013.0605. Print 2013 Nov 6.
More than 20 human genetic diseases are associated with inheriting an unstable expanded DNA simple sequence tandem repeat, for example, CTG (cytosine-thymine-guanine) repeats in myotonic dystrophy type 1 (DM1) and CAG (cytosine-adenine-guanine) repeats in Huntington disease (HD). These sequences mutate by changing the number of repeats, not just between generations but also during the lifetime of affected individuals. Levels of somatic instability contribute to disease onset and progression, but because the changes are tissue-specific and depend on age and repeat length, interpreting the level of somatic instability in an individual is confounded by these factors. Mathematical models, fitted to CTG repeat length distributions derived from blood DNA from a large cohort of DM1-affected or at-risk individuals, have recently been used to quantify inherited repeat lengths and mutation rates. After accounting for age, the estimated mutation rates are lower than predicted among individuals with small alleles (inherited repeat lengths of fewer than 100 CTGs), suggesting that these rates may be suppressed at the lower end of the disease-causing range. In this study, we proposed that a length-specific effect operates within this range and tested this hypothesis using a model comparison approach. To calibrate the extended model, we used data derived from blood DNA from DM1 individuals and, for the first time, buccal DNA from HD individuals. In a novel application of this extended model, we identified individuals whose effective repeat length, with regard to somatic instability, is less than their actual repeat length. A plausible explanation for this distinction is that the expanded repeat tract is compromised by interruptions or other unusual features. We quantified effective length for a large cohort of DM1 individuals and showed that effective length predicts age of onset better than inherited repeat length does, thus improving the genotype-phenotype correlation.
Under the extended model, we removed some of the bias in mutation rates, making them less length-dependent. Consequently, rates adjusted in this way will be better suited as quantitative traits for investigating cis- or trans-acting modifiers of somatic mosaicism, disease onset and progression.