Kids Neuroscience Centre, Kids Research, Children's Hospital at Westmead, Sydney, NSW 2145, Australia.
Discipline of Child and Adolescent Health, Faculty of Health and Medicine, University of Sydney, Sydney, NSW 2006, Australia.
Nat Commun. 2022 Mar 29;13(1):1655. doi: 10.1038/s41467-022-29271-y.
Predicting which cryptic-donors may be activated by a splicing variant in patient DNA is notoriously difficult. Through analysis of 5145 cryptic-donors (versus 86,963 decoy-donors that were not used; any GT or GC), we define an empirical method that predicts cryptic-donor activation with 87% sensitivity and 95% specificity. Strength (according to four algorithms) and proximity to the annotated-donor appear to be important determinants of cryptic-donor activation. However, other factors that are difficult to identify, such as splicing regulatory elements, play an important role and are likely responsible for current prediction inaccuracies. We find that the most frequently recurring natural mis-splicing events at each exon-intron junction, summarised over 40,233 RNA-sequencing samples (40K-RNA), accurately predict which cryptic-donor will be activated in rare disease. 40K-RNA provides an accurate, evidence-based method to predict variant-activated cryptic-donors in genetic disorders, helping pathologists consider the possible consequences of a variant for the encoded protein and informing RNA diagnostic testing strategies.
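The idea described in the abstract, listing every GT or GC near an annotated donor as a candidate and then prioritising candidates by how often natural mis-splicing at that position recurs across RNA-seq samples, can be sketched as below. This is a minimal illustration under stated assumptions, not the authors' pipeline: the function names, the 250-nt search window, and the `sample_counts` mapping (position to number of samples showing the event) are all hypothetical.

```python
from typing import Dict, List, Tuple


def candidate_donors(seq: str, annotated_pos: int, window: int = 250) -> List[int]:
    """Return 0-based positions of every GT or GC dinucleotide within
    `window` nt of the annotated donor, excluding the annotated donor itself.
    The window size is an illustrative assumption."""
    seq = seq.upper()
    start = max(0, annotated_pos - window)
    # Stop one short of the end so that seq[i:i + 2] is always a full dinucleotide.
    end = min(len(seq) - 1, annotated_pos + window + 1)
    return [
        i
        for i in range(start, end)
        if seq[i:i + 2] in ("GT", "GC") and i != annotated_pos
    ]


def rank_by_recurrence(
    candidates: List[int], sample_counts: Dict[int, int]
) -> List[Tuple[int, int]]:
    """Order candidates by the number of samples in which mis-splicing at that
    position was observed (most frequent first). Candidates never observed are
    dropped; ties are broken by position."""
    seen = [(pos, sample_counts[pos]) for pos in candidates if sample_counts.get(pos, 0) > 0]
    return sorted(seen, key=lambda pc: (-pc[1], pc[0]))


if __name__ == "__main__":
    # Toy sequence with the annotated donor GT at position 2.
    seq = "AAGTAAGCAAGTCC"
    cands = candidate_donors(seq, annotated_pos=2)
    # Hypothetical per-position sample counts, invented for the example.
    counts = {6: 3, 10: 120}
    print(rank_by_recurrence(cands, counts))
```

In this toy input the candidate with the most recurrent mis-splicing is ranked first; in practice the counts would come from a resource such as 40K-RNA rather than a hand-written dictionary.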