Xiong Hui Y, Alipanahi Babak, Lee Leo J, Bretschneider Hannes, Merico Daniele, Yuen Ryan K C, Hua Yimin, Gueroussov Serge, Najafabadi Hamed S, Hughes Timothy R, Morris Quaid, Barash Yoseph, Krainer Adrian R, Jojic Nebojsa, Scherer Stephen W, Blencowe Benjamin J, Frey Brendan J
Department of Electrical and Computer Engineering, University of Toronto, Toronto, Ontario M5S 3G4, Canada. Donnelly Centre for Cellular and Biomolecular Research, University of Toronto, Toronto, Ontario M5S 3E1, Canada. Program on Genetic Networks and Program on Neural Computation & Adaptive Perception, Canadian Institute for Advanced Research, Toronto, Ontario M5G 1Z8, Canada.
Department of Electrical and Computer Engineering, University of Toronto, Toronto, Ontario M5S 3G4, Canada. Program on Genetic Networks and Program on Neural Computation & Adaptive Perception, Canadian Institute for Advanced Research, Toronto, Ontario M5G 1Z8, Canada. Department of Computer Science, University of Toronto, Toronto, Ontario M5S 3G4, Canada.
Science. 2015 Jan 9;347(6218):1254806. doi: 10.1126/science.1254806. Epub 2014 Dec 18.
To facilitate precision medicine and whole-genome annotation, we developed a machine-learning technique that scores how strongly genetic variants affect RNA splicing, whose alteration contributes to many diseases. Analysis of more than 650,000 intronic and exonic variants revealed widespread patterns of mutation-driven aberrant splicing. Intronic disease mutations that are more than 30 nucleotides from any splice site alter splicing nine times as often as common variants, and missense exonic disease mutations that have the least impact on protein function are five times as likely as others to alter splicing. We detected tens of thousands of disease-causing mutations, including those involved in cancers and spinal muscular atrophy. Examination of intronic and exonic variants found using whole-genome sequencing of individuals with autism revealed misspliced genes with neurodevelopmental phenotypes. Our approach provides evidence for causal variants and should enable new discoveries in precision medicine.