St. Giles Laboratory of Human Genetics of Infectious Diseases, Rockefeller Branch, The Rockefeller University, New York, NY 10065.
Laboratory of Human Genetics of Infectious Diseases, Necker Branch, INSERM UMR1163, 75015 Paris, France.
Proc Natl Acad Sci U S A. 2022 Nov;119(44):e2211194119. doi: 10.1073/pnas.2211194119. Epub 2022 Oct 28.
Pre-messenger RNA splicing is initiated with the recognition of a single-nucleotide intronic branchpoint (BP) within a BP motif by spliceosome elements. Forty-eight rare variants in 43 human genes have been reported to alter splicing and cause disease by disrupting the BP. However, until now, no computational approach was available to efficiently detect such variants in massively parallel sequencing data. We established a comprehensive human genome-wide BP database by integrating existing BP data and generating new BP data from RNA sequencing of lariat debranching enzyme DBR1-mutated patients and from machine-learning predictions. We characterized multiple features of BP in major and minor introns and found that the BP and BP-2 (two nucleotides upstream of the BP) positions exhibit a lower rate of variation in human populations and higher evolutionary conservation than the intronic background, at levels comparable to the exonic background. We developed BPHunter as a genome-wide computational approach to systematically and efficiently detect intronic variants that may disrupt BP recognition. BPHunter retrospectively identified 40 of the 48 known pathogenic BP variants, from which we derived a strategy for prioritizing BP variant candidates. The remaining eight variants all create AG-dinucleotides between the BP and the acceptor site, which is the likely cause of missplicing. We demonstrated the practical utility of BPHunter prospectively by using it to identify a novel germline heterozygous BP variant in a patient with critical COVID-19 pneumonia and a novel somatic intronic 59-nucleotide deletion in a lymphoma patient, both of which were validated experimentally. BPHunter is publicly available from https://hgidsoft.rockefeller.edu/BPHunter and https://github.com/casanova-lab/BPHunter.
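The observation that some variants create AG-dinucleotides between the BP and the acceptor site can be illustrated with a minimal sketch. This is not BPHunter's implementation: the function name `new_ag_positions`, its inputs, and the example sequence are hypothetical, and the sketch assumes a single-nucleotide substitution on a plus-strand intron tail that ends with the canonical acceptor AG.

```python
def new_ag_positions(intron_tail, bp_index, var_index, alt):
    """Return 0-based start positions of AG dinucleotides that a single-nucleotide
    substitution creates strictly between the BP and the canonical acceptor AG.

    intron_tail: 3' end of the intron, ending with the acceptor AG (assumption).
    bp_index:    0-based index of the branchpoint nucleotide in intron_tail.
    var_index:   0-based index of the substituted nucleotide.
    alt:         the alternative base.
    """
    ref = intron_tail.upper()
    if ref[-2:] != "AG":
        raise ValueError("intron_tail must end with the canonical acceptor AG")
    if len(alt) != 1 or not 0 <= var_index < len(ref):
        raise ValueError("expected one alternative base at a valid index")

    # Apply the substitution to obtain the variant sequence.
    mut = ref[:var_index] + alt.upper() + ref[var_index + 1:]

    # Scan dinucleotides that start after the BP and end before the acceptor AG,
    # keeping only those that are AG in the variant but not in the reference.
    return [
        i
        for i in range(bp_index + 1, len(ref) - 3)
        if mut[i:i + 2] == "AG" and ref[i:i + 2] != "AG"
    ]


# Hypothetical intron tail with the BP adenine at index 3 (CTAAC motif).
tail = "CTAACTCTGTTTCAG"
print(new_ag_positions(tail, bp_index=3, var_index=7, alt="A"))  # T>A next to a G
print(new_ag_positions(tail, bp_index=3, var_index=9, alt="C"))  # no new AG
```

A real analysis would also need to handle strand orientation, insertions and deletions, and genomic coordinates, none of which this sketch attempts.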