A curated census of pathogenic and likely pathogenic UTR variants and evaluation of deep learning models for variant effect prediction.

Author information

Bohn Emma, Lau Tammy T Y, Wagih Omar, Masud Tehmina, Merico Daniele

Affiliations

Deep Genomics Inc., Toronto, ON, Canada.

The Centre for Applied Genomics, Hospital for Sick Children, Toronto, ON, Canada.

Publication information

Front Mol Biosci. 2023 Sep 8;10:1257550. doi: 10.3389/fmolb.2023.1257550. eCollection 2023.

DOI:10.3389/fmolb.2023.1257550

PMID:37745687

Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC10517338/

Abstract

Variants in 5' and 3' untranslated regions (UTRs) contribute to rare disease. Predictive algorithms that assist in classifying pathogenicity could be highly valuable, but their utility is often unclear because it depends on carefully selected training and validation conditions. To address this, we developed a high-confidence set of pathogenic (P) and likely pathogenic (LP) variants and assessed deep learning (DL) models for predicting their molecular effects. 3' and 5' UTR variants documented as P or LP (P/LP) were obtained from ClinVar and refined by reviewing the annotated variant effect and reassessing evidence of pathogenicity following published guidelines. Prediction scores from sequence-based DL models were compared between three groups: P/LP variants acting through the mechanism for which the model was designed (model-matched), P/LP variants operating through other mechanisms (model-mismatched), and putative benign variants. PhyloP was used to compare conservation scores between P/LP and putative benign variants. A total of 295 3' UTR and 188 5' UTR variants were obtained from ClinVar, of which 26 3' UTR and 68 5' UTR variants were classified as P/LP. DL model predictions differed significantly when comparing model-matched P/LP variants to both putative benign variants and model-mismatched P/LP variants, and when comparing all P/LP variants to putative benign variants. PhyloP conservation scores were significantly higher among P/LP than among putative benign variants for both the 3' and 5' UTR. In conclusion, we present a high-confidence set of P/LP 3' and 5' UTR variants spanning a range of mechanisms and supported by detailed curation of pathogenicity and molecular mechanism evidence. Predictions from DL models further substantiate these classifications. These datasets will support further development and validation of DL algorithms designed to predict the functional impact of variants that may be implicated in rare disease.
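The group comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not name the statistical test used, so the rank-based Mann-Whitney U statistic below is an assumption chosen for illustration, not the authors' stated method. The function names and all score values are hypothetical; only the variant counts (26 of 295, 68 of 188) come from the abstract.

```python
from itertools import product


def retained_fraction(kept: int, total: int) -> float:
    """Fraction of ClinVar-listed variants still classified P/LP after curation."""
    return kept / total


def mann_whitney_u(group_a, group_b):
    """U statistic for group_a: count of (a, b) pairs with a > b; ties count 0.5."""
    u = 0.0
    for a, b in product(group_a, group_b):
        if a > b:
            u += 1.0
        elif a == b:
            u += 0.5
    return u


def common_language_effect(group_a, group_b):
    """U / (n_a * n_b): chance that a random group_a score exceeds a group_b score."""
    return mann_whitney_u(group_a, group_b) / (len(group_a) * len(group_b))


# Counts reported in the abstract.
print(f"3' UTR retained as P/LP: {retained_fraction(26, 295):.1%}")
print(f"5' UTR retained as P/LP: {retained_fraction(68, 188):.1%}")

# Hypothetical model scores, invented purely to show the three-group comparison.
matched = [0.91, 0.84, 0.77, 0.65]      # model-matched P/LP (assumed values)
mismatched = [0.30, 0.42, 0.12]         # model-mismatched P/LP (assumed values)
benign = [0.05, 0.10, 0.33, 0.21, 0.08]  # putative benign (assumed values)

print("matched vs benign:", common_language_effect(matched, benign))
print("matched vs mismatched:", common_language_effect(matched, mismatched))
print("all P/LP vs benign:", common_language_effect(matched + mismatched, benign))
```

An effect size near 1.0 means the first group almost always scores higher than the second, and a value near 0.5 means the two groups are indistinguishable by rank. A significance test on real scores would additionally need a p-value, for example from a statistics library.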
