From Text to Translation: Using Language Models to Prioritize Variants for Clinical Review.

Authors

Li Weijiang, Li Xiaomin, Lavallee Ethan, Saparov Alice, Zitnik Marinka, Cassa Christopher

Affiliations

Division of Genetics, Department of Medicine, Brigham and Women's Hospital, Harvard Medical School, Boston, 02115, MA, United States.

School of Engineering and Applied Sciences, Harvard University, Boston, 02138, MA, United States.

Publication information

medRxiv. 2024 Dec 31:2024.12.31.24319792. doi: 10.1101/2024.12.31.24319792.

DOI:10.1101/2024.12.31.24319792
PMID:39802773
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11722495/
Abstract

Despite rapid advances in genomic sequencing, most rare genetic variants remain insufficiently characterized for clinical use, limiting the potential of personalized medicine. When classifying whether a variant is pathogenic, clinical labs follow diagnostic guidelines that evaluate many forms of evidence, including case data, computational predictions, and functional screening. Although a substantial amount of clinical evidence has been gathered for these variants, the majority cannot be definitively classified as 'pathogenic' or 'benign', and thus persist as 'Variants of Uncertain Significance' (VUS). We processed over 2.4 million plaintext variant summaries from ClinVar, using sentence-level classification to remove content that does not contain evidence and discarding uninformative summaries. We developed ClinVar-BERT to discern clinical evidence within these summaries by fine-tuning a BioBERT-based model with labeled records. When we validated classifications from this model against orthogonal functional screening data, ClinVar-BERT significantly separated estimates of functional impact in three clinically actionable genes. Additionally, ClinVar-BERT achieved an AUROC of 0.927 in classifying ClinVar VUS against this functional screening data. This suggests that ClinVar-BERT is capable of discerning evidence from diagnostic reports and can be used to prioritize variants for re-assessment by diagnostic labs and expert curation panels.
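To make the AUROC figure in the abstract concrete, the sketch below computes AUROC from scratch using the rank-sum (Mann-Whitney) formulation: the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as one half. This is not the authors' evaluation code. The function name `auroc`, the variable names, and the toy scores and labels are all assumptions made for illustration; here a label of 1 stands for "functionally abnormal in the screening assay" and the score stands for a model's predicted probability of pathogenic evidence.

```python
from typing import Sequence


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC via the rank-sum identity; labels are 1 (positive) or 0 (negative)."""
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")

    # Assign 1-based ascending ranks, giving tied scores their average rank.
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    ranks = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1

    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("AUROC needs at least one positive and one negative")

    # Mann-Whitney U for the positive class, normalised by the number of pairs.
    rank_sum_pos = sum(r for r, y in zip(ranks, labels) if y == 1)
    u_stat = rank_sum_pos - n_pos * (n_pos + 1) / 2
    return u_stat / (n_pos * n_neg)


# Hypothetical toy data, NOT values from the paper.
model_scores = [0.9, 0.8, 0.7, 0.3, 0.2, 0.6]
functional_labels = [1, 1, 0, 0, 0, 1]
print(round(auroc(model_scores, functional_labels), 3))  # 0.889
```

In the toy data, 8 of the 9 positive-negative pairs are ordered correctly, giving 8/9 ≈ 0.889. An AUROC of 0.5 corresponds to random ranking and 1.0 to perfect separation, which is the scale on which the reported 0.927 should be read.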
