
Training deep learning models on personalized genomic sequences improves variant effect prediction.

Authors

He Adam Y, Palamuttam Nathan P, Danko Charles G

Affiliation

Cornell University, Ithaca, NY 14850.

Publication information

bioRxiv. 2025 Feb 15:2024.10.15.618510. doi: 10.1101/2024.10.15.618510.

DOI: 10.1101/2024.10.15.618510
PMID: 39463940
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC11507713/
Abstract

Sequence-to-function models have broad applications in interpreting the molecular impact of genetic variation, yet have been criticized for poor performance in this task. Here we show that training models on functional genomic data with matched personal genomes improves their performance at variant effect prediction. Variant effect representations are retained even when fine tuning models to unseen cellular contexts and experimental readouts. Our results have implications for interpreting trait-associated genetic variation.
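
As an illustration of the terms used in the abstract, the following is a minimal sketch of how a personalized sequence can be built from a reference and how a variant effect can be scored as the difference between a model's predictions on the alternate and reference sequences. It is not the authors' implementation: the `Variant` tuple layout, the function names, and `toy_model` (a motif counter standing in for a trained sequence-to-function model) are all assumptions made for this example, and only single-nucleotide variants are handled.

```python
from typing import Callable, List, Tuple

# Hypothetical variant record: (0-based position, reference allele, alternate allele).
Variant = Tuple[int, str, str]


def personalize(reference: str, variants: List[Variant]) -> str:
    """Apply single-nucleotide variants to a reference sequence."""
    seq = list(reference)
    for pos, ref, alt in variants:
        # Guard against variants that do not match the supplied reference.
        if seq[pos] != ref:
            raise ValueError(f"reference mismatch at {pos}: {seq[pos]} != {ref}")
        seq[pos] = alt
    return "".join(seq)


def variant_effect(model: Callable[[str], float], reference: str, variant: Variant) -> float:
    """Score a variant as prediction(alternate sequence) - prediction(reference sequence)."""
    return model(personalize(reference, [variant])) - model(reference)


def toy_model(seq: str) -> float:
    """Stand-in for a trained model (assumption): counts occurrences of a 'TATA' motif."""
    return float(seq.count("TATA"))


reference = "GGTACAGG"
personal = personalize(reference, [(4, "C", "T")])
effect = variant_effect(toy_model, reference, (4, "C", "T"))
print(personal, effect)
```

In this toy setting the C-to-T substitution creates a motif that the stand-in model responds to, so the score is positive. Training on personal genomes, as the abstract describes, would correspond to feeding sequences like `personal` (rather than only `reference`) to the model alongside that individual's functional genomic measurements.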

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/64d2/11867454/7c39a72ef33c/nihpp-2024.10.15.618510v2-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/64d2/11867454/9902f577ef6a/nihpp-2024.10.15.618510v2-f0001.jpg
