Developing a general AI model for integrating diverse genomic modalities and comprehensive genomic knowledge.

Authors

Zhang Zhenhao, Bao Xinyu, Jiang Linghua, Luo Xin, Wang Yichun, Comai Annelise, Waldhaus Joerg, Hansen Anders S, Li Wenbo, Liu Jie

Affiliations

Department of Computational Medicine and Bioinformatics, University of Michigan, Ann Arbor, MI, USA.

Department of Computer Science and Engineering, University of Michigan, Ann Arbor, MI, USA.

Publication information

bioRxiv. 2025 May 14:2025.05.08.652986. doi: 10.1101/2025.05.08.652986.

DOI:10.1101/2025.05.08.652986
PMID:40462903
Original article: https://pmc.ncbi.nlm.nih.gov/articles/PMC12132192/
Abstract

Advances in next-generation sequencing technologies have vastly expanded the availability of diverse genomic, epigenomic and transcriptomic data, presenting an opportunity to develop a general AI model that integrates comprehensive genomic knowledge within a single framework. Unlike previous predictive models, which are typically specialized for particular tasks, our general AI model unifies a wide range of genomic modalities, such as nascent RNA and ultra-high-resolution chromatin organization, within a multi-task architecture. Using ATAC-seq and DNA sequences as inputs, we incorporated diverse genomic modalities as outputs, and the model exhibits strong generalizability across different cell types and tissues in all tasks on which it was trained. It accurately predicts gene-level transcription measured by various nascent RNA assays and effectively captures enhancer-associated transcription. It also accurately captures the potential functions of non-coding genetic variants and regulatory elements. Furthermore, we extended the model trained on human data to a mouse general model, achieving accurate predictions of genomic modalities for which limited data are available, such as high-resolution chromatin contact maps; these predictions are further validated using an established mouse inner-ear study. This comprehensive approach offers a powerful tool for understanding genome regulation in both human and mouse.
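
The abstract describes the input/output layout only in outline: DNA sequence plus ATAC-seq in, several genomic modalities out of one multi-task model. The following is a minimal sketch of that layout, not the authors' implementation. The function names, the per-base ATAC-seq track, the single ReLU trunk layer, the linear heads, and the head names `nascent_rna` and `contact_map` are all assumptions made for illustration, and the weights are random rather than trained.

```python
import numpy as np

BASES = "ACGT"


def one_hot_dna(seq: str) -> np.ndarray:
    """Encode a DNA string as a (length, 4) one-hot array.

    Bases outside A/C/G/T (for example N) are left as all-zero rows.
    """
    index = {base: i for i, base in enumerate(BASES)}
    out = np.zeros((len(seq), 4), dtype=np.float32)
    for pos, base in enumerate(seq.upper()):
        col = index.get(base)
        if col is not None:
            out[pos, col] = 1.0
    return out


def build_input(seq: str, atac: np.ndarray) -> np.ndarray:
    """Stack one-hot DNA with a per-base ATAC-seq signal into (length, 5).

    Assumption: the accessibility signal is given at one value per base.
    """
    if atac.shape != (len(seq),):
        raise ValueError("ATAC signal must have one value per base")
    atac_column = atac[:, None].astype(np.float32)
    return np.concatenate([one_hot_dna(seq), atac_column], axis=1)


def multi_task_forward(x, trunk_w, head_ws):
    """Apply a shared trunk, then one linear head per output modality."""
    hidden = np.maximum(x @ trunk_w, 0.0)  # shared representation (ReLU)
    return {name: hidden @ w for name, w in head_ws.items()}


# Toy example with random, untrained weights (hypothetical sizes).
rng = np.random.default_rng(0)
seq = "ACGTNACGTA"
atac = rng.random(len(seq))
x = build_input(seq, atac)

trunk_w = rng.normal(size=(5, 8))
head_ws = {
    "nascent_rna": rng.normal(size=(8, 1)),  # hypothetical head name
    "contact_map": rng.normal(size=(8, 1)),  # hypothetical head name
}
outputs = multi_task_forward(x, trunk_w, head_ws)
```

The point of the sketch is the sharing pattern: every output modality reads the same trunk representation, so adding a modality means adding a head rather than training a separate model.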

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a73/12132192/2701b1e26e8e/nihpp-2025.05.08.652986v1-f0005.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a73/12132192/ad231b1ccbeb/nihpp-2025.05.08.652986v1-f0001.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a73/12132192/980f635a3b79/nihpp-2025.05.08.652986v1-f0002.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a73/12132192/3702d29850e4/nihpp-2025.05.08.652986v1-f0003.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/5a73/12132192/d2fe0c019137/nihpp-2025.05.08.652986v1-f0004.jpg

