Towards artificial general intelligence via a multimodal foundation model.

Affiliations

Gaoling School of Artificial Intelligence, Renmin University of China, Beijing, China.

Beijing Key Laboratory of Big Data Management and Analysis Methods, Beijing, China.

Publication information

Nat Commun. 2022 Jun 2;13(1):3094. doi: 10.1038/s41467-022-30761-2.

DOI:10.1038/s41467-022-30761-2
PMID:35655064
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC9163040/
Abstract

The fundamental goal of artificial intelligence (AI) is to mimic the core cognitive activities of human. Despite tremendous success in the AI research, most of existing methods have only single-cognitive ability. To overcome this limitation and take a solid step towards artificial general intelligence (AGI), we develop a foundation model pre-trained with huge multimodal data, which can be quickly adapted for various downstream cognitive tasks. To achieve this goal, we propose to pre-train our foundation model by self-supervised learning with weak semantic correlation data crawled from the Internet and show that promising results can be obtained on a wide range of downstream tasks. Particularly, with the developed model-interpretability tools, we demonstrate that strong imagination ability is now possessed by our foundation model. We believe that our work makes a transformative stride towards AGI, from our common practice of "weak or narrow AI" to that of "strong or generalized AI".
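The abstract does not name the training objective used for self-supervised pre-training on weakly correlated image-text pairs. One widely used objective for this kind of data is a symmetric contrastive (InfoNCE) loss, which pulls each image embedding toward its paired text embedding and pushes it away from the other texts in the batch. The sketch below illustrates that general idea only; it is not taken from the paper, and the function names, the plain-list embedding format, and the `temperature=0.07` default are all assumptions made for illustration.

```python
import math


def l2_normalize(v):
    """Scale a vector to unit length (returned unchanged if it is all zeros)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def log_softmax_at(logits, index):
    """Numerically stable log-softmax of `logits`, evaluated at `index`."""
    m = max(logits)
    log_sum = m + math.log(sum(math.exp(x - m) for x in logits))
    return logits[index] - log_sum


def contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Symmetric InfoNCE loss over a batch of paired embeddings.

    Pair i (image_embs[i], text_embs[i]) is treated as the positive;
    every other image-text combination in the batch is a negative.
    The temperature value is an assumed default, not one from the paper.
    """
    assert len(image_embs) == len(text_embs) and len(image_embs) > 0
    imgs = [l2_normalize(v) for v in image_embs]
    txts = [l2_normalize(v) for v in text_embs]
    n = len(imgs)
    # sim[i][j]: cosine similarity of image i and text j, divided by temperature
    sim = [
        [sum(a * b for a, b in zip(imgs[i], txts[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Image-to-text direction: each row should peak on its diagonal entry.
    i2t = -sum(log_softmax_at(sim[i], i) for i in range(n)) / n
    # Text-to-image direction: each column should peak on its diagonal entry.
    t2i = -sum(
        log_softmax_at([sim[i][j] for i in range(n)], j) for j in range(n)
    ) / n
    return 0.5 * (i2t + t2i)


if __name__ == "__main__":
    images = [[1.0, 0.0], [0.0, 1.0]]
    matched_texts = [[1.0, 0.0], [0.0, 1.0]]
    swapped_texts = [[0.0, 1.0], [1.0, 0.0]]
    print("matched pairs:", contrastive_loss(images, matched_texts))
    print("swapped pairs:", contrastive_loss(images, swapped_texts))
```

In this toy batch the correctly matched pairs give a loss near zero, while swapping the texts gives a much larger loss. That gap is the signal a contrastive objective would use to align the two modalities.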

https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c783/9163040/369ffe9f8d19/41467_2022_30761_Fig1_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c783/9163040/6bc577596bfc/41467_2022_30761_Fig2_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c783/9163040/07255416064e/41467_2022_30761_Fig3_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c783/9163040/b3fcfdf6f78f/41467_2022_30761_Fig4_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c783/9163040/f382ce5a0c2d/41467_2022_30761_Fig5_HTML.jpg
https://cdn.ncbi.nlm.nih.gov/pmc/blobs/c783/9163040/0fbfe4f57fde/41467_2022_30761_Fig6_HTML.jpg

Similar articles

1
Towards artificial general intelligence via a multimodal foundation model.
Nat Commun. 2022 Jun 2;13(1):3094. doi: 10.1038/s41467-022-30761-2.
2
Revolutionizing Digital Pathology With the Power of Generative Artificial Intelligence and Foundation Models.
Lab Invest. 2023 Nov;103(11):100255. doi: 10.1016/j.labinv.2023.100255. Epub 2023 Sep 26.
3
Preparing for Artificial General Intelligence (AGI) in Health Professions Education: AMEE Guide No. 172.
Med Teach. 2024 Oct;46(10):1258-1271. doi: 10.1080/0142159X.2024.2387802. Epub 2024 Aug 8.
4
ARTIFICIAL INTELLIGENCE IN MEDICAL PRACTICE: REGULATIVE ISSUES AND PERSPECTIVES.
Wiad Lek. 2020;73(12 cz 2):2722-2727.
5
How Can the Current State of AI Guide Future Conversations of General Intelligence?
J Intell. 2024 Mar 20;12(3):36. doi: 10.3390/jintelligence12030036.
6
Future Medical Artificial Intelligence Application Requirements and Expectations of Physicians in German University Hospitals: Web-Based Survey.
J Med Internet Res. 2021 Mar 5;23(3):e26646. doi: 10.2196/26646.
7
Forecasting emergent risks in advanced AI systems: an analysis of a future road transport management system.
Ergonomics. 2023 Nov;66(11):1750-1767. doi: 10.1080/00140139.2023.2286907. Epub 2024 Jan 2.
8
Development and evaluation of a live birth prediction model for evaluating human blastocysts from a retrospective study.
Elife. 2023 Feb 22;12:e83662. doi: 10.7554/eLife.83662.
9
Artificial Intelligence Can Generate Fraudulent but Authentic-Looking Scientific Medical Articles: Pandora's Box Has Been Opened.
J Med Internet Res. 2023 May 31;25:e46924. doi: 10.2196/46924.
10
Artificial intelligence-assisted dermatology diagnosis: From unimodal to multimodal.
Comput Biol Med. 2023 Oct;165:107413. doi: 10.1016/j.compbiomed.2023.107413. Epub 2023 Sep 1.

Cited by

1
Optical generative models.
Nature. 2025 Aug;644(8078):903-911. doi: 10.1038/s41586-025-09446-5. Epub 2025 Aug 27.
2
Histological Image Classification Between Follicular Lymphoma and Reactive Lymphoid Tissue Using Deep Learning and Explainable Artificial Intelligence (XAI).
Cancers (Basel). 2025 Jul 22;17(15):2428. doi: 10.3390/cancers17152428.
3
Advanced Design for High-Performance and AI Chips.
Nanomicro Lett. 2025 Jul 29;18(1):13. doi: 10.1007/s40820-025-01850-w.
4
Seamless optical cloud computing across edge-metro network for generative AI.
Nat Commun. 2025 Jul 2;16(1):6097. doi: 10.1038/s41467-025-61495-6.
5
Ubiquitous memory augmentation via mobile multimodal embedding system.
Nat Commun. 2025 Jun 19;16(1):5339. doi: 10.1038/s41467-025-60802-5.
6
Foundation models and intelligent decision-making: Progress, challenges, and perspectives.
Innovation (Camb). 2025 May 12;6(6):100948. doi: 10.1016/j.xinn.2025.100948. eCollection 2025 Jun 2.
7
A Multimodal Large Language Model Framework for Intelligent Perception and Decision-Making in Smart Manufacturing.
Sensors (Basel). 2025 May 13;25(10):3072. doi: 10.3390/s25103072.
8
Advancements in Medical Radiology Through Multimodal Machine Learning: A Comprehensive Overview.
Bioengineering (Basel). 2025 Apr 30;12(5):477. doi: 10.3390/bioengineering12050477.
9
A Perspective on Foundation Models in Chemistry.
JACS Au. 2025 Mar 25;5(4):1499-1518. doi: 10.1021/jacsau.4c01160. eCollection 2025 Apr 28.
10
AI-assisted Diagnosis of Nonmelanoma Skin Cancer in Resource-Limited Settings.
Cancer Epidemiol Biomarkers Prev. 2025 Jul 1;34(7):1080-1088. doi: 10.1158/1055-9965.EPI-25-0132.

References

1
Zero and Few Shot Learning With Semantic Feature Synthesis and Competitive Learning.
IEEE Trans Pattern Anal Mach Intell. 2021 Jul;43(7):2510-2523. doi: 10.1109/TPAMI.2020.2965534. Epub 2021 Jun 8.
2
Deep learning.
Nature. 2015 May 28;521(7553):436-44. doi: 10.1038/nature14539.
3
Explicit encoding of multimodal percepts by single neurons in the human brain.
Curr Biol. 2009 Aug 11;19(15):1308-13. doi: 10.1016/j.cub.2009.06.060. Epub 2009 Jul 23.
4
Invariant visual representation by single neurons in the human brain.
Nature. 2005 Jun 23;435(7045):1102-7. doi: 10.1038/nature03687.