1.5 million materials narratives generated by chatbots.

Authors

Park Yang Jeong, Jerng Sung Eun, Yoon Sungroh, Li Ju

Affiliations

Massachusetts Institute of Technology, Department of Nuclear Science and Engineering, Cambridge, 02139, USA.

Massachusetts Institute of Technology, Department of Materials Science and Engineering, Cambridge, 02139, USA.

Publication information

Sci Data. 2024 Sep 28;11(1):1060. doi: 10.1038/s41597-024-03886-w.

Abstract

The advent of artificial intelligence (AI) has enabled a comprehensive exploration of materials for various applications. However, AI models often prioritize material examples that appear frequently in the scientific literature, which limits the selection of suitable candidates based on inherent physical and chemical attributes. To address this imbalance, we generated a dataset consisting of 1,453,493 natural language-material narratives from OQMD, Materials Project, JARVIS, and AFLOW2 databases based on ab initio calculation results that are more evenly distributed across the periodic table. The generated text narratives were then scored by both human experts and GPT-4 on three rubrics: technical accuracy, language and structure, and relevance and depth of content. The two sets of scores were similar, with human-scored depth of content lagging the most. The integration of multimodal data sources and large language models holds immense potential for AI frameworks to aid the exploration and discovery of solid-state materials for specific applications of interest.
