Chen Xiuying, Wang Tairan, Guo Taicheng, Guo Kehan, Zhou Juexiao, Li Haoyang, Song Zirui, Gao Xin, Zhang Xiangliang
Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE.
King Abdullah University of Science and Technology, Jeddah, Saudi Arabia.
Commun Chem. 2025 Jan 5;8(1):4. doi: 10.1038/s42004-024-01394-x.
While the abilities of language models have been thoroughly evaluated in general domains and in biomedicine, academic chemistry remains less explored. Chemical QA tools play a crucial role in both education and research by translating complex chemical information into an understandable format. To address this gap, we introduce ScholarChemQA, a large-scale QA dataset constructed from chemical papers. Specifically, the questions are paper titles that contain a question mark, and the multiple-choice answers are reasoned out from the corresponding abstracts. The dataset reflects typical real-world challenges, including an imbalanced data distribution and a substantial amount of unlabeled data that is potentially useful. Correspondingly, we introduce ChemMatch, a model designed to answer chemical questions effectively by fully leveraging our collected data. Experiments show that Large Language Models (LLMs) still have significant room for improvement in the field of chemistry. Moreover, ChemMatch significantly outperforms recent similar-scale baselines. Repository: https://github.com/iriscxy/chemmatch
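The question-selection step described in the abstract (paper titles containing a question mark become questions, and the matching abstract becomes the context) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names, the `title`/`abstract` record fields, and the plain substring test are all assumptions made for illustration, and the answer-labeling step is not shown because the abstract does not specify it.

```python
from typing import Dict, Iterable, List


def is_question_title(title: str) -> bool:
    """Return True if the title contains a question mark.

    Assumption: a plain substring test; the actual filtering rule
    used to build ScholarChemQA may differ.
    """
    return "?" in title


def select_question_papers(papers: Iterable[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep papers whose title is a question and pair it with the abstract.

    Each input record is assumed to have 'title' and 'abstract' keys
    (hypothetical field names, not taken from the dataset).
    """
    selected = []
    for paper in papers:
        title = paper["title"].strip()
        if is_question_title(title):
            # The title serves as the question, the abstract as the context.
            selected.append({"question": title, "context": paper["abstract"]})
    return selected
```

Calling `select_question_papers` on a list of paper records returns only the entries whose titles are phrased as questions, each reshaped into a question/context pair that could later be labeled with a multiple-choice answer.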