Molecular property prediction in the ultra-low data regime.

Author information

Eraqi Basem A, Khizbullin Dmitrii, Nagaraja Shashank S, Sarathy S Mani

Affiliations

Clean Energy Research Platform, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia.

Center of Excellence in Generative AI, King Abdullah University of Science and Technology, Thuwal, Saudi Arabia.

Publication information

Commun Chem. 2025 Jul 8;8(1):201. doi: 10.1038/s42004-025-01592-1.

Abstract

Data scarcity remains a major obstacle to effective machine learning in molecular property prediction and design, affecting diverse domains such as pharmaceuticals, solvents, polymers, and energy carriers. Although multi-task learning (MTL) can leverage correlations among properties to improve predictive performance, imbalanced training datasets often degrade its efficacy through negative transfer. Here, we present adaptive checkpointing with specialization (ACS), a training scheme for multi-task graph neural networks that mitigates detrimental inter-task interference while preserving the benefits of MTL. We validate ACS on multiple molecular property benchmarks, where it consistently surpasses or matches the performance of recent supervised methods. To illustrate its practical utility, we deploy ACS in a real-world scenario of predicting sustainable aviation fuel properties, showing that it can learn accurate models with as few as 29 labeled samples. By enabling reliable property prediction in low-data regimes, ACS broadens the scope and accelerates the pace of artificial intelligence-driven materials discovery and design.
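The abstract names the training scheme but does not give its details. As a rough illustration only, the general idea of per-task checkpointing during joint multi-task training might be sketched as below. Everything here is an assumption rather than the authors' implementation: the function name `train_with_per_task_checkpoints`, the `train_step` and `validate` callbacks, and the toy loss table are all hypothetical, and the "model" is a plain dictionary standing in for network parameters.

```python
import copy


def train_with_per_task_checkpoints(params, tasks, epochs, train_step, validate):
    """Hypothetical sketch: for each task, keep a snapshot of the model taken
    at the epoch where that task's validation loss was lowest."""
    best_loss = {t: float("inf") for t in tasks}
    best_params = {t: None for t in tasks}
    for epoch in range(epochs):
        # One joint multi-task update (supplied by the caller).
        params = train_step(params, epoch)
        for t in tasks:
            loss = validate(params, t)
            if loss < best_loss[t]:
                # New validation minimum for this task: store a specialized copy.
                best_loss[t] = loss
                best_params[t] = copy.deepcopy(params)
    return best_params, best_loss


# Toy, made-up validation curves (not data from the paper). task_b gets worse
# after epoch 1, loosely mimicking interference from the other task.
toy_losses = {
    "task_a": [0.9, 0.5, 0.4, 0.6],
    "task_b": [0.8, 0.3, 0.7, 0.9],
}


def toy_train_step(params, epoch):
    # Stand-in for a gradient step: the "parameters" just record the epoch.
    return {"epoch": epoch}


def toy_validate(params, task):
    # Look up the made-up validation loss for this task at this epoch.
    return toy_losses[task][params["epoch"]]


best_params, best_loss = train_with_per_task_checkpoints(
    {"epoch": -1}, list(toy_losses), 4, toy_train_step, toy_validate
)
```

In this toy run each task ends up with its own snapshot from a different epoch, so a task whose validation loss later degrades is not forced to use the final shared parameters. How ACS itself decides what to checkpoint and how it specializes the model should be taken from the paper, not from this sketch.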
