Shirasawa Raku, Takemura Ichiro, Hattori Shinnosuke, Nagata Yuuya
Advanced Research Laboratory, R&D Center, Sony Group Corporation, Atsugi Tec. 4-14-1 Asahi-cho, Atsugi-shi, Kanagawa, 243-0014, Japan.
Tokyo Laboratory 26, R&D Center, Sony Group Corporation, Atsugi Tec. 4-14-1 Asahi-cho, Atsugi-shi, Kanagawa, 243-0014, Japan.
Commun Chem. 2022 Nov 22;5(1):158. doi: 10.1038/s42004-022-00770-9.
Informatics and laboratory automation have both been applied to accelerate material discovery. Here we show a semi-automated material exploration scheme for modeling the solubility of tetraphenylporphyrin derivatives. The scheme involved the following steps: definition of a practical chemical search space; prioritization of molecules in the space using an extended algorithm for submodular function maximization, which requires neither biased variable selection nor pre-existing data; synthesis and automated measurement; and machine-learning model estimation. The evaluation order selected by the algorithm covered similar molecules amounting to 32% of all targeted molecules (versus ~7% for random sampling and ~4% for uncertainty sampling) with a small number of evaluations (10 molecules: 0.13% of all targeted molecules). The derived binary classification models predicted 'good solvents' with an accuracy >0.8. Overall, we confirmed the effectiveness of the proposed semi-automated scheme for early-stage material search projects and for accelerating a wider range of material research.
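The prioritization step can be illustrated with a minimal sketch of greedy maximization of a coverage function, a standard monotone submodular objective. This is not the extended algorithm used in the study, whose details are not given in the abstract. The function names, the `neighbors` mapping (each candidate molecule mapped to the set of molecules considered similar to it), and the toy data are all hypothetical.

```python
from typing import Dict, Hashable, Iterable, List, Set


def greedy_max_coverage(
    neighbors: Dict[Hashable, Set[Hashable]], budget: int
) -> List[Hashable]:
    """Return an evaluation order of at most `budget` candidates.

    At each step the candidate with the largest marginal gain in
    coverage is chosen (plain greedy rule; an assumption, not the
    paper's extended algorithm).
    """
    covered: Set[Hashable] = set()
    order: List[Hashable] = []
    candidates = set(neighbors)
    for _ in range(budget):
        best, best_gain = None, 0
        # Sort for deterministic tie-breaking.
        for c in sorted(candidates, key=str):
            gain = len(neighbors[c] - covered)  # newly covered molecules
            if gain > best_gain:
                best, best_gain = c, gain
        if best is None:  # no remaining candidate adds coverage
            break
        order.append(best)
        covered |= neighbors[best]
        candidates.remove(best)
    return order


def coverage_fraction(
    neighbors: Dict[Hashable, Set[Hashable]], selected: Iterable[Hashable]
) -> float:
    """Fraction of all candidates that are similar to a selected one."""
    covered = set().union(*(neighbors[s] for s in selected))
    return len(covered) / len(neighbors)


# Hypothetical toy search space: six molecules and their similarity sets.
toy = {
    "A": {"A", "B", "C"},
    "B": {"A", "B"},
    "C": {"A", "C"},
    "D": {"D", "E"},
    "E": {"D", "E"},
    "F": {"F"},
}
order = greedy_max_coverage(toy, budget=2)
print(order, coverage_fraction(toy, order))
```

In this toy case, two evaluations cover five of the six molecules. The abstract's comparison of coverage after 10 evaluations (32% versus ~7% and ~4%) is the same kind of coverage measure on the real search space, although the actual similarity definition is not specified here.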