Improving Data and Prediction Quality of High-Throughput Perovskite Synthesis with Model Fusion.

Affiliations

Laboratory of Informatics and Data Mining (LIDM), Department of Computer and Information Science, Fordham University, 113 West 60th Street, New York, New York 10023, United States.

Molecular Foundry, Lawrence Berkeley National Laboratory, 1 Cyclotron Road, Berkeley, California 94720, United States.

Publication information

J Chem Inf Model. 2021 Apr 26;61(4):1593-1602. doi: 10.1021/acs.jcim.0c01307. Epub 2021 Apr 2.

DOI:10.1021/acs.jcim.0c01307

PMID:33797887

Abstract

Combinatorial fusion analysis (CFA) is an approach for combining multiple scoring systems using the rank-score characteristic function and a cognitive diversity measure. One application is combining diverse machine learning models to achieve better prediction quality. In this work, we apply CFA to the synthesis of metal halide perovskites containing organic ammonium cations via inverse temperature crystallization. Using a data set generated by high-throughput experimentation, we developed four individual models (support vector machines, random forests, weighted logistic classifier, and gradient boosted trees). We characterize each of these scoring systems and explore 66 possible combinations of the models. When measured by precision in predicting crystal formation, the majority of the combination models improve on the individual model results. The best combination models outperform the best individual models by 3.9 percentage points in precision. In addition to improving prediction quality, we demonstrate how the fusion models can be used to identify mislabeled input data and address issues of data quality. In particular, we identify example cases where none of the single models or fusion models gives the correct prediction. Experimental replication of these syntheses reveals that these compositions are sensitive to modest temperature variations across different locations of the heating element, which can hinder or enhance the crystallization process. In summary, we demonstrate that model fusion using CFA can not only identify a previously unconsidered influence on reaction outcome but also serve as a form of quality control for high-throughput experimentation.
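
The abstract names the building blocks of CFA without defining them, so the following is a minimal Python sketch of how they are commonly formulated, not the authors' implementation. It assumes min-max normalized scores, a rank-score characteristic function f(i) equal to the normalized score of the item at rank i, cognitive diversity computed as the root-mean-square difference between two such functions, and unweighted average score and rank combinations. All function names are hypothetical and chosen for illustration only.

```python
import math

def normalize(scores):
    """Min-max normalize scores to [0, 1] (assumed preprocessing step)."""
    lo, hi = min(scores), max(scores)
    if hi == lo:
        return [0.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]

def ranks(scores):
    """Rank items so that rank 1 is the highest score; ties keep input order."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0] * len(scores)
    for position, item in enumerate(order, start=1):
        result[item] = position
    return result

def rank_score_function(scores):
    """Return f as a list where f[i - 1] is the normalized score at rank i."""
    return sorted(normalize(scores), reverse=True)

def cognitive_diversity(scores_a, scores_b):
    """RMS difference between two rank-score functions (one common form)."""
    f_a = rank_score_function(scores_a)
    f_b = rank_score_function(scores_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(f_a, f_b)) / len(f_a))

def score_combination(systems):
    """Average the normalized scores of each item across scoring systems."""
    normalized = [normalize(s) for s in systems]
    return [sum(column) / len(column) for column in zip(*normalized)]

def rank_combination(systems):
    """Average the ranks of each item across scoring systems (lower is better)."""
    ranked = [ranks(s) for s in systems]
    return [sum(column) / len(column) for column in zip(*ranked)]
```

With two hypothetical models scoring the same four reactions, for example `[0.9, 0.2, 0.6, 0.4]` and `[0.7, 0.6, 0.1, 0.5]`, `rank_combination` averages each reaction's two ranks, and `cognitive_diversity` quantifies how differently the two models distribute their scores across ranks. The weighting schemes and the exact set of 66 combinations explored in the paper are not specified in the abstract and are not reproduced here.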


Similar articles

Improving Data and Prediction Quality of High-Throughput Perovskite Synthesis with Model Fusion.

J Chem Inf Model. 2021 Apr 26;61(4):1593-1602. doi: 10.1021/acs.jcim.0c01307. Epub 2021 Apr 2.

Improving SDG Classification Precision Using Combinatorial Fusion.

Sensors (Basel). 2022 Jan 29;22(3):1067. doi: 10.3390/s22031067.

Maximizing lipocalin prediction through balanced and diversified training set and decision fusion.

Comput Biol Chem. 2015 Dec;59 Pt A:101-10. doi: 10.1016/j.compbiolchem.2015.09.011. Epub 2015 Sep 28.

Using Data Mining To Search for Perovskite Materials with Higher Specific Surface Area.

J Chem Inf Model. 2018 Dec 24;58(12):2420-2427. doi: 10.1021/acs.jcim.8b00436. Epub 2018 Dec 4.

Crystallization Dynamics of Organolead Halide Perovskite by Real-Time X-ray Diffraction.

Nano Lett. 2015 Aug 12;15(8):5630-4. doi: 10.1021/acs.nanolett.5b02402. Epub 2015 Aug 3.

The Development of Target-Specific Machine Learning Models as Scoring Functions for Docking-Based Target Prediction.

J Chem Inf Model. 2019 Mar 25;59(3):1238-1252. doi: 10.1021/acs.jcim.8b00773. Epub 2019 Mar 18.

Improving the Spatial Prediction of Soil Organic Carbon Stocks in a Complex Tropical Mountain Landscape by Methodological Specifications in Machine Learning Approaches.

PLoS One. 2016 Apr 29;11(4):e0153673. doi: 10.1371/journal.pone.0153673. eCollection 2016.

Ultrasmooth organic-inorganic perovskite thin-film formation and crystallization for efficient planar heterojunction solar cells.

Nat Commun. 2015 Jan 30;6:6142. doi: 10.1038/ncomms7142.

Machine learning algorithms applied to a prediction of personal overall thermal comfort using skin temperatures and occupants' heating behavior.

Appl Ergon. 2020 May;85:103078. doi: 10.1016/j.apergo.2020.103078. Epub 2020 Feb 19.

Classification of Biodegradable Substances Using Balanced Random Trees and Boosted C5.0 Decision Trees.

Int J Environ Res Public Health. 2020 Dec 13;17(24):9322. doi: 10.3390/ijerph17249322.

Cited by

How to accelerate the inorganic materials synthesis: from computational guidelines to data-driven method?

Natl Sci Rev. 2025 Mar 4;12(4):nwaf081. doi: 10.1093/nsr/nwaf081. eCollection 2025 Apr.

Improving SDG Classification Precision Using Combinatorial Fusion.

Sensors (Basel). 2022 Jan 29;22(3):1067. doi: 10.3390/s22031067.
