CSIRO Data61, Door 34 Goods Shed Village St, Docklands, Victoria, Australia.
Nanoscale Horiz. 2021 Mar 1;6(3):277-282. doi: 10.1039/d0nh00637h. Epub 2021 Feb 2.
Machine learning classification is a useful technique for predicting structure/property relationships in samples of nanomaterials where distributions of sizes and mixtures of shapes are persistent. The separation of classes, however, can be either supervised, based on domain knowledge (human intelligence), or based entirely on unsupervised machine learning (artificial intelligence). This raises the questions of which approach is more reliable and how the two compare. In this study, we combine artificial neural networks with an ensemble data set of electronic structure simulations describing the size, shape and peak optical emission wavelength of hydrogen-passivated silicon quantum dots, to explore the utility of different types of classes. By comparing the domain-driven and data-driven approaches, we find a disconnect between what we see (optical emission) and assume (that a particular color band represents a special class), and what the data supports. Contrary to expectation, controlling a limited set of structural characteristics is not specific enough to classify a quantum dot based on color, even though it is experimentally intuitive.
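The contrast between domain-driven and data-driven classes can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the study: the data are synthetic random numbers rather than electronic structure simulations, the color band boundaries are approximate, the feature names (diameter, sphericity) are hypothetical, and plain k-means stands in for the unsupervised step in place of the artificial neural networks used in the study. The sketch assigns each dot a color class from its peak wavelength, clusters the same dots on structural features alone, and measures how well the two labelings agree.

```python
import random

# Approximate visible color bands in nm (illustrative boundaries, an assumption).
COLOR_BANDS = [("violet", 380, 450), ("blue", 450, 495), ("green", 495, 570),
               ("yellow", 570, 590), ("orange", 590, 620), ("red", 620, 750)]

def color_class(wavelength_nm):
    """Domain-driven label: the color band containing the peak wavelength."""
    for name, lo, hi in COLOR_BANDS:
        if lo <= wavelength_nm < hi:
            return name
    return "outside-visible"

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means on tuples of floats; returns a cluster index per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assign each point to its nearest center (squared Euclidean distance).
        labels = [min(range(k),
                      key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
                  for p in points]
        # Move each center to the mean of its members (empty clusters keep theirs).
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = tuple(sum(col) / len(members) for col in zip(*members))
    return labels

def purity(cluster_labels, class_labels):
    """Fraction of points whose class is the majority class of their cluster."""
    total = 0
    for c in set(cluster_labels):
        classes = [y for l, y in zip(cluster_labels, class_labels) if l == c]
        total += max(classes.count(y) for y in set(classes))
    return total / len(class_labels)

def make_synthetic_dots(n=300, seed=1):
    """Synthetic (diameter_nm, sphericity, peak_wavelength_nm) tuples.
    The size-to-wavelength relation below is invented, not physical."""
    rng = random.Random(seed)
    dots = []
    for _ in range(n):
        diameter = rng.uniform(1.0, 3.0)
        sphericity = rng.uniform(0.5, 1.0)
        wavelength = 350 + 120 * diameter + rng.gauss(0, 40)
        dots.append((diameter, sphericity, wavelength))
    return dots

dots = make_synthetic_dots()
features = [(d, s) for d, s, _ in dots]                # structure only
domain_labels = [color_class(w) for _, _, w in dots]   # classes from color bands
data_labels = kmeans(features, k=len(set(domain_labels)))  # classes from structure
agreement = purity(data_labels, domain_labels)
print(f"cluster purity with respect to color classes: {agreement:.2f}")
```

A purity well below 1 in such a comparison would mean that clusters found from structural features alone do not line up with color bands. The number printed here reflects only the invented data and says nothing about the results of the study.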