Mapping Materials and Molecules.

Affiliations

Department of Chemistry, University of Cambridge, Lensfield Road, Cambridge CB2 1EW, United Kingdom.

Cavendish Laboratory, University of Cambridge, J. J. Thomson Avenue, Cambridge CB3 0HE, United Kingdom.

Publication information

Acc Chem Res. 2020 Sep 15;53(9):1981-1991. doi: 10.1021/acs.accounts.0c00403. Epub 2020 Aug 14.

Abstract

The visualization of data is indispensable in scientific research, from the early stages when human insight forms to the final step of communicating results. In computational physics, chemistry and materials science, it can be as simple as making a scatter plot or as straightforward as looking through the snapshots of atomic positions manually. However, as a result of the "big data" revolution, these conventional approaches are often inadequate. The widespread adoption of high-throughput computation for materials discovery and the associated community-wide repositories have given rise to data sets that contain an enormous number of compounds and atomic configurations. A typical data set contains thousands to millions of atomic structures, along with a diverse range of properties such as formation energies, band gaps, or bioactivities.

It would thus be desirable to have a data-driven and automated framework for visualizing and analyzing such structural data sets. The key idea is to construct a low-dimensional representation of the data, which facilitates navigation, reveals underlying patterns, and helps to identify data points with unusual attributes. Such data-intensive maps, often employing machine learning methods, are appearing more and more frequently in the literature. However, to the wider community, it is not always transparent how these maps are made and how they should be interpreted. Furthermore, while these maps undoubtedly serve a decorative purpose in academic publications, it is not always apparent what extra information can be garnered from reading or making them.

This Account attempts to answer such questions. We start with a concise summary of the theory of representing chemical environments, followed by the introduction of a simple yet practical conceptual approach for generating structure maps in a generic and automated manner. Such analysis and mapping is made nearly effortless by employing the newly developed software tool ASAP. To showcase the applicability to a wide variety of systems in chemistry and materials science, we provide several illustrative examples, including crystalline and amorphous materials, interfaces, and organic molecules. In these examples, the maps not only help to sift through large data sets but also reveal hidden patterns that could be easily missed using conventional analyses.

The explosion in the amount of computed information in chemistry and materials science has made visualization into a science in itself. Not only have we benefited from exploiting these visualization methods in previous works, we also believe that the automated mapping of data sets will in turn stimulate further creativity and exploration, as well as ultimately feed back into future advances in the respective fields.
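To make the idea of a low-dimensional map concrete, here is a minimal sketch of one way such a projection could be computed. It is an illustration under stated assumptions, not the interface of the ASAP tool and not necessarily the method used in the Account: the descriptor vectors are synthetic random numbers standing in for real structural descriptors, principal component analysis (PCA) is chosen here only as a simple example of dimensionality reduction, and the function name `pca_map` is hypothetical.

```python
import numpy as np


def pca_map(descriptors, n_components=2):
    """Project descriptor vectors onto their leading principal components.

    Hypothetical helper for illustration; PCA is an assumed choice of
    dimensionality reduction, not one prescribed by the abstract.
    """
    X = np.asarray(descriptors, dtype=float)
    # Center each descriptor dimension so PCA captures variance, not the mean.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal axes, ordered by decreasing singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    coords = X_centered @ Vt[:n_components].T
    # Fraction of the total variance carried by each retained component.
    explained = (S ** 2) / np.sum(S ** 2)
    return coords, explained[:n_components]


# Synthetic stand-in data: two groups of 100 "structures", each described
# by a 50-dimensional vector, with the second group shifted by a constant.
rng = np.random.default_rng(0)
group_a = rng.normal(0.0, 1.0, size=(100, 50))
group_b = rng.normal(0.0, 1.0, size=(100, 50)) + 5.0
descriptors = np.vstack([group_a, group_b])

coords, explained = pca_map(descriptors)
print(coords.shape)
```

Each row of `coords` is one point on a two-dimensional map. Plotting these points and coloring them by a property such as formation energy would give the kind of scatter-plot map the abstract describes; with real data, the descriptors would come from a representation of the atomic structures rather than from a random number generator.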
