Sullivan Brendan, Langan Patricia S, Archibald Rick, Coates Leighton, Vandavasi Venu Gopal, Lynch Vickie
Neutron Scattering Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA.
Computer Science and Mathematics Division, Oak Ridge National Laboratory, Oak Ridge, TN, USA.
IEEE ACM Int Symp Clust Cloud Grid Comput. 2019 May;2019:549-555. doi: 10.1109/CCGRID.2019.00070. Epub 2019 Jul 4.
Crystallography is the powerhouse technique for molecular structure determination, with applications in fields ranging from energy storage to drug design. Accurate structure determination, however, relies partly on determining the precise locations and integrated intensities of Bragg peaks in the resulting data. Here, we describe a method for Bragg peak integration using neural networks. The network is based on a U-Net and identifies peaks in three-dimensional reciprocal space through segmentation, allowing the full 3D peak shape to be predicted from noisy data that are commonly difficult to process. The procedure for generating appropriate training sets is detailed. Trained networks achieve Dice coefficients of 0.82 and mean IoUs (intersection over union) of 0.69. By carrying out integration over entire datasets, we demonstrate that integrating neural network-predicted peaks results in improved intensity statistics. Furthermore, using a second dataset, we show that transfer learning between datasets is possible. Given the ubiquity and growing complexity of crystallography, we anticipate that integration by machine learning will play an increasingly important role across the physical sciences. These early results demonstrate the applicability of deep learning techniques to integrating crystallography data and suggest a possible role in the next generation of crystallography experiments.
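The Dice coefficient and IoU quoted above are standard overlap metrics for segmentation masks. As a minimal sketch of how they could be computed for a predicted 3D peak mask against a reference mask, the following uses plain nested lists indexed as [z][y][x]. The function names, the toy 2x2x2 masks, and the convention of scoring two empty masks as 1.0 are illustrative assumptions, not the authors' implementation.

```python
def flatten3d(volume):
    """Flatten a nested [z][y][x] list of 0/1 voxels into one flat list."""
    return [v for plane in volume for row in plane for v in row]


def dice_and_iou(pred, truth):
    """Return (Dice, IoU) for two binary 3D masks of identical shape.

    Dice = 2|P ∩ T| / (|P| + |T|),  IoU = |P ∩ T| / |P ∪ T|.
    """
    p = flatten3d(pred)
    t = flatten3d(truth)
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of voxels")

    intersection = sum(1 for a, b in zip(p, t) if a and b)
    n_pred = sum(1 for a in p if a)
    n_truth = sum(1 for b in t if b)
    union = n_pred + n_truth - intersection

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    dice = 2.0 * intersection / (n_pred + n_truth)
    iou = intersection / union
    return dice, iou


# Hypothetical toy masks: 3 voxels each, 2 of them overlapping.
truth = [[[1, 1], [0, 0]], [[1, 0], [0, 0]]]
pred = [[[1, 1], [1, 0]], [[0, 0], [0, 0]]]

dice, iou = dice_and_iou(pred, truth)
print(f"Dice = {dice:.3f}, IoU = {iou:.3f}")
```

For a single pair of masks the two metrics are linked by Dice = 2·IoU / (1 + IoU), so Dice is never lower than IoU. Values averaged over many peaks, as reported for a trained network, need not satisfy that identity exactly.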