GPU Acceleration of Large-Scale Full-Frequency GW Calculations.

Authors

Yu Victor Wen-Zhe, Govoni Marco

Affiliations

Materials Science Division, Argonne National Laboratory, Lemont, Illinois 60439, United States.

Pritzker School of Molecular Engineering, The University of Chicago, Chicago, Illinois 60637, United States.

Publication information

J Chem Theory Comput. 2022 Aug 9;18(8):4690-4707. doi: 10.1021/acs.jctc.2c00241. Epub 2022 Aug 1.

Abstract

Many-body perturbation theory is a powerful method to simulate electronic excitations in molecules and materials starting from the output of density functional theory calculations. By implementing the theory efficiently so as to run at scale on the latest leadership high-performance computing systems it is possible to extend the scope of GW calculations. We present a GPU acceleration study of the full-frequency GW method as implemented in the WEST code. Excellent performance is achieved through the use of (i) optimized GPU libraries, e.g., cuFFT and cuBLAS, (ii) a hierarchical parallelization strategy that minimizes CPU-CPU, CPU-GPU, and GPU-GPU data transfer operations, (iii) nonblocking MPI communications that overlap with GPU computations, and (iv) mixed precision in selected portions of the code. A series of performance benchmarks has been carried out on leadership high-performance computing systems, showing a substantial speedup of the GPU-accelerated version of WEST with respect to its CPU version. Good strong and weak scaling is demonstrated using up to 25 920 GPUs. Finally, we showcase the capability of the GPU version of WEST for large-scale, full-frequency GW calculations of realistic systems, e.g., a nanostructure, an interface, and a defect, comprising up to 10 368 valence electrons.
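The abstract reports good strong and weak scaling without giving timings here. As a reminder of what those two terms measure, the following is a minimal sketch of the usual efficiency definitions. The function names and all timing values are hypothetical illustrations; they are not taken from the paper or from the WEST code.

```python
def strong_scaling_efficiency(t_base, n_base, t_n, n):
    """Strong scaling: the total problem size is fixed.

    Ideally the run time falls in proportion to n_base / n, so the
    efficiency is the ratio of ideal to measured time on n devices.
    """
    return (t_base * n_base) / (t_n * n)


def weak_scaling_efficiency(t_base, t_n):
    """Weak scaling: the problem size grows with the device count.

    Ideally the run time stays constant, so the efficiency is the
    baseline time divided by the measured time.
    """
    return t_base / t_n


# Hypothetical wall-clock times in seconds (assumed values, not measurements).
strong = strong_scaling_efficiency(t_base=100.0, n_base=8, t_n=14.0, n=64)
weak = weak_scaling_efficiency(t_base=100.0, t_n=110.0)
print(f"strong scaling efficiency: {strong:.2f}")
print(f"weak scaling efficiency: {weak:.2f}")
```

An efficiency close to 1 in either measure indicates that adding devices is paying off nearly ideally; values well below 1 usually point to communication or data-transfer overhead, which is what the strategies listed in the abstract aim to reduce.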

Similar articles

1
GPU Acceleration of Large-Scale Full-Frequency GW Calculations.
J Chem Theory Comput. 2022 Aug 9;18(8):4690-4707. doi: 10.1021/acs.jctc.2c00241. Epub 2022 Aug 1.
5
GPU Optimizations for a Production Molecular Docking Code.
IEEE Conf High Perform Extreme Comput. 2014 Sep;2014. doi: 10.1109/HPEC.2014.7040981.
8
Hybrid CPU/GPU Integral Engine for Strong-Scaling Ab Initio Methods.
J Chem Theory Comput. 2017 Jul 11;13(7):3153-3159. doi: 10.1021/acs.jctc.6b01166. Epub 2017 Jun 21.

Cited by

1
Sustainable chemistry with plasmonic photocatalysts.
Nanophotonics. 2023 May 30;12(14):2745-2762. doi: 10.1515/nanoph-2023-0149. eCollection 2023 Jul.
2
Large-scale photonic inverse design: computational challenges and breakthroughs.
Nanophotonics. 2024 Jun 7;13(20):3765-3792. doi: 10.1515/nanoph-2024-0127. eCollection 2024 Aug.
3
Complementary probes for the electrochemical interface.
Nat Rev Chem. 2024 Mar;8(3):159-178. doi: 10.1038/s41570-024-00575-5. Epub 2024 Feb 22.
