Memory hierarchy characterization of SPEC CPU2006 and SPEC CPU2017 on the Intel Xeon Skylake-SP.

Affiliation

Departamento de Informática e Ingeniería de Sistemas - Aragón Institute for Engineering Research (I3A), Universidad de Zaragoza, Zaragoza, Spain.

Publication information

PLoS One. 2019 Aug 1;14(8):e0220135. doi: 10.1371/journal.pone.0220135. eCollection 2019.

DOI: 10.1371/journal.pone.0220135
PMID: 31369592
Full text: https://pmc.ncbi.nlm.nih.gov/articles/PMC6675054/

Abstract

SPEC CPU is one of the most common benchmark suites used in computer architecture research. CPU2017 has recently been released to replace CPU2006. In this paper we present a detailed evaluation of the memory hierarchy performance for both the CPU2006 and single-threaded CPU2017 benchmarks. The experiments were executed on an Intel Xeon Skylake-SP, which is the first Intel processor to implement a mostly non-inclusive last-level cache (LLC). We present a classification of the benchmarks according to their memory pressure and analyze the performance impact of different LLC sizes. We also test all the hardware prefetchers showing they improve performance in most of the benchmarks. After comprehensive experimentation, we can highlight the following conclusions: i) almost half of SPEC CPU benchmarks have very low miss ratios in the second and third level caches, even with small LLC sizes and without hardware prefetching, ii) overall, the SPEC CPU2017 benchmarks demand even less memory hierarchy resources than the SPEC CPU2006 ones, iii) hardware prefetching is very effective in reducing LLC misses for most benchmarks, even with the smallest LLC size, and iv) from the memory hierarchy standpoint the methodologies commonly used to select benchmarks or simulation points do not guarantee representative workloads.
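
As a rough illustration of the kind of metrics the abstract refers to, the sketch below computes a cache miss ratio and misses per kilo-instruction (MPKI) from raw counter values, then sorts a benchmark into a memory-pressure class. The counter names, the choice of LLC MPKI as the pressure metric, and the `low`/`high` thresholds are all assumptions made for this example; the abstract does not state how the authors define their classes.

```python
from dataclasses import dataclass


@dataclass
class CacheCounters:
    """Hypothetical per-benchmark counter totals (names are illustrative)."""
    instructions: int
    l2_accesses: int
    l2_misses: int
    llc_accesses: int
    llc_misses: int


def miss_ratio(misses: int, accesses: int) -> float:
    """Fraction of accesses that miss; 0.0 when there were no accesses."""
    return misses / accesses if accesses else 0.0


def mpki(misses: int, instructions: int) -> float:
    """Misses per thousand retired instructions."""
    return 1000.0 * misses / instructions if instructions else 0.0


def classify(c: CacheCounters, low: float = 1.0, high: float = 10.0) -> str:
    """Bucket a benchmark by LLC MPKI.

    The thresholds are assumed values for illustration only,
    not figures taken from the paper.
    """
    pressure = mpki(c.llc_misses, c.instructions)
    if pressure < low:
        return "low"
    if pressure < high:
        return "medium"
    return "high"


if __name__ == "__main__":
    sample = CacheCounters(
        instructions=1_000_000,
        l2_accesses=50_000,
        l2_misses=5_000,
        llc_accesses=5_000,
        llc_misses=500,
    )
    print(miss_ratio(sample.l2_misses, sample.l2_accesses))
    print(mpki(sample.llc_misses, sample.instructions))
    print(classify(sample))
```

In practice the counter totals would come from hardware performance counters collected per benchmark run; the thresholds would need to be chosen from the measured distribution rather than fixed in advance.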

Cited by

1. Correction: Memory hierarchy characterization of SPEC CPU2006 and SPEC CPU2017 on the Intel Xeon Skylake-SP.
   PLoS One. 2024 May 9;19(5):e0303712. doi: 10.1371/journal.pone.0303712. eCollection 2024.
2. L2C2: Last-level compressed-contents non-volatile cache and a procedure to forecast performance and lifetime.
   PLoS One. 2023 Feb 7;18(2):e0278346. doi: 10.1371/journal.pone.0278346. eCollection 2023.