Porrmann Florian, Pilz Sarah, Stella Alessandra, Kleinjohann Alexander, Denker Michael, Hagemeyer Jens, Rückert Ulrich
Cognitronics and Sensor Systems, CITEC, Bielefeld University, Bielefeld, Germany.
Institute of Neuroscience and Medicine (INM-6) and Institute for Advanced Simulation (IAS-6) and JARA-Institute Brain Structure-Function Relationships (INM-10), Jülich Research Center, Jülich, Germany.
Front Neuroinform. 2021 Sep 16;15:723406. doi: 10.3389/fninf.2021.723406. eCollection 2021.
The SPADE (spatio-temporal Spike PAttern Detection and Evaluation) method was developed to find reoccurring spatio-temporal patterns in neuronal spike activity (parallel spike trains). However, depending on the number of spike trains and the length of the recording, this method can exhibit long runtimes. Based on a realistic benchmark data set, we identified that the combination of pattern mining (using the FP-Growth algorithm) and result filtering accounts for 85-90% of the method's total runtime. Therefore, in this paper, we propose a customized implementation tailored to the requirements of SPADE, which significantly accelerates pattern mining and result filtering. Our version allows for parallel and distributed execution, and due to the improvements made, execution on heterogeneous and low-power embedded devices is now also possible. The implementation has been evaluated using a traditional workstation based on an Intel Broadwell Xeon E5-1650 v4 as a baseline. Furthermore, the heterogeneous microserver platform RECS|Box has been used for evaluating the implementation on two HiSilicon Hi1616 (Kunpeng 916), an Intel Coffee Lake-ER Xeon E-2276ME, an Intel Broadwell Xeon D-D1577, and three NVIDIA Tegra devices (Jetson AGX Xavier, Jetson Xavier NX, and Jetson TX2). Depending on the platform, our implementation is between 27 and 200 times faster than the original implementation. At the same time, the energy consumption was reduced by up to two orders of magnitude.