Lee Jeonghun, Song Tingting, He Jiayuan, Kandeepan Sithamparanathan, Wang Ke
Opt Express. 2021 Aug 2;29(16):26165-26182. doi: 10.1364/OE.427250.
Optical wireless communication (OWC) systems have been widely studied as a promising solution for high-speed indoor applications. Transmitter diversity has been proposed to improve the performance of high-speed OWC systems, but it is vulnerable to delay between the multiple channels. Recently, neural networks have been studied as a way to realize delay-tolerant indoor OWC systems, and long short-term memory (LSTM) and attention-augmented LSTM (ALSTM) recurrent neural networks (RNNs) have shown their capabilities. However, they have high computation complexity and long computation latency. In this paper, we propose a low-complexity, delay-tolerant RNN scheme for indoor OWC systems. In particular, an RNN with a parallelized structure is proposed to reduce the computation cost. The proposed RNN schemes show capability comparable to the more complicated ALSTM: a bit-error-rate (BER) performance within the forward-error-correction (FEC) limit is achieved for delays of up to 5.5 symbol periods. In addition, previously studied LSTM/ALSTM schemes were implemented using high-end GPUs, which have high cost, high power consumption, and long processing latency. To address these practical limitations, we further propose and demonstrate an FPGA-based RNN hardware accelerator for delay-tolerant indoor OWC systems. To optimize the processing latency and power consumption, we also propose two optimization methods: parallel implementation with triple-phase clocking, and stream-in based computation with additive input data insertion. Results show that the FPGA-based RNN hardware accelerator with the proposed optimization methods achieves a 96.75% reduction in effective latency and 90.7% lower energy consumption per symbol compared with the same accelerator without optimization. Compared with the GPU implementation, the latency is reduced by about 61% and the power consumption by about 58.1%.
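To put the reported percentages on a common scale, a reduction of p percent can be read as an improvement factor of 1 / (1 - p/100). The short Python sketch below applies that conversion to the four figures quoted in the abstract; the function name `reduction_to_factor` and the dictionary labels are illustrative choices, not terms from the paper, and the abstract itself reports only the percentages. Under this reading, the 96.75% latency reduction corresponds to a factor of roughly 30.8.

```python
def reduction_to_factor(reduction_percent: float) -> float:
    """Convert a percentage reduction into the equivalent improvement factor.

    A reduction of p percent leaves (1 - p/100) of the original quantity,
    so the original is 1 / (1 - p/100) times larger than the reduced value.
    """
    remaining = 1.0 - reduction_percent / 100.0
    if remaining <= 0.0:
        raise ValueError("reduction must be below 100 percent")
    return 1.0 / remaining


# Percentages as quoted in the abstract; the labels are illustrative.
reported_reductions = {
    "effective latency vs. unoptimized FPGA": 96.75,
    "energy per symbol vs. unoptimized FPGA": 90.7,
    "latency vs. GPU (approximate)": 61.0,
    "power consumption vs. GPU (approximate)": 58.1,
}

for label, percent in reported_reductions.items():
    factor = reduction_to_factor(percent)
    print(f"{label}: {percent}% lower, i.e. about {factor:.1f}x smaller")
```

The conversion is plain arithmetic on the quoted numbers and says nothing about how the latency or energy figures were measured.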